CN114047823A - Three-dimensional model display method, computer-readable storage medium and electronic device


Info

Publication number
CN114047823A
CN114047823A
Authority
CN
China
Prior art keywords
image
value
model
model image
network
Prior art date
Legal status
Pending
Application number
CN202111423855.1A
Other languages
Chinese (zh)
Inventor
李佳佳
Current Assignee
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Beijing Fangjianghu Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Fangjianghu Technology Co Ltd filed Critical Beijing Fangjianghu Technology Co Ltd
Priority to CN202111423855.1A
Publication of CN114047823A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0487 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F 3/0488 Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/10 Geometric effects

Abstract

The embodiments of the disclosure disclose a three-dimensional model display method, a computer-readable storage medium, and an electronic device. The method comprises the following steps: determining a to-be-displayed viewing angle of a target three-dimensional model; acquiring a first model image of the target three-dimensional model at a reference viewing angle; determining first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle; generating, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information; and displaying the second model image. The embodiments of the disclosure can reduce the amount of image data that needs to be stored in advance, thereby saving storage space.

Description

Three-dimensional model display method, computer-readable storage medium and electronic device
Technical Field
The present disclosure relates to the field of three-dimensional modeling and display technologies, and in particular, to a three-dimensional model display method, a computer-readable storage medium, and an electronic device.
Background
Virtual Reality (VR) technology uses dedicated devices to let a user experience a combination of the virtual world and the real world, helping the user achieve an immersive experience.
It should be noted that, in VR technology, a three-dimensional model (for example, a three-dimensional house model) often needs to be displayed at different viewing angles. To this end, multiple model images of the three-dimensional model at multiple viewing angles may be stored in advance, so that the poses the three-dimensional model should have at different viewing angles are presented through different model images. The amount of image data that then needs to be stored in advance is very large, so how to reduce it is an urgent problem for those skilled in the art.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides a three-dimensional model display method, a computer-readable storage medium and an electronic device.
According to an aspect of the embodiments of the present disclosure, there is provided a three-dimensional model display method, including:
determining a to-be-displayed viewing angle of a target three-dimensional model;
acquiring a first model image of the target three-dimensional model at a reference viewing angle;
determining first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle;
generating, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information;
and displaying the second model image.
In one optional example, the image generation network comprises: an encoder and a decoder;
the generating, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information includes:
converting the first viewing angle difference information into a direction vector;
performing feature extraction on the first model image via the encoder to obtain an image feature vector;
splicing the direction vector and the image feature vector to obtain a spliced vector;
and generating, via the decoder, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the spliced vector.
In one optional example, the first viewing angle difference information includes: a horizontal angle difference value and a pitch angle difference value;
the converting the first viewing angle difference information into a direction vector includes:
performing a first trigonometric function operation on the horizontal angle difference value to obtain a first operation result value, and performing a second trigonometric function operation on the horizontal angle difference value to obtain a second operation result value;
performing the first trigonometric function operation on the pitch angle difference value to obtain a third operation result value, and performing the second trigonometric function operation on the pitch angle difference value to obtain a fourth operation result value;
and generating a direction vector whose vector elements include the first operation result value, the second operation result value, the third operation result value, and the fourth operation result value.
In one optional example, the method further comprises:
acquiring network training data, wherein the network training data includes: a third model image of a training three-dimensional model at a first viewing angle and a fourth model image of the training three-dimensional model at a second viewing angle;
determining second viewing angle difference information between the second viewing angle and the first viewing angle;
and training the image generation network according to the third model image, the second viewing angle difference information, and the fourth model image.
In an optional example, the training the image generation network according to the third model image, the second viewing angle difference information, and the fourth model image includes:
generating a fifth model image via the image generation network according to the third model image and the second viewing angle difference information;
calculating a similarity evaluation value of the fifth model image and the fourth model image;
judging whether the fifth model image belongs to a real image or not by using a judging network to obtain a judging result value;
performing weighting operation according to the similarity evaluation value and the discrimination result value to obtain a network loss value;
and adjusting the network parameters of the image generation network by taking the minimized network loss value as an adjustment target.
In an optional example, the performing, according to the similarity evaluation value and the discrimination result value, a weighting operation to obtain a network loss value includes:
calculating the difference value between a preset numerical value and the judgment result value;
carrying out logarithmic operation on the difference value to obtain a logarithmic operation result value;
and carrying out weighting operation on the similarity evaluation value and the logarithm operation result value by using assigned weight so as to obtain a network loss value.
According to another aspect of the embodiments of the present disclosure, there is provided a three-dimensional model display apparatus including:
the first determination module is used for determining a to-be-displayed viewing angle of the target three-dimensional model;
the first acquisition module is used for acquiring a first model image of the target three-dimensional model at a reference viewing angle;
the second determination module is used for determining first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle;
the generation module is used for generating, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information;
and the display module is used for displaying the second model image.
In one optional example, the image generation network comprises: an encoder and a decoder;
the generation module comprises:
a conversion submodule, configured to convert the first viewing angle difference information into a direction vector;
the feature extraction submodule is used for performing feature extraction on the first model image via the encoder to obtain an image feature vector;
the splicing submodule is used for splicing the direction vector and the image feature vector to obtain a spliced vector;
and the first generation submodule is used for generating, via the decoder, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the spliced vector.
In one optional example, the first viewing angle difference information includes: a horizontal angle difference value and a pitch angle difference value;
the conversion submodule comprises:
the first processing unit is used for performing first trigonometric function operation on the horizontal angle difference value to obtain a first operation result value and performing second trigonometric function operation on the horizontal angle difference value to obtain a second operation result value;
the second processing unit is used for performing the first trigonometric function operation on the pitch angle difference value to obtain a third operation result value and performing the second trigonometric function operation on the pitch angle difference value to obtain a fourth operation result value;
a generating unit configured to generate a direction vector whose vector elements include the first operation result value, the second operation result value, the third operation result value, and the fourth operation result value.
In one optional example, the apparatus further comprises:
the second acquisition module is used for acquiring network training data, wherein the network training data includes: a third model image of a training three-dimensional model at a first viewing angle and a fourth model image of the training three-dimensional model at a second viewing angle;
a third determining module, configured to determine second viewing angle difference information between the second viewing angle and the first viewing angle;
and the training module is used for training the image generation network according to the third model image, the second viewing angle difference information, and the fourth model image.
In one optional example, the training module comprises:
a second generation submodule, configured to generate a fifth model image via the image generation network according to the third model image and the second viewing angle difference information;
a calculation sub-module for calculating a similarity evaluation value of the fifth model image and the fourth model image;
the obtaining submodule is used for judging whether the fifth model image belongs to a real image by utilizing a judging network so as to obtain a judging result value;
the processing submodule is used for carrying out weighting operation according to the similarity evaluation value and the discrimination result value so as to obtain a network loss value;
and the adjusting submodule is used for adjusting the network parameters of the image generation network by taking the minimized network loss value as an adjusting target.
In an optional example, the processing submodule includes:
the calculating unit is used for calculating the difference value between a preset numerical value and the judgment result value;
the third processing unit is used for carrying out logarithmic operation on the difference value to obtain a logarithmic operation result value;
and the fourth processing unit is used for performing weighted operation on the similarity evaluation value and the logarithm operation result value by using a designated weight so as to obtain a network loss value.
According to still another aspect of an embodiment of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-described three-dimensional model display method.
According to still another aspect of an embodiment of the present disclosure, there is provided an electronic device including:
a processor;
a memory for storing the processor-executable instructions;
and the processor is used for reading the executable instructions from the memory and executing the instructions to realize the three-dimensional model display method.
According to still another aspect of an embodiment of the present disclosure, there is provided a computer program product including a program which, when executed by a processor, implements the above-described three-dimensional model display method.
In the embodiment of the disclosure, after the to-be-displayed viewing angle of the target three-dimensional model is determined, a first model image of the target three-dimensional model at a reference viewing angle can be acquired, first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle can be determined, and a second model image of the target three-dimensional model at the to-be-displayed viewing angle can be generated via an image generation network according to the first model image and the first viewing angle difference information, so that the second model image is displayed and the pose the target three-dimensional model should have at the to-be-displayed viewing angle is presented to the user. Thus, in the embodiment of the disclosure, only the model image of the three-dimensional model at the reference viewing angle is stored in advance; combined with the determination of viewing angle difference information and the use of the image generation network, a model image at any viewing angle required by the user can then be generated. In other words, on the premise of storing only a small number of model images, the embodiment of the disclosure can present to the user the poses the three-dimensional model should have at various viewing angles, and can therefore reduce the amount of image data that needs to be stored in advance, thereby saving storage space.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 is a schematic flow chart of a three-dimensional model display method provided by an embodiment of the present disclosure.
FIG. 2 is a schematic illustration of a three-dimensional model of an object in an embodiment of the disclosure.
Fig. 3 is another flow chart diagram of a three-dimensional model display method provided by an embodiment of the disclosure.
Fig. 4 is a schematic diagram of an image generation network in an embodiment of the present disclosure.
Fig. 5 is a schematic flow chart of a three-dimensional model displaying method according to an embodiment of the disclosure.
Fig. 6 is a schematic flow chart of a three-dimensional model displaying method provided by an embodiment of the present disclosure.
Fig. 7 is a schematic diagram of a discrimination network in an embodiment of the present disclosure.
Fig. 8 is a schematic structural diagram of a three-dimensional model display device provided in an embodiment of the present disclosure.
Fig. 9 is another structural schematic diagram of a three-dimensional model display device provided in the embodiment of the disclosure.
Fig. 10 is a block diagram of an electronic device provided by an embodiment of the disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or a necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure describes only an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the former and latter associated objects are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with electronic devices, such as terminal devices, computer systems, servers, and the like, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Exemplary method
Fig. 1 is a schematic flow chart of a three-dimensional model displaying method according to an exemplary embodiment of the present disclosure. The method shown in fig. 1 may include step 101, step 102, step 103, step 104 and step 105, which are described below separately.
Step 101, determining a to-be-displayed viewing angle of a target three-dimensional model.
Alternatively, the target three-dimensional model may be a three-dimensional automobile model shown in fig. 2, although the type of the target three-dimensional model is not limited thereto, and for example, the target three-dimensional model may also be a three-dimensional house model or other types of three-dimensional models, which are not listed here.
Optionally, in a display scene of the target three-dimensional model, the to-be-displayed view angle of the target three-dimensional model may be determined based on an operation of a user on the VR device, for example, the user may input a view angle on the VR device through a touch operation, and the view angle may be used as the to-be-displayed view angle of the target three-dimensional model.
Optionally, the to-be-displayed viewing angle of the target three-dimensional model may include two components: a horizontal angle (for ease of distinction, hereinafter referred to as the first horizontal angle) and a pitch angle (for ease of distinction, hereinafter referred to as the first pitch angle); the first horizontal angle may be denoted θr1 and the first pitch angle may be denoted θh1.
Step 102, acquiring a first model image of the target three-dimensional model at a reference viewing angle.
It should be noted that the reference viewing angle may be any preset viewing angle, and the reference viewing angle may include two components: a horizontal angle (for ease of distinction, hereinafter referred to as the second horizontal angle) and a pitch angle (for ease of distinction, hereinafter referred to as the second pitch angle); the second horizontal angle may be denoted θr2 and the second pitch angle may be denoted θh2. Optionally, θr2 may be 15 degrees and θh2 may be 25 degrees; or θr2 may be 0 degrees and θh2 may be 30 degrees.
In addition, a first model image of the target three-dimensional model at the reference viewing angle may be stored in advance, the first model image may be a real model image of the target three-dimensional model, and the first model image may specifically be an RGB (which represents colors of three channels of red, green, and blue) image. Thus, in step 102, the pre-stored first model image may be directly acquired.
Step 103, determining first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle.
In step 103, a difference between the first horizontal angle and the second horizontal angle (which may be referred to as a horizontal angle difference hereinafter) and a difference between the first pitch angle and the second pitch angle (which may be referred to as a pitch angle difference hereinafter) may be calculated, and the calculated two differences may constitute first view angle difference information.
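As a minimal sketch (the function name and the sample angle values are hypothetical, not taken from the disclosure), the first viewing angle difference information can be computed as follows:

```python
def viewing_angle_difference(to_show, reference):
    """Return (horizontal angle difference, pitch angle difference) between
    two viewing angles, each given as a (horizontal, pitch) pair in degrees."""
    return (to_show[0] - reference[0], to_show[1] - reference[1])

# Hypothetical angles: to-be-displayed (45, 20), reference (15, 25).
delta_theta_r, delta_theta_h = viewing_angle_difference((45.0, 20.0), (15.0, 25.0))
print(delta_theta_r, delta_theta_h)  # 30.0 -5.0
```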
Step 104, generating, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information.
It should be noted that the image generation network may be a network obtained by training in advance and used for generating the multi-view model image, and a specific training manner of the image generation network is described below and is not specifically developed here. Alternatively, the image generation network may be a codec network, which refers to a network including an encoder and a decoder.
The second model image is a model image generated by the image generation network. Since the to-be-displayed viewing angle may differ from the reference viewing angle, the second model image may be regarded as a reconstructed image obtained through angle conversion based on the first model image and the first viewing angle difference information.
Step 105, displaying the second model image.
In step 105, the second model image may be displayed, so that the pose the target three-dimensional model should have at the to-be-displayed viewing angle is presented to the user through the second model image.
In the embodiment of the disclosure, after the to-be-displayed viewing angle of the target three-dimensional model is determined, a first model image of the target three-dimensional model at a reference viewing angle can be acquired, first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle can be determined, and a second model image of the target three-dimensional model at the to-be-displayed viewing angle can be generated via an image generation network according to the first model image and the first viewing angle difference information, so that the second model image is displayed and the pose the target three-dimensional model should have at the to-be-displayed viewing angle is presented to the user. Thus, in the embodiment of the disclosure, only the model image of the three-dimensional model at the reference viewing angle is stored in advance; combined with the determination of viewing angle difference information and the use of the image generation network, a model image at any viewing angle required by the user can then be generated. In other words, on the premise of storing only a small number of model images, the embodiment of the disclosure can present to the user the poses the three-dimensional model should have at various viewing angles, and can therefore reduce the amount of image data that needs to be stored in advance, thereby saving storage space.
In one optional example, the image generation network comprises: an encoder and a decoder;
based on the embodiment shown in fig. 1, as shown in fig. 3, step 104 includes:
step 1041, converting the first viewing angle difference information into a direction vector;
step 1042, performing feature extraction on the first model image via the encoder to obtain an image feature vector;
step 1043, splicing the direction vector and the image feature vector to obtain a spliced vector;
step 1044, generating, via the decoder, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the spliced vector.
In an embodiment of the present disclosure, after determining the first view difference information, the first view difference information may be converted into a direction vector. In one embodiment, the first view difference information includes: a horizontal angle difference value and a pitch angle difference value;
converting the first view disparity information into a direction vector, comprising:
performing a first trigonometric function operation on the horizontal angle difference value to obtain a first operation result value, and performing a second trigonometric function operation on the horizontal angle difference value to obtain a second operation result value;
performing first trigonometric function operation on the pitch angle difference value to obtain a third operation result value, and performing second trigonometric function operation on the pitch angle difference value to obtain a fourth operation result value;
generating a direction vector whose vector elements include the first operation result value, the second operation result value, the third operation result value, and the fourth operation result value.
Alternatively, the first trigonometric function operation may be a sine function operation, and the second trigonometric function operation may be a cosine function operation, but of course, the first trigonometric function operation and the second trigonometric function operation may also be other types of trigonometric function operations, such as a tangent function operation, a secant function operation, and the like, which are not listed herein.
Assume that the horizontal angle difference value in the first viewing angle difference information is denoted Δθr and that the pitch angle difference value is denoted Δθh. If the first trigonometric function operation is a sine function operation and the second trigonometric function operation is a cosine function operation, the first operation result value may be expressed as sin(Δθr), the second operation result value as cos(Δθr), the third operation result value as sin(Δθh), and the fourth operation result value as cos(Δθh); the direction vector may then take the form:
[sin(Δθr),cos(Δθr),sin(Δθh),cos(Δθh)]
It is easy to see that the direction vector here is specifically a 1 × 4 dimensional vector.
In this embodiment, based on trigonometric function operation, the conversion from the view angle difference information to the direction vector can be realized very conveniently and reliably.
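A direct transcription of this conversion in Python (the degree-to-radian conversion is an assumption, since standard math libraries expect radians while the disclosure expresses angles in degrees):

```python
import math

def direction_vector(delta_theta_r, delta_theta_h):
    """Convert the viewing angle difference (in degrees) into the 1 x 4
    direction vector [sin(dr), cos(dr), sin(dh), cos(dh)]."""
    r = math.radians(delta_theta_r)  # horizontal angle difference
    h = math.radians(delta_theta_h)  # pitch angle difference
    return [math.sin(r), math.cos(r), math.sin(h), math.cos(h)]

print(direction_vector(30.0, -5.0))
```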
In an embodiment of the present disclosure, feature extraction may be performed on the first model image by the encoder of the image generation network to obtain an image feature vector. Optionally, as shown in fig. 4, the encoder (which corresponds to the encoding structure portion in fig. 4) may include 5 convolutional layers and 1 max-pooling layer. From front to back, the numbers of convolution kernels in the 5 convolutional layers may be 32, 64, 128, 64, and 16; the first 3 convolutional layers may use 3 × 3 kernels, the remaining 2 convolutional layers may use 1 × 1 kernels, and each convolutional layer may use a Rectified Linear Unit (ReLU) activation function. After the first model image is input, the convolutional layers operate on it; for each pixel in the image, the 1 × 1 layers linearly combine the different channels so as to reduce the channel dimension while keeping the original planar structure. The local pooling range of the max-pooling layer may be 2 × 2 with a step size of 2. In addition, the convolutional layers may be followed by a [4096, 128] fully connected layer for dimensionality reduction, which generates a 1 × 128 dimensional vector that can serve as the image feature vector.
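The encoder just described can be sketched in PyTorch as follows; this is a minimal sketch rather than the disclosed implementation, and the 3-channel 32 × 32 input resolution is an assumption chosen so that the flattened feature has 4096 elements (16 channels × 16 × 16 after pooling), matching the [4096, 128] fully connected layer:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Five conv layers (32, 64, 128, 64, 16 kernels; 3x3 for the first
    three, 1x1 for the last two), ReLU after each, one 2x2/stride-2
    max-pool, then a [4096, 128] fully connected layer."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 64, 1), nn.ReLU(),  # 1x1 convs linearly combine
            nn.Conv2d(64, 16, 1), nn.ReLU(),   # channels per pixel
            nn.MaxPool2d(2, 2),
        )
        self.fc = nn.Linear(4096, 128)  # assumes 32x32 input: 16*16*16 = 4096

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))  # 1 x 128 feature vector
```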
After the direction vector and the image feature vector are obtained, the direction vector and the image feature vector can be spliced to obtain a spliced vector. In a specific example, the direction vector is a 1 × 4-dimensional vector, and the image feature vector is a 1 × 128-dimensional vector, the dimension of the direction vector may be raised to be a 1 × 128-dimensional vector, and then the raised direction vector and elements at corresponding positions of the image feature vector are summed to obtain a stitched vector, or the 1 × 4-dimensional vector serving as the direction vector may be directly attached to the end of the image feature vector to obtain the stitched vector.
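Both stitching options can be sketched as follows (the linear projection used to raise the direction vector to 1 × 128 dimensions is a hypothetical choice; the disclosure does not specify how the dimension is raised):

```python
import torch

feature = torch.randn(1, 128)   # image feature vector from the encoder
direction = torch.randn(1, 4)   # direction vector

# Option 1: attach the 1 x 4 direction vector to the end of the feature vector.
stitched_concat = torch.cat([feature, direction], dim=1)  # shape (1, 132)

# Option 2: raise the direction vector to 1 x 128, then sum the elements at
# corresponding positions of the two vectors.
project = torch.nn.Linear(4, 128)
stitched_sum = feature + project(direction)               # shape (1, 128)
```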
After the stitching vector is obtained, the stitching vector may be provided to a decoder of the image generation network (which corresponds to the decoding structure part in fig. 4), and the decoder may perform an operation based on the stitching vector provided thereto, thereby finally generating and outputting the second model image.
In the embodiment of the disclosure, an encoder in an image generation network can conveniently and reliably extract features of a first model image to generate an image feature vector, and a decoder in the image generation network can conveniently and reliably generate a second model image based on a splicing vector obtained by splicing the image feature vector and a direction vector.
On the basis of the embodiment shown in fig. 1, as shown in fig. 5, the method further includes:
step 111, acquiring network training data, wherein the network training data includes: a third model image of a training three-dimensional model at a first viewing angle and a fourth model image of the training three-dimensional model at a second viewing angle;
step 112, determining second viewing angle difference information between the second viewing angle and the first viewing angle;
step 113, training the image generation network according to the third model image, the second viewing angle difference information, and the fourth model image.
It should be noted that the training three-dimensional model refers to a three-dimensional model used for training the image generation network. The training three-dimensional models may be derived from open-source databases, for example from ShapeNet, a large dataset of richly annotated three-dimensional models. The number of training three-dimensional models may be in the thousands or millions; since the related processing for each training three-dimensional model is similar, the embodiment of the present disclosure is described only with respect to the processing of a single training three-dimensional model.
In step 111, the training three-dimensional model may be analyzed by an analysis program to obtain network training data: each time, at a fixed pitch angle and a certain horizontal angle, a corresponding model image (which may be a two-dimensional image) is extracted, and the network training data is obtained through multiple such analyses. Specifically, the pitch angle may first be set to 0 degrees while the horizontal angle is swept from 0 degrees to 360 degrees, extracting one model image every 10 degrees; the pitch angle may then be increased in 10-degree steps from 0 degrees to 30 degrees, repeating the sweep. A plurality of model images is obtained in this manner, from which a plurality of network training data items can be generated; each network training data item may include two of the obtained model images, one serving as the third model image at a first viewing angle and the other as the fourth model image at a second viewing angle. Although there are a plurality of network training data items, the related processing for each is similar, so the embodiment of the present disclosure is described only with respect to a single network training data item.
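The sampling schedule described above can be sketched as follows (the counts assume a horizontal sweep of 0–350 degrees and pitch angles of 0, 10, 20, and 30 degrees):

```python
import itertools

# One model image per (horizontal angle, pitch angle) pair: the horizontal
# angle sweeps 0-360 degrees and the pitch angle 0-30 degrees, both in
# 10-degree steps.
horizontal_angles = range(0, 360, 10)  # 36 values
pitch_angles = range(0, 40, 10)        # 0, 10, 20, 30 degrees

views = list(itertools.product(horizontal_angles, pitch_angles))
print(len(views))  # 144 model images per training three-dimensional model
```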
In step 112, second view difference information for the second view and the first view may be determined in a manner similar to the manner in which the first view difference information is determined above.
In step 113, the second perspective difference information may be converted into a direction vector, and then the third model image and the converted direction vector may be used as input data, and the fourth model image may be used as output data to be trained, so as to obtain an image generation network, which is capable of generating and outputting a model image at a corresponding perspective according to the model image and the direction vector provided thereto. Of course, in step 113, the third model image and the second perspective difference information may be directly used as input data, and the fourth model image may be used as output data to be trained, so as to obtain an image generation network, and the image generation network may generate and output the model image at the corresponding perspective according to the model image and the perspective difference information provided thereto.
In the embodiment of the disclosure, after network training data including a third model image of a training three-dimensional model at a first viewing angle and a fourth model image at a second viewing angle is acquired, second viewing angle difference information between the second viewing angle and the first viewing angle can be determined, and the image generation network can then be trained according to the third model image, the second viewing angle difference information, and the fourth model image; the image generation network obtained through this training can be used to generate the second model image described above.
On the basis of the embodiment shown in fig. 5, as shown in fig. 6, step 113 includes:
step 1131, generating a fifth model image through an image generation network according to the third model image and the second perspective difference information;
step 1132, calculating a similarity evaluation value of the fifth model image and the fourth model image;
step 1133, judging whether the fifth model image belongs to a real image by using a judgment network to obtain a judgment result value;
step 1134, performing weighted operation according to the similarity evaluation value and the discrimination result value to obtain a network loss value;
step 1135, adjusting the network parameters of the image generation network with minimizing the network loss value as the adjustment target.
The specific manner in which the image generation network in step 1131 generates the fifth model image is similar to the specific manner in which the image generation network in step 104 generates the second model image, and the only difference is that the image generation network in step 1131 is the image generation network in the training process, and the image generation network in step 104 is the image generation network after the training process, and the specific manner in which the fifth model image is generated is not described herein.
After the fifth model image is generated, a similarity evaluation value of the fifth model image and the fourth model image may be calculated. Optionally, the Euclidean distance between the fifth model image and the fourth model image may be calculated and used as their similarity evaluation value; of course, the way of calculating the similarity evaluation value is not limited thereto, and, for example, the cosine distance between the fifth model image and the fourth model image may be calculated and used as the similarity evaluation value instead.
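Both candidate similarity evaluation values can be sketched as follows (the image sizes are placeholders):

```python
import torch
import torch.nn.functional as F

def euclidean_similarity(img_a, img_b):
    # Euclidean (L2) distance between the two images, flattened to vectors.
    return torch.norm(img_a.flatten() - img_b.flatten(), p=2)

def cosine_similarity(img_a, img_b):
    # Cosine distance between the two images, flattened to vectors.
    return 1.0 - F.cosine_similarity(img_a.flatten(), img_b.flatten(), dim=0)

fifth = torch.rand(3, 32, 32)   # generated (fifth) model image
fourth = torch.rand(3, 32, 32)  # ground-truth (fourth) model image
print(euclidean_similarity(fifth, fourth), cosine_similarity(fifth, fourth))
```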
In addition, the discrimination network can be used to judge whether the fifth model image belongs to a real image (i.e., an image not reconstructed by the image generation network) to obtain a discrimination result value; the discrimination network can comprise a discriminator, and the discrimination result value can be generated by the discriminator. The specific structure of the discrimination network can be seen in fig. 7; it is similar to that of the encoder above, except that the fully connected layers of the discrimination network are assumed to be [4096, 128, 1]. Optionally, in the case where the fifth model image belongs to a real image, the discrimination result may be true and the discrimination result value may be 1; otherwise, the discrimination result may be false and the discrimination result value may be 0.
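A discriminator matching this description can be sketched as follows (again a sketch: it reuses the convolutional stack assumed for the encoder above, extends the fully connected part to [4096, 128, 1], and adds a final sigmoid, which is an assumption, so the output reads as a value between 0 and 1):

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Same convolutional trunk as the encoder sketch, with fully connected
    layers [4096, 128, 1]; the output reads as real (1) or not real (0)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 64, 1), nn.ReLU(),
            nn.Conv2d(64, 16, 1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
        )
        self.fc = nn.Sequential(
            nn.Linear(4096, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))  # discrimination result value
```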
After the similarity evaluation value and the discrimination result value are obtained, a weighting operation may be performed according to the similarity evaluation value and the discrimination result value to obtain a network loss value, so as to adjust a network parameter of the image generation network with the minimized network loss value as an adjustment target.
In one embodiment, performing a weighting operation according to the similarity evaluation value and the discrimination result value to obtain a network loss value includes:
calculating the difference value between the preset value and the judgment result value;
carrying out logarithmic operation on the difference value to obtain a logarithmic operation result value;
and carrying out weighting operation on the similarity evaluation value and the logarithm operation result value by using the designated weight so as to obtain a network loss value.
Alternatively, the preset value may be 1.
Suppose that the third model image is denoted x and the fourth model image is denoted x̂, the discrimination network is denoted D, the image generation network is denoted G, the direction vector into which the second viewing angle difference information is converted is denoted d, the image feature vector extracted from the third model image is denoted z, and the assigned weight is denoted λ, and suppose that the similarity evaluation value is specifically the Euclidean distance between the fifth model image and the fourth model image. Then the fifth model image may be denoted G(z, d), the similarity evaluation value may be expressed as ‖x̂ − G(z, d)‖₂, the discrimination result value may be expressed as D[G(z, d)], the difference between the preset value and the discrimination result value may be expressed as 1 − D[G(z, d)], the logarithmic operation result value may be expressed as log{1 − D[G(z, d)]}, and the network loss value may be expressed as ‖x̂ − G(z, d)‖₂ + λ·log{1 − D[G(z, d)]}.
It should be noted that, with reference to the specific form of the network loss value in the previous paragraph, the network loss value is actually a numerical value obtained by performing weighted summation on the similarity evaluation value and the logarithm operation result value, and in the specific implementation, the network loss value may also be a numerical value obtained by performing weighted average on the similarity evaluation value and the logarithm operation result value by using a specified weight.
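The weighted-sum form of the network loss value can be sketched as follows (the weight λ = 0.01 is a placeholder, since the disclosure does not give a concrete value, and the small epsilon guarding the logarithm is an added numerical-stability assumption):

```python
import torch

def network_loss(generated, target, discrimination_result, lam=0.01, eps=1e-8):
    """Similarity evaluation value plus the assigned weight times the log of
    (preset value 1 minus the discrimination result value)."""
    similarity = torch.norm(generated.flatten() - target.flatten(), p=2)
    log_term = torch.log(1.0 - discrimination_result + eps)
    return similarity + lam * log_term
```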
After the network loss value is obtained, the network parameters of the image generation network may be adjusted with minimizing the network loss value as the adjustment target, so that the image generation network is trained to an optimal state; that is, the embodiment of the present disclosure may adopt the following loss function to train the image generation network:

min_G E(x, x̂, d)∼pdata[ ‖x̂ − G(z, d)‖₂ + λ·log{1 − D[G(z, d)]} ]

wherein pdata represents the joint image-angle probability distribution of the network training data, and E denotes the expectation over samples drawn from pdata.
It should be noted that, in the above loss function, ‖x̂ − G(z, d)‖₂ can be considered the Euclidean distance loss between the newly generated image (i.e., the output image of the image generation network) and the training image in the real database, and log{1 − D[G(z, d)]} can be considered the countermeasure (adversarial) loss. The Euclidean distance loss can be used to ensure the similarity of the newly generated image to the training image in the real database, the countermeasure loss can be used to preserve the texture information of the image, and λ can be used to trade off the degree of sharpening of the newly generated image against its similarity to the training image, so that the model image generated by the image generation network is balanced between similarity and realism.
Therefore, in the embodiment of the disclosure, through obtaining the similarity evaluation value and the discrimination result value, and performing weighting operation according to the similarity evaluation value and the discrimination result value, the network parameters are adjusted based on the network loss value obtained through the operation, and the quality of the model image generated by the image generation network can be improved by integrating the countermeasure thought.
It should be noted that the initial database often includes a large number of three-dimensional models, and during network training, models with poor texture or low resolution in the three-dimensional models may be removed first, and then a certain number of three-dimensional models may be selected from the remaining three-dimensional models, for example, 6000 three-dimensional models may be selected, wherein 5800 three-dimensional models may be used as a training set, and the other 200 three-dimensional models may be used as a test set.
Optionally, an alternating training mode may be adopted for network training: first train the discrimination network, which judges whether an input image is a real image; after several iterations, fix the network parameters of the discrimination network and train the image generation network; after several further iterations, train the discrimination network again; and so on, until the discrimination network and the image generation network reach a balance.
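The alternating schedule can be sketched as follows (G, D, the data loader, the optimizers, and the learning rates are all placeholder assumptions; the generator loss follows the weighted formulation above):

```python
import torch

def alternate_train(G, D, loader, epochs=10, k=5, lam=0.01, eps=1e-8):
    """Alternately update D (with G fixed) and G (with D fixed), k batches
    at a time, until the two networks reach a balance."""
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = torch.nn.BCELoss()
    for _ in range(epochs):
        for i, (src, tgt, direction) in enumerate(loader):
            if (i // k) % 2 == 0:
                # Discriminator phase: G frozen, D learns real vs. generated.
                fake = G(src, direction).detach()
                loss_d = bce(D(tgt), torch.ones_like(D(tgt))) \
                       + bce(D(fake), torch.zeros_like(D(fake)))
                opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            else:
                # Generator phase: D frozen, G minimizes the network loss value.
                fake = G(src, direction)
                loss_g = torch.norm(fake - tgt) \
                       + lam * torch.log(1.0 - D(fake) + eps).mean()
                opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```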
Any of the three-dimensional model display methods provided by embodiments of the present disclosure may be performed by any suitable device having data processing capabilities, including but not limited to: terminal equipment, a server and the like. Alternatively, any three-dimensional model display method provided by the embodiments of the present disclosure may be executed by a processor, for example, the processor may execute any three-dimensional model display method mentioned in the embodiments of the present disclosure by calling a corresponding instruction stored in a memory. And will not be described in detail below.
Exemplary devices
Fig. 8 is a schematic structural diagram of a three-dimensional model display apparatus according to an exemplary embodiment of the present disclosure, and the apparatus shown in fig. 8 includes a first determining module 801, a first obtaining module 802, a second determining module 803, a generating module 804, and a display module 805.
A first determining module 801, configured to determine a to-be-displayed viewing angle of a target three-dimensional model;
a first obtaining module 802, configured to obtain a first model image of the target three-dimensional model at a reference viewing angle;
a second determining module 803, configured to determine first viewing angle difference information between the to-be-displayed viewing angle and the reference viewing angle;
a generating module 804, configured to generate, via an image generation network, a second model image of the target three-dimensional model at the to-be-displayed viewing angle according to the first model image and the first viewing angle difference information;
and a display module 805, configured to display the second model image.
In one optional example, the image generation network comprises: an encoder and a decoder;
as shown in fig. 9, the generating module 804 includes:
a conversion submodule 8041, configured to convert the first viewing angle difference information into a direction vector;
a feature extraction submodule 8042, configured to perform feature extraction on the first model image through the encoder to obtain an image feature vector;
the splicing submodule 8043 is configured to splice the direction vector and the image feature vector to obtain a spliced vector;
the first generating sub-module 8044 is configured to generate, by the decoder, a second model image of the target three-dimensional model under the view angle to be displayed according to the stitching vector.
In one optional example, the first view difference information includes: a horizontal angle difference value and a pitch angle difference value;
a conversion submodule, comprising:
the first processing unit is used for performing first trigonometric function operation on the horizontal angle difference value to obtain a first operation result value and performing second trigonometric function operation on the horizontal angle difference value to obtain a second operation result value;
the second processing unit is used for performing first trigonometric function operation on the pitching angle difference value to obtain a third operation result value and performing second trigonometric function operation on the pitching angle difference value to obtain a fourth operation result value;
a generating unit configured to generate a direction vector whose vector elements include the first operation result value, the second operation result value, the third operation result value, and the fourth operation result value.
In an alternative example, as shown in fig. 9, the apparatus further comprises:
a second obtaining module 811, configured to obtain network training data, wherein the network training data includes: a third model image of a training three-dimensional model at a first viewing angle and a fourth model image of the training three-dimensional model at a second viewing angle;
a third determining module 812, configured to determine second viewing angle difference information between the second viewing angle and the first viewing angle;
a training module 813, configured to train the image generation network according to the third model image, the second viewing angle difference information, and the fourth model image.
In an alternative example, training module 813 includes:
the second generation submodule is used for generating a fifth model image via the image generation network according to the third model image and the second viewing angle difference information;
the calculating submodule is used for calculating a similarity evaluation value of the fifth model image and the fourth model image;
the obtaining submodule is used for judging whether the fifth model image belongs to a real image by utilizing a judging network so as to obtain a judging result value;
the processing submodule is used for carrying out weighting operation according to the similarity evaluation value and the discrimination result value so as to obtain a network loss value;
and the adjusting submodule is used for adjusting the network parameters of the image generation network by taking the minimized network loss value as an adjusting target.
In one optional example, the processing submodule includes:
the calculating unit is used for calculating the difference value between the preset value and the judgment result value;
the third processing unit is used for carrying out logarithmic operation on the difference value to obtain a logarithmic operation result value;
and the fourth processing unit is used for performing weighted operation on the similarity evaluation value and the logarithm operation result value by using the assigned weight so as to obtain a network loss value.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 10. The electronic device may be either or both of the first device and the second device, or a stand-alone device separate from them, which stand-alone device may communicate with the first device and the second device to receive the acquired input signals therefrom.
Fig. 10 illustrates a block diagram of an electronic device 1000 in accordance with an embodiment of the disclosure.
As shown in fig. 10, the electronic device 1000 includes one or more processors 1001 and memory 1002.
The processor 1001 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 1000 to perform desired functions.
Memory 1002 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer readable storage medium and executed by the processor 1001 to implement the three-dimensional model display methods of the various embodiments of the present disclosure described above. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 1000 may further include: an input device 1003 and an output device 1004, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device 1000 is the first device or the second device, the input means 1003 may be a microphone or a microphone array for capturing input signals. When the electronic device 1000 is a stand-alone device, the input means 1003 may be a communication network connector for receiving the collected input signals from the first device and the second device. The input device 1003 may also include, for example, a keyboard, a mouse, and the like.
The output device 1004 can output various information to the outside, and may include, for example, a display, speakers, a printer, a communication network and remote output devices connected thereto, and the like.
Of course, for simplicity, only some of the components of the electronic device 1000 relevant to the present disclosure are shown in fig. 10, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 1000 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also take the form of a computer program product comprising a program that, when executed by a processor, implements the three-dimensional model display method according to various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
The computer program product may carry program code for carrying out operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the three-dimensional model display method according to various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments. It is noted, however, that the advantages, effects, and the like mentioned in the present disclosure are merely examples and are not limiting; they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the specific details disclosed above are provided for the purposes of illustration and description only and are not intended to limit the disclosure to those details.
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts, the embodiments may refer to one another. Since the system embodiments basically correspond to the method embodiments, their description is relatively brief; for relevant details, reference may be made to the corresponding parts of the description of the method embodiments.
The block diagrams of the devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," and "having" are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to."
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (9)

1. A three-dimensional model display method is characterized by comprising the following steps:
determining a visual angle to be displayed of the target three-dimensional model;
acquiring a first model image of the target three-dimensional model under a reference visual angle;
determining first visual angle difference information between the visual angle to be displayed and the reference visual angle;
generating a second model image of the target three-dimensional model under the visual angle to be displayed via an image generation network according to the first model image and the first visual angle difference information;
and displaying the second model image.
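Taken together, claim 1 describes a render-once, synthesize-many pipeline: the model is imaged once at a reference visual angle, and other views are produced from that single image plus an angle difference. A minimal runnable sketch of this flow, with stub components standing in for the real renderer and trained network (none of the names below appear in the disclosure):

```python
import numpy as np

def render_reference_image(model_id):
    """Stub: render the target three-dimensional model at the reference visual angle."""
    return np.zeros((128, 128))

def image_generation_network(first_image, view_diff):
    """Stub: a trained generator would synthesize the new view here."""
    return first_image

def show_model(model_id, view_to_display, reference_view=(0.0, 0.0)):
    first_image = render_reference_image(model_id)                   # first model image
    view_diff = (view_to_display[0] - reference_view[0],             # first visual angle
                 view_to_display[1] - reference_view[1])             # difference information
    second_image = image_generation_network(first_image, view_diff)  # second model image
    return second_image                                              # handed off for display

second = show_model("sofa-01", view_to_display=(30.0, -10.0))
print(second.shape)  # (128, 128)
```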
2. The method of claim 1, wherein the image generation network comprises: an encoder and a decoder;
the generating, according to the first model image and the first perspective difference information, a second model image of the target three-dimensional model at the perspective to be displayed via an image generation network includes:
converting the first view difference information into a direction vector;
performing feature extraction on the first model image via the encoder to obtain an image feature vector;
splicing the direction vector and the image feature vector to obtain a spliced vector;
and generating, via the decoder, a second model image of the target three-dimensional model under the visual angle to be displayed according to the spliced vector.
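The splice-then-decode step of claim 2 can be sketched at the shape level as below; the encoder and decoder are reduced to placeholder linear maps (the real networks are convolutional and learned), and the 64-dimensional feature, 4-element direction vector, and 128×128 image size are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM, DIR_DIM, SIDE = 64, 4, 128  # assumed dimensions
W_enc = 0.01 * rng.standard_normal((FEAT_DIM, SIDE * SIDE))            # stand-in encoder
W_dec = 0.01 * rng.standard_normal((SIDE * SIDE, FEAT_DIM + DIR_DIM))  # stand-in decoder

def generate_second_image(first_image, direction_vector):
    """Encode the reference-view image, splice in the direction vector,
    and decode the result into the target-view image."""
    image_feature = W_enc @ first_image.ravel()                  # feature extraction via the encoder
    spliced = np.concatenate([image_feature, direction_vector])  # the spliced vector
    return (W_dec @ spliced).reshape(SIDE, SIDE)                 # decoded second model image

first_image = rng.random((SIDE, SIDE))
direction_vector = np.array([0.5, 0.866, 0.0, 1.0])  # e.g. [sin dh, cos dh, sin dp, cos dp]
print(generate_second_image(first_image, direction_vector).shape)  # (128, 128)
```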
3. The method of claim 2, wherein the first visual angle difference information comprises: a horizontal angle difference value and a pitch angle difference value;
the converting the first visual angle difference information into a direction vector comprises:
performing a first trigonometric function operation on the horizontal angle difference value to obtain a first operation result value, and performing a second trigonometric function operation on the horizontal angle difference value to obtain a second operation result value;
performing the first trigonometric function operation on the pitch angle difference value to obtain a third operation result value, and performing the second trigonometric function operation on the pitch angle difference value to obtain a fourth operation result value;
and generating a direction vector whose vector elements comprise the first operation result value, the second operation result value, the third operation result value, and the fourth operation result value.
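A plausible reading of claim 3, taking the first and second trigonometric operations to be sine and cosine (the claim does not fix which functions are meant); representing each angle difference by its sine/cosine pair keeps the encoding continuous and bounded even when the raw difference wraps past 360 degrees:

```python
import math

def to_direction_vector(horizontal_diff_deg, pitch_diff_deg):
    """Map the horizontal and pitch angle differences to the
    four-element direction vector of claim 3."""
    dh = math.radians(horizontal_diff_deg)
    dp = math.radians(pitch_diff_deg)
    return [math.sin(dh), math.cos(dh), math.sin(dp), math.cos(dp)]

print(to_direction_vector(30.0, -15.0))
# [0.499..., 0.866..., -0.258..., 0.965...]
```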
4. The method according to any one of claims 1-3, further comprising:
acquiring network training data; wherein the network training data comprises: a third model image of a training three-dimensional model under a first visual angle and a fourth model image of the training three-dimensional model under a second visual angle;
determining second visual angle difference information between the second visual angle and the first visual angle;
and training the image generation network according to the third model image, the second visual angle difference information and the fourth model image.
5. The method of claim 4, wherein the training the image generation network according to the third model image, the second visual angle difference information, and the fourth model image comprises:
generating a fifth model image via the image generation network according to the third model image and the second visual angle difference information;
calculating a similarity evaluation value between the fifth model image and the fourth model image;
judging, by using a discrimination network, whether the fifth model image belongs to a real image, so as to obtain a discrimination result value;
performing a weighting operation according to the similarity evaluation value and the discrimination result value to obtain a network loss value;
and adjusting the network parameters of the image generation network with minimizing the network loss value as an adjustment target.
6. The method according to claim 5, wherein the performing a weighting operation according to the similarity evaluation value and the discrimination result value to obtain a network loss value comprises:
calculating the difference value between a preset numerical value and the discrimination result value;
performing a logarithmic operation on the difference value to obtain a logarithmic operation result value;
and performing a weighting operation on the similarity evaluation value and the logarithmic operation result value by using assigned weights, so as to obtain the network loss value.
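Claims 5 and 6 together amount to one generator update. A runnable sketch with toy stand-ins (the L1 similarity, the preset value of 1, and the 0.8/0.2 weights are assumptions, and closures replace the real networks so the snippet executes):

```python
import numpy as np

def network_loss(generator, discriminator, third_image, view_diff, fourth_image,
                 w_sim=0.8, w_adv=0.2, preset=1.0):
    """Compute the network loss value of claims 5 and 6 for one training pair."""
    fifth_image = generator(third_image, view_diff)          # image at the second visual angle
    similarity = np.abs(fifth_image - fourth_image).mean()   # similarity evaluation value (L1, assumed)
    d_value = discriminator(fifth_image)                     # discrimination result value in (0, 1)
    log_term = np.log(preset - d_value)                      # log of (preset - discrimination result)
    return w_sim * similarity + w_adv * log_term             # weighted network loss value

# Toy stand-ins so the sketch runs end to end.
gen = lambda img, diff: np.clip(img + 0.01 * sum(diff), 0.0, 1.0)
disc = lambda img: float(np.clip(img.mean(), 0.01, 0.99))

rng = np.random.default_rng(0)
loss = network_loss(gen, disc, rng.random((8, 8)), [0.5, 0.87, 0.0, 1.0], rng.random((8, 8)))
print(round(loss, 4))
```

Minimizing this value pulls the generated fifth model image toward the ground-truth fourth model image while driving the discrimination result value toward 1, i.e. toward fooling the discrimination network.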
7. A computer-readable storage medium, in which a computer program is stored, the computer program being configured to execute the three-dimensional model display method according to any one of claims 1 to 6.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the three-dimensional model display method of any one of claims 1 to 6.
9. A computer program product comprising a program, characterized in that the program, when executed by a processor, implements the three-dimensional model display method of any one of claims 1 to 6.
CN202111423855.1A 2021-11-26 2021-11-26 Three-dimensional model display method, computer-readable storage medium and electronic device Pending CN114047823A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111423855.1A CN114047823A (en) 2021-11-26 2021-11-26 Three-dimensional model display method, computer-readable storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111423855.1A CN114047823A (en) 2021-11-26 2021-11-26 Three-dimensional model display method, computer-readable storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN114047823A (en) 2022-02-15

Family

ID=80211393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111423855.1A Pending CN114047823A (en) 2021-11-26 2021-11-26 Three-dimensional model display method, computer-readable storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN114047823A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180192032A1 (en) * 2016-04-08 2018-07-05 Maxx Media Group, LLC System, Method and Software for Producing Three-Dimensional Images that Appear to Project Forward of or Vertically Above a Display Medium Using a Virtual 3D Model Made from the Simultaneous Localization and Depth-Mapping of the Physical Features of Real Objects
CN108629823A (en) * 2018-04-10 2018-10-09 北京京东尚科信息技术有限公司 The generation method and device of multi-view image
CN113220251A (en) * 2021-05-18 2021-08-06 北京达佳互联信息技术有限公司 Object display method, device, electronic equipment and storage medium
CN113238656A (en) * 2021-05-25 2021-08-10 北京达佳互联信息技术有限公司 Three-dimensional image display method and device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Xuanyin: "Method for three-dimensional reconstruction of specific human faces based on the Snake model", Journal of Mechanical Engineering (机械工程学报), vol. 43, no. 7, pages 1-6 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115599276A (en) * 2022-12-13 2023-01-13 深圳鹏锐信息技术股份有限公司(Cn) Artificial intelligence AI-based three-dimensional model display method and system
CN117036444A (en) * 2023-10-08 2023-11-10 深圳市其域创新科技有限公司 Three-dimensional model output method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
JP7373554B2 (en) Cross-domain image transformation
US9152879B2 (en) Keypoint descriptor generation by complex wavelet analysis
CN114047823A (en) Three-dimensional model display method, computer-readable storage medium and electronic device
CN111986239B (en) Point cloud registration method and device, computer readable storage medium and electronic equipment
CN111986335B (en) Texture mapping method and device, computer-readable storage medium and electronic device
JP2011508323A (en) Permanent visual scene and object recognition
CN111563950A (en) Texture mapping strategy determination method and device and computer readable storage medium
CN112489114A (en) Image conversion method and device, computer readable storage medium and electronic equipment
CN114399597A (en) Method and device for constructing scene space model and storage medium
CN112950759B (en) Three-dimensional house model construction method and device based on house panoramic image
CN113129211B (en) Optical center alignment detection method and device, storage medium and electronic equipment
CN113592706A (en) Method and device for adjusting homography matrix parameters
CN111310818B (en) Feature descriptor determining method and device and computer-readable storage medium
CN113450258B (en) Visual angle conversion method and device, storage medium and electronic equipment
CN114463553A (en) Image processing method and apparatus, electronic device, and storage medium
CN112465716A (en) Image conversion method and device, computer readable storage medium and electronic equipment
CN113762173A (en) Training method and device for human face light stream estimation and light stream value prediction model
US20210232215A1 (en) Stereo correspondence search
CN111383199A (en) Image processing method, image processing device, computer-readable storage medium and electronic equipment
CN113674346B (en) Image detection method, image detection device, electronic equipment and computer readable storage medium
CN113837948B (en) Image generation method, apparatus and storage medium
CN112199978A (en) Video object detection method and device, storage medium and electronic equipment
CN114005169B (en) Face key point detection method and device, electronic equipment and storage medium
CN112241967A (en) Target tracking method, device, medium and equipment
CN116681818B (en) New view angle reconstruction method, training method and device of new view angle reconstruction network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220407

Address after: 100085 Floor 101 102-1, No. 35 Building, No. 2 Hospital, Xierqi West Road, Haidian District, Beijing

Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd.

Address before: 101300 room 24, 62 Farm Road, Erjie village, Yangzhen Town, Shunyi District, Beijing

Applicant before: Beijing fangjianghu Technology Co.,Ltd.