CN112785712A - Three-dimensional model generation method and device and electronic equipment
- Publication number: CN112785712A
- Application number: CN202110099561.1A
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06T 17/20: Finite element generation, e.g. wire-frame surface description, tessellation
- G06T 15/04: Texture mapping
- G06T 19/006: Mixed reality
- G06T 19/20: Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06V 10/751: Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
- G06T 2200/04: Indexing scheme for image data processing or generation involving 3D image data
Abstract
The embodiments of the application disclose a method and an apparatus for generating a three-dimensional model, and an electronic device, which can meet users' personalized requirements and improve the user experience. The method for generating the three-dimensional model comprises the following steps: acquiring a two-dimensional image of a target object according to user input information; performing feature extraction on the two-dimensional image to obtain first structured data of the two-dimensional image, the first structured data comprising a skeleton line of the target object; matching in a candidate three-dimensional model library according to the first structured data to obtain an initial three-dimensional model; and performing deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
Description
Technical Field
The present application relates to the field of computer vision technology, and more particularly, to a method and an apparatus for generating a three-dimensional model, and an electronic device.
Background
In immersive teaching application scenarios, a large number of three-dimensional models of animals or other object types are required for presentation, to improve the vividness of the teaching scenario and the learning experience of the students. The three-dimensional models may include different types of model data, such as models of objects existing in nature and models of cartoon types.
At present, there are two main ways to meet the demand for three-dimensional models in teaching application scenarios. One is retrieval and query over a large-scale model library, but this is difficult to adapt to the personalized requirements of different users. The other is modeling with professional modeling software, but this places high demands on the user's professional skill, offers an unfriendly interaction mode, and has low modeling efficiency.
Therefore, how to create a method for generating three-dimensional models that can meet users' personalized requirements is a technical problem to be solved urgently.
Disclosure of Invention
The embodiment of the application provides a method and a device for generating a three-dimensional model and electronic equipment, which can meet the personalized requirements of users and improve the user experience.
In a first aspect, a method for generating a three-dimensional model is provided, including: acquiring a two-dimensional image of a target object according to user input information; performing feature extraction on the two-dimensional image to obtain first structured data of the two-dimensional image, wherein the first structured data comprises a skeleton line of the target object; matching in a candidate three-dimensional model library according to the first structured data to obtain an initial three-dimensional model; and carrying out deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
According to the technical scheme of the embodiment of the application, the method for generating the target three-dimensional model of the corresponding target object according to the user input information is provided, and the individual requirements of the user can be met. In addition, in the embodiment of the application, unified first structured data of the target object can be obtained according to multi-modal input information of the user, the first structured data includes a skeleton line of the target object, and can reflect the structural features of the skeleton of the target object more accurately, and further can be used for representing the structural features of the joint level in the skeleton, so that the initial three-dimensional model obtained through matching of the first structured data has the same structural features as the target object in the user input information and has a higher matching degree with the user input information. Furthermore, the target three-dimensional model is obtained by performing deformation processing on the initial three-dimensional model, the target three-dimensional model has more personalized features and higher matching degree with the input information of the user, and the attractiveness of the three-dimensional model can be further improved, so that the use experience of the user is improved.
With reference to the first aspect, in a first implementation manner of the first aspect, matching in a candidate three-dimensional model library according to the first structured data to obtain an initial three-dimensional model includes: acquiring second structured data corresponding to the candidate three-dimensional models in the candidate three-dimensional model library; and matching to obtain the initial three-dimensional model according to the optimization result of the energy equation corresponding to the first structured data and the second structured data.
Based on the technical scheme of the embodiment of the application, the candidate three-dimensional model in the candidate three-dimensional model base is subjected to matching reconstruction according to the first structured data determined by the user input information to obtain the initial three-dimensional model, so that the structured information of the initial three-dimensional model is matched with the user input information.
With reference to the first implementation manner of the first aspect, in a second implementation manner of the first aspect, the candidate three-dimensional model comprises a plurality of blocks, and the acquiring second structured data corresponding to the candidate three-dimensional models in the candidate three-dimensional model library includes: acquiring the structured data corresponding to each of the plurality of blocks to obtain the second structured data, wherein the structured data corresponding to each block includes at least one of the following parameters: position parameters, rotation parameters, shape parameters, and pose parameters.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a third implementation manner of the first aspect, the candidate three-dimensional models in the candidate three-dimensional model library include: a topology of a joint level, the topology of the joint level including skeleton lines.
In the embodiment of the application, the change processing of the three-dimensional model on the joint level can be supported through the topological structure of the joint level of the candidate three-dimensional model, and the three-dimensional animation on the joint level is formed, so that the three-dimensional model animation is more vivid, and the user experience is improved.
With reference to the third implementation manner of the first aspect, in a fourth implementation manner of the first aspect, the candidate three-dimensional model further includes: a texture mapped topology comprising texture mapped coordinates of vertices in a mesh of the candidate three-dimensional model.
In the embodiment of the application, the texture mapping of the three-dimensional model can be supported through the topological structure of the texture mapping of the candidate three-dimensional model, and the attractiveness of the three-dimensional model is improved.
With reference to the first aspect and the foregoing implementation manners of the first aspect, in a fifth implementation manner of the first aspect, the candidate three-dimensional models in the candidate three-dimensional model library include quadruped three-dimensional models and/or biped three-dimensional models; and/or the candidate three-dimensional models in the candidate three-dimensional model library comprise global three-dimensional models and/or local three-dimensional models.
Based on the technical solution of the embodiments of the application, the candidate three-dimensional models are not limited to quadruped three-dimensional models, but can also include biped three-dimensional models, or be expanded to more types of three-dimensional models.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a sixth implementation manner of the first aspect, the performing deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object includes: carrying out model conversion on the initial three-dimensional model to obtain an intermediate three-dimensional model; and deforming the intermediate three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model.
According to the technical scheme of the embodiment of the application, the three-dimensional model can be further subjected to deformation processing according to the two-dimensional image of the information input by the user, so that the three-dimensional model with more personalized characteristics is obtained, and the vividness of the three-dimensional model is further improved.
With reference to the sixth implementation manner of the first aspect, in a seventh implementation manner of the first aspect, the performing model conversion on the initial three-dimensional model to obtain an intermediate three-dimensional model includes: performing pose normalization on the initial three-dimensional model; and processing the pose-normalized initial three-dimensional model with a principal component analysis algorithm to obtain the intermediate three-dimensional model.
With reference to the sixth or seventh implementation manner of the first aspect, in an eighth implementation manner of the first aspect, the deforming the intermediate three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model includes: and performing energy optimization on the intermediate three-dimensional model and an energy equation corresponding to the two-dimensional image to deform the intermediate three-dimensional model to obtain the target three-dimensional model, wherein the energy equation comprises key points in the two-dimensional image and energy equation terms of feature points in the intermediate three-dimensional model corresponding to the key points.
With reference to the first aspect and the foregoing implementation manner, in a ninth implementation manner of the first aspect, before performing deformation processing on the initial three-dimensional model to obtain the target three-dimensional model, the generating method further includes: acquiring the color and/or texture of the target object according to the user input information; mapping the color and/or texture of the target object to the initial three-dimensional model.
With reference to the ninth implementation manner of the first aspect, in a tenth implementation manner of the first aspect, the mapping the color and/or texture of the target object to the initial three-dimensional model includes: mapping the color and/or texture of the target object to the initial three-dimensional model according to the topology of the texture mapping of the initial three-dimensional model.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in an eleventh implementation manner of the first aspect, the user input information includes voice and/or text; the acquiring of the two-dimensional image of the target object according to the user input information includes: performing semantic recognition on the user input information to acquire first semantic information in the user input information, wherein the first semantic information comprises information of the target object; and matching in an image library according to the first semantic information to obtain the two-dimensional image of the target object.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a twelfth implementation manner of the first aspect, the user input information includes: an image; the acquiring of the two-dimensional image of the target object according to the user input information includes: acquiring contour information in the image; and matching in an image library to obtain the two-dimensional image of the target object according to the contour information.
According to the technical scheme of the embodiment of the application, under the condition of user input information of different modalities, the two-dimensional image of the target object can be obtained, so that unified first structured data can be obtained, the unified first structured data can be conveniently matched with the candidate three-dimensional model in the subsequent process, and therefore the three-dimensional model consistent with the user input information is further generated.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a thirteenth implementation manner of the first aspect, the generating method further includes: and adding the target three-dimensional model into the candidate three-dimensional model library to serve as a candidate three-dimensional model in the candidate three-dimensional model library.
Based on the technical solution of the embodiments of the application, the deformed target three-dimensional model can be added into the candidate three-dimensional model library as one of the candidate three-dimensional models. Compared with the existing candidate three-dimensional models or the initial three-dimensional model, the deformed target three-dimensional model may represent a new species type, so adding it to the library gradually enriches the species types in the candidate three-dimensional model library. In a subsequent generation process, if the species type of the target object in the user input information is the same as that of a candidate three-dimensional model in the library, that candidate can be output directly to the user as the target three-dimensional model, skipping the subsequent deformation process and improving the generation efficiency of the target three-dimensional model.
With reference to the first aspect and the foregoing implementation manner of the first aspect, in a fourteenth implementation manner of the first aspect, the method is applied to an educational scene of augmented reality AR and/or virtual reality VR.
In a second aspect, a computer-readable storage medium is provided, which stores a computer program for implementing the method for generating a three-dimensional model in the first aspect or any implementation manner of the first aspect when the computer program is executed by a processor.
In a third aspect, an apparatus for generating a three-dimensional model is provided, including: the method comprises a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for calling the computer program and executing the method for generating the three-dimensional model in the first aspect or any implementation manner of the first aspect.
In a fourth aspect, an electronic device is provided, comprising: a three-dimensional model generation apparatus as in the third aspect.
With reference to the fourth aspect, in some implementations of the fourth aspect, the electronic device further includes: an input device comprising at least one of: a voice input device, a character input device and an image input device.
Drawings
Fig. 1 is a block diagram of a logical structure of a processing system according to an embodiment of the present application.
Fig. 2 is a schematic flow chart of a method for generating a three-dimensional model according to an embodiment of the present application.
FIG. 3 is a schematic diagram of a two-dimensional image of a target object searched in an image library according to a sketch input by a user.
Fig. 4 is a schematic diagram of a binarized image and skeleton lines thereof according to an embodiment of the present disclosure.
Fig. 5 is a schematic diagram of a process for generating a three-dimensional model corresponding to the method shown in fig. 2 according to an embodiment of the present application.
Fig. 6 is a schematic flow chart diagram of another method for generating a three-dimensional model provided according to an embodiment of the present application.
Fig. 7 is a schematic diagram of a GLoSS model provided according to an embodiment of the present application.
Fig. 8 is a schematic flow chart diagram of another method for generating a three-dimensional model provided according to an embodiment of the present application.
Fig. 9 is a schematic flow chart diagram of another method for generating a three-dimensional model provided according to an embodiment of the present application.
FIG. 10 is a schematic diagram of an initial three-dimensional model fused with color/texture image segments to form a target three-dimensional model according to an embodiment of the present application.
Fig. 11 is a schematic structural block diagram of an apparatus for generating a three-dimensional model according to an embodiment of the present application.
Fig. 12 is a schematic structural block diagram of another three-dimensional model generation apparatus provided according to an embodiment of the present application.
Fig. 13 is a schematic structural block diagram of an electronic device provided according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings.
It should be understood that the specific examples are provided herein only to assist those skilled in the art in better understanding the embodiments of the present application and are not intended to limit the scope of the embodiments of the present application.
It should also be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic of the processes, and should not constitute any limitation to the implementation process of the embodiments of the present application.
It should also be understood that the various embodiments described in this specification can be implemented individually or in combination, and the examples in this application are not limited thereto.
Unless otherwise defined, all technical and scientific terms used in the examples of this application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
First, a block diagram of the logical structure of a processing system 10 provided in the present application is introduced.
As shown in fig. 1, the hardware layer of the Processing system 10 includes a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and/or the like.
Optionally, the hardware layer of the processing system 10 may further include a memory, an input/output device, a memory controller, a network interface, and the like.
The input device may include, by way of example and not limitation, a keyboard, a mouse, a touch screen, a microphone, a tablet, etc., and may include means for detecting a user operation and generating user operation information indicating the user operation.
The output device may be used to present visual information such as a user interface, images, or video, and may include, by way of example and not limitation, a display device such as a Liquid Crystal Display (LCD), a Cathode Ray Tube (CRT) display, a holographic display, or a projector, among others.
Above the hardware layer, an operating system (such as Windows, Linux, Android, etc.) and some application programs may be run. The Kernel library layer is a core part of the operating system, and includes Input/Output services (I/O services), Kernel services (Kernel services), a Graphics device interface, and a Graphics Engine (Graphics Engine) for implementing CPU and GPU Graphics processing. The graphics engine may include a 2D engine, a 3D engine, a compositor (Composition), a Frame Buffer (Frame Buffer), and the like. The processing system 10 includes, among other things, a driver layer, a framework layer, and an application layer. The driver layer may include a CPU driver, GPU driver, display controller driver, etc. The framework layer can comprise a Graphic Service (Graphic Service), a System Service (System Service), a Web Service (Web Service), a user Service (Customer Service) and the like; the graphic service may include, for example, a Widget (Widget), a Canvas (Canvas), a view (Views), a Render Script (Render Script), and the like. The application layer may include a desktop (Launcher), a Media Player (Media Player), a Browser (Browser), and the like.
Taking fig. 1 as an example, the method for generating a three-dimensional model provided in the embodiment of the present application is applied to the processing system 10, where the processing system 10 includes but is not limited to at least one electronic device, and the electronic device includes but is not limited to a terminal device and/or a server, and the like, where the terminal device may be, for example, a computer, a mobile phone, a wearable mobile device, and the like, and this application is not limited in this respect.
In the related art, there are mainly two technical routes for generating a three-dimensional model: a component splicing technology and a model deformation technology. The component splicing technology generates more three-dimensional models from a small number of three-dimensional model samples by splitting and recombining components of the models. The components are constrained through a structural representation so as to control the overall consistency of the spliced components. The model deformation technology performs local and global deformation on a model, realizing personalized feature editing under model topology constraints to obtain a new three-dimensional model.
The method for generating a three-dimensional model based on the component splicing technology imposes structural constraints on the components, but lacks an accurate and uniform representation of the components and depends on the accuracy of component segmentation. For non-rigid deformation, motion transfer, and local editing of the model, effective control and optimization of component splicing are lacking, so the resulting three-dimensional model is unattractive. The component splicing technology is mainly applied to rigid models; editing the components of non-rigid models such as animals and humans can damage model information, so the universality of the models is poor, which increases the difficulty of reconstructing the features required by users' personalized requirements.
The method for generating a three-dimensional model based on the model deformation technology needs to build a personalized model by learning certain topological constraint rules, on the premise that the models to be generated share the same structure. Because it depends on topological rules, the method is suitable for personalized editing of a specific model with the same structure, and it is difficult to obtain a new three-dimensional model with large deformation characteristics. Therefore, this method also has certain limitations and can hardly meet the various personalized requirements of users.
Based on the above problems, the application provides a new method for generating a three-dimensional model, which can generate a three-dimensional model satisfying the user requirements, having an attractive appearance and universality according to the personalized requirements of the user.
The three-dimensional model can be applied in various computer technologies, for example, Augmented Reality (AR) and/or Virtual Reality (VR). In addition, the three-dimensional model can be adapted to a variety of different scenarios to accommodate different types of needs of different types of customers. In some embodiments, the three-dimensional model can be adapted to a teaching application scenario and satisfy the demand for three-dimensional models there. Further, the three-dimensional model is suitable for AR and/or VR education scenes, where the generated three-dimensional model can bring a good visual experience to teachers and students.
Fig. 2 shows a schematic flow chart of a method 100 for generating a three-dimensional model according to an embodiment of the present application. Alternatively, the method 100 may be performed by the processing system 10 shown in FIG. 1 above as the executing agent. By way of example, the processing system may include a server and/or a terminal, and the method 100 may be performed by the server and/or the terminal.
As shown in fig. 2, the method 100 for generating the three-dimensional model may include the following steps.
S110: Acquiring a two-dimensional image of the target object according to the user input information.
S120: Performing feature extraction on the two-dimensional image to obtain first structured data of the two-dimensional image, wherein the first structured data comprises a skeleton line of the target object.
S130: Matching in a candidate three-dimensional model library according to the first structured data to obtain an initial three-dimensional model.
S140: Performing deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
Specifically, in step S110, the user input information is multi-modal information, including but not limited to one or more forms of input information, such as voice, text, lines, and pictures.
As an example, if the user input information is voice or text, semantic recognition may be performed on the voice or text to obtain the semantic information in it. Specifically, after semantic recognition, the content of the voice or text can be decomposed into interpretable linguistic units such as nouns, verbs, and adjectives, and based on these word-segmented units the user's voice or text is decomposed into first semantic information and second semantic information, wherein the first semantic information comprises information related to the target object in the user's voice or text, and the second semantic information comprises information related to the feature description of the target object.
The information related to the target object includes the type of the three-dimensional model to be generated, and the information related to the feature description of the target object may include the pose, color, texture, or the like of the target object. For example, if the user input information is "one running white horse", the target object obtained by decomposition after semantic recognition is "horse", and the feature descriptions are "one", "running", and "white".
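As a hedged illustration of this decomposition, a sketch using an off-the-shelf NLP pipeline such as spaCy might look as follows; the specific tooling and the part-of-speech heuristics are assumptions for illustration, since the application only states that semantic recognition is performed:

```python
import spacy

# Requires the small English pipeline: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def decompose(user_text):
    """Split user input into first semantic information (the target object)
    and second semantic information (its feature descriptions)."""
    doc = nlp(user_text)
    target = [t.text for t in doc if t.pos_ == "NOUN"]      # e.g. "horse"
    features = [t.text for t in doc
                if t.pos_ in ("ADJ", "VERB", "NUM")]        # e.g. "running", "white"
    return target, features

print(decompose("one running white horse"))
# roughly: (['horse'], ['one', 'running', 'white'])
```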
Optionally, in some embodiments, a two-dimensional image of the target object may be obtained by matching in the image library according to the first semantic information. As an example, in an image library in which tags are stored, a two-dimensional image including the target object is obtained by matching in the image library through tag semantic analysis. Optionally, in other embodiments, a two-dimensional image that includes the target object and satisfies or is close to satisfying the feature description may be obtained by matching in the image library according to the first semantic information and the second semantic information.
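A toy sketch of such tag-based matching, with a hypothetical in-memory library whose structure and file names are invented for illustration:

```python
# Hypothetical tag-indexed image library; a real system would use a proper index.
image_library = {
    "horse": ["horse_standing.png", "horse_running_white.png"],
    "eagle": ["eagle_flying.png"],
}

def match_two_dimensional_image(target, features):
    """Return the library image whose tags best cover the feature descriptions."""
    candidates = image_library.get(target, [])
    def score(name):
        return sum(f in name for f in features)  # crude tag-overlap score
    return max(candidates, key=score, default=None)

print(match_two_dimensional_image("horse", ["running", "white"]))
# horse_running_white.png
```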
As another example, if the user input information is an image, such as a picture, including a color picture or a black and white picture, in some embodiments, the picture may be directly used as a two-dimensional image of the target object. In other embodiments, image recognition may also be performed on a picture input by a user, a target object in the picture and an associated feature description are obtained, and a two-dimensional image of the target object is searched in an image library based on the target object and the feature description.
As another example, if the user input information is a line graph, for example, the line represents a sketch outline of the target object, in some embodiments, the user-drawn sketch outline may be analyzed, and an image closest to the sketch outline may be searched in the image library as a two-dimensional image of the target object, or in other embodiments, the user-drawn sketch outline may be directly used as the two-dimensional image of the target object. Fig. 3 shows a schematic diagram of the example in which a two-dimensional image of a target object is searched in an image library according to a sketch input by a user.
In the above examples, after processing the user input information in the form of voice, text, pictures, or line-drawn sketch outlines, a two-dimensional image of the target object may be obtained. It is understood that the target object indicates the type of the three-dimensional model required by the user, for example, different animal types (bipeds, quadrupeds, or animals of any other type), and the target object may be any naturally occurring, man-made, or imagined object in nature or even beyond nature; the embodiments of the application do not specifically limit the target object.
Alternatively, the user input information may comprise only single-modal information, such as only voice, text, pictures or line sketches, etc., alternatively, the user input information may also comprise multi-modal information, such as both voice and pictures, or text and sketches, etc. The embodiment of the present application is not limited to a specific form of the user input information.
In the above examples, the methods of speech recognition, image recognition, sketch-based image retrieval, and the like may follow specific schemes in the related art; for example, a Natural Language Processing (NLP) algorithm may be used for semantic recognition, a deep learning algorithm may be used for image recognition, and a matching algorithm may be used for image retrieval. Of course, besides the above exemplary algorithm schemes, other related technical methods may also be adopted; the embodiments of the application do not limit the specific methods, which are not described here again.
In step S120, feature extraction is performed on the two-dimensional image of the target object obtained in step S110 according to the user input information, so as to obtain first structured data of the two-dimensional image.
In particular, the first structured data of the two-dimensional image may be used to characterize structural features of the target object. For example, if the target object is an animal or a human, the first structured data may be used to characterize the structural features of its skeleton and, further, the structural features of the joint hierarchy in the skeleton.
In some embodiments, the first structured data may include a skeleton line (Skeleton) of the target object. A skeleton line is an ideal representation of an object as a thin curve that is consistent with the connectivity and topology of the original shape. In brief, it is a simplified diagram of an object's shape that lies inside the object and captures its form: branch portions describe protrusions of the shape, and ring portions describe internal cavities. For example, if the target object is an animal, its skeleton line may represent the topology of the animal's joint hierarchy.
Specifically, skeleton lines are an important topological description of image geometry, consisting of thin lines that indicate the general shape of an object. Representing the original image by its skeleton line reduces redundant information in the image while preserving its important topological characteristics.
Alternatively, various related technologies may be used to perform feature extraction on the two-dimensional image of the target object to obtain its first structured data; for example, a thinning algorithm may be used to extract the skeleton line from the binarized image corresponding to the two-dimensional image, yielding the first structured data.
As an example, one implementation of the thinning algorithm is described below.
First, binarization processing is performed on the two-dimensional image to obtain a binarized image.
Then, the binarized image is scanned; whenever a pixel with value 1 is encountered, the following template is applied to it:
P9 | P2 | P3
P8 | P1 | P4
P7 | P6 | P5
The current pixel is taken as P1 in the template, i.e., the center pixel. If it meets the following four conditions, the value of pixel P1 is set to 0; that is, P1 is deleted, or in other words eroded.
Condition (a): 2 ≤ B(P1) ≤ 6;
Condition (b): A(P1) = 1;
Condition (c): P2 × P4 × P6 = 0;
Condition (d): P4 × P6 × P8 = 0;
where A(P1) is the number of 0-to-1 transitions between adjacent pixels when traversing P2, P3, ..., P9 in clockwise order, and B(P1) is the number of non-zero pixels (i.e., pixels with value 1) among P2, P3, ..., P9.
The iteration proceeds in this manner, repeatedly eroding pixels from the binarized image, which becomes thinner and thinner, until a full pass erodes no new pixels from the result of the previous pass; the iteration then ends, and the remaining pixels form the skeleton line of the binarized image.
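The following Python sketch implements the pass described above over a 0/1 NumPy image (function names are illustrative; note that the classical Zhang-Suen thinning algorithm alternates this pass with a mirrored second sub-pass, whereas this sketch follows the single set of conditions given here):

```python
import numpy as np

def thinning_pass(img):
    """One erosion pass implementing conditions (a)-(d); returns the thinned
    image and how many pixels were eroded."""
    out = img.copy()
    eroded = 0
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if img[y, x] != 1:
                continue
            # Neighbours P2..P9, clockwise, starting from the pixel above P1.
            p = [img[y-1, x], img[y-1, x+1], img[y, x+1], img[y+1, x+1],
                 img[y+1, x], img[y+1, x-1], img[y, x-1], img[y-1, x-1]]
            b = sum(p)                                  # B(P1)
            a = sum(p[i] == 0 and p[(i + 1) % 8] == 1   # A(P1): 0 -> 1 transitions
                    for i in range(8))
            p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
            if (2 <= b <= 6 and a == 1                  # conditions (a) and (b)
                    and p2 * p4 * p6 == 0               # condition (c)
                    and p4 * p6 * p8 == 0):             # condition (d)
                out[y, x] = 0                           # erode P1
                eroded += 1
    return out, eroded

def extract_skeleton(binary_img):
    """Iterate until a full pass erodes no new pixels."""
    img = binary_img.copy()
    while True:
        img, eroded = thinning_pass(img)
        if eroded == 0:
            return img
```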
As an example, fig. 4 shows a schematic diagram of a binarized image and its skeleton line. The target object in the binarized image is a horse, and the skeleton line of the obtained binarized image is the skeleton line of the horse, wherein the skeleton line comprises the structural features of the horse joint level.
It is understood that the above embodiment is only one of many implementations of thinning algorithms; other thinning algorithm implementations in the related art, or skeleton-line extraction methods other than thinning, may also be adopted, and the embodiments of the application do not specifically limit this.
Through the above technical solution, unified first structured data of the target object can be obtained from user input information of different modalities, which facilitates matching with the candidate three-dimensional models in the subsequent process. In addition to the skeleton line of the target object, the first structured data may include contour or cell information corresponding to the skeleton line, or other structured data characterizing the structural information of the target object.
Specifically, in step S130, an initial three-dimensional model is obtained by matching in the candidate three-dimensional model library according to the first structured data obtained in step S120.
In the embodiment of the application, a candidate three-dimensional model library including at least one candidate three-dimensional model is further provided, and after unified first structured data of a target object is obtained, the initial three-dimensional model is obtained in the candidate three-dimensional model library through establishing correspondence between the first structured data and second structured data of the candidate three-dimensional model and optimizing a corresponding energy equation. Optionally, the skeleton line of the initial three-dimensional model is consistent or nearly consistent with the skeleton line of the target object to be created.
Optionally, in some embodiments, at least one candidate three-dimensional model in the library of candidate three-dimensional models has a uniform representation; for example, the at least one candidate three-dimensional model may use a Global/Local Stitched Shape (GLoSS) representation (that is, at least one candidate three-dimensional model is a GLoSS model). Through the GLoSS representation, a template three-dimensional model with good topology and rich posture information can be obtained, and model transformation based on such a template three-dimensional model yields good transformation results.
Further, in the embodiments of the application, the at least one candidate three-dimensional model includes, but is not limited to, a three-dimensional model with a quadruped topology or a three-dimensional model with a biped topology, and may also be a three-dimensional model with any of various other topologies, so as to cover more types of model changes in more scenes, making the generation method provided by the embodiments of the application more widely applicable.
Specifically, in step S140, based on the initial three-dimensional model obtained by the matching according to the user input information, deformation processing is performed on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
Further, in this step, the initial three-dimensional model is subjected to deformation processing, so that a plurality of personalized deformation models can be generated, including fusion of local and global geometric features, thereby obtaining a plurality of three-dimensional models with different forms and meeting the requirements of users.
To facilitate understanding of the technical solution of the present application, fig. 5 shows a schematic diagram of a generation process of a target three-dimensional model corresponding to the method 100. As shown in fig. 5 (a), the user input may be in a multi-modal input manner, and in fig. 5 (a), the user input information includes information related to "horse", and the "horse" is a target object corresponding to the user input information. As shown in (b) of fig. 5, first structured data of "horse" such as a skeleton line of "horse" shown in the figure may be acquired based on a two-dimensional image of "horse". As shown in fig. 5 (c), based on the skeleton line of "horse" shown in (b), an initial three-dimensional model with the same structural features of "horse" can be obtained by matching in the three-dimensional model library. As shown in fig. 5 (d), deforming the initial three-dimensional model of the "horse" can result in an optimized target three-dimensional model of the "horse".
According to the technical scheme of the embodiment of the application, the method for generating the target three-dimensional model of the corresponding target object according to the user input information is provided, and the individual requirements of the user can be met. In addition, in the embodiment of the application, unified first structured data of the target object can be obtained according to multi-modal input information of the user, the first structured data includes a skeleton line of the target object, and can reflect the structural features of the skeleton of the target object more accurately, and further can be used for representing the structural features of the joint level in the skeleton, so that the initial three-dimensional model obtained through matching of the first structured data has the same structural features as the target object in the user input information and has a higher matching degree with the user input information. Furthermore, the target three-dimensional model is obtained by performing deformation processing on the initial three-dimensional model, the target three-dimensional model has more personalized features and higher matching degree with the input information of the user, and the attractiveness of the three-dimensional model can be further improved, so that the use experience of the user is improved.
The above describes in detail, with reference to fig. 2 to 5, how unified first structured data is obtained from user input information (i.e., steps S110 and S120 shown in fig. 2). The following further describes, with reference to fig. 6 to 8, how a target three-dimensional model of the target object is generated from the first structured data (i.e., steps S130 and S140 shown in fig. 2).
Fig. 6 shows a schematic flow chart of another method 100 for generating a three-dimensional model provided by the embodiment of the present application.
As shown in fig. 6, the above step S130 may include the following steps.
S131: Acquiring second structured data corresponding to the candidate three-dimensional models in the candidate three-dimensional model library.
S132: Establishing an energy equation corresponding to the first structured data and the second structured data.
S133: Matching to obtain an initial three-dimensional model according to the optimization result of the energy equation.
Optionally, the candidate three-dimensional model in the candidate three-dimensional model library in the embodiment of the present application is a GLoSS model, and specifically, the GLoSS model is a three-dimensional articulated model in which a body shape deformation of each component is locally definable. As an example, fig. 7 shows a schematic diagram of a GLoSS model.
Optionally, the GLoSS model includes a plurality of partitions, and each of the partitions carries standard data of four dimensions: position parameters, rotation parameters, shape parameters, and pose parameters. For example, the standard data of the i-th partition are denoted respectively as: the location parameter \(I_i\), the rotation parameter \(r_i\), the shape parameter \(s_i\), and the pose parameter \(d_i\). Thus, the global geometry information of each partition in the GLoSS model is determined by its set of standard data of the four dimensions,

\[ \pi_i = \{ I_i, r_i, s_i, d_i \}. \]

The local coordinates \(p_i\) of the i-th partition can be specifically represented as

\[ \mathrm{vec}(p_i) = t_i + m_{p,i} + B_{s,i}\, s_i + B_{p,i}\, d_i, \]

where the parameter \(t_i\) is the partition template, \(m_{p,i}\) is the vector of the model at the mean pose, \(B_{s,i}\) denotes the shape space matrix, and \(B_{p,i}\) denotes the pose space matrix.
The pose space is trained with a set of standard models carrying data of different poses. Through a principal component analysis (PCA) algorithm, the mean pose vector \(m_{p,i}\) in the local coordinate system and the pose space matrix \(B_{p,i}\), which stores the offset data, are obtained.
The shape space is defined for each part. It contains seven deformations of the partition template: the overall scale, the x-axis scale, the y-axis scale, the z-axis scale, and stretches along the three axes. A stretch can be converted into a combination of scale changes; for example, stretching along the x-axis means scaling the y-axis and z-axis simultaneously.
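For intuition, the following sketch evaluates the local-coordinate formula above with NumPy; the dimensions and the random placeholder matrices are assumptions for illustration, since the application does not specify them:

```python
import numpy as np

V, K = 900, 10                            # assumed vertex count and pose-space rank
rng = np.random.default_rng(0)

t_i   = rng.standard_normal(3 * V)        # partition template, flattened x,y,z per vertex
m_p_i = rng.standard_normal(3 * V)        # mean-pose vector from PCA
B_s_i = rng.standard_normal((3 * V, 7))   # shape space matrix: 7 deformations
B_p_i = rng.standard_normal((3 * V, K))   # pose space matrix storing PCA offsets

def partition_local_coords(s_i, d_i):
    """vec(p_i) = t_i + m_{p,i} + B_{s,i} s_i + B_{p,i} d_i, as a (V, 3) array."""
    vec_p = t_i + m_p_i + B_s_i @ s_i + B_p_i @ d_i
    return vec_p.reshape(V, 3)

p_i = partition_local_coords(np.zeros(7), np.zeros(K))  # mean shape and pose
```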
In the embodiments of the application, the structured data of each partition in the aforementioned GLoSS model includes at least one of the aforementioned standard data of four dimensions, and, within the candidate three-dimensional model library, the structured data of the plurality of partitions of a candidate GLoSS model constitute the second structured data of that candidate GLoSS model.
Further, after the second structured data of the candidate GLoSS model are acquired, energy optimization is carried out over the second structured data and the first structured data acquired according to the user input information, so that an initial three-dimensional model is obtained through matching; the initial three-dimensional model is the initial GLoSS model resulting from the candidate GLoSS model after energy optimization.
Specifically, the energy equation of the second structured data and the first structured data can be expressed as

\[ E(\Pi) = E_m(d, s) + E_{stitch}(\Pi) + E_{curv}(\Pi) + E_{data}(\Pi) + E_{pose}(r), \]

which can be understood as a matching and reconstruction process from each partition of the candidate GLoSS model to the first structured data, each term corresponding to an optimization objective. Here \(E_m\) denotes the model shape optimization energy, \(E_{stitch}\) the distance between points inside associated partitions, \(E_{curv}\) the curvature similarity energy, \(E_{data}\) the sum of the distances from partition points to the first structured data together with the sum of the key-point distances, and \(E_{pose}\) the tail pose during pose motion.

The energies are calculated as follows. In the model term \(E_m(d, s)\), \(E_{sm}\) imposes a constraint on the shapes of symmetrical parts, which encourages parts of the torso to have similar lengths; \(E_s\) is expressed as the Mahalanobis distance of the shape distribution; and \(E_d\) denotes the L2-norm distance. The parameters \(k_{sm}\), \(k_s\), \(k_d\), and the parameters \(k_{st}\), \(k_c\), \(k_{kp}\), \(k_{m2s}\), and \(k_{s2m}\) below, are preset parameters, which may be obtained through data training.

In the stitching term, \(C_{ij}\) denotes the point correspondences from the i-th partition to the j-th partition, and \(C\) denotes the correspondence relation between partitions.

\[ E_{data}(\Pi) = k_{kp} E_{kp}(\Pi) + k_{m2s} E_{m2s}(\Pi) + k_{s2m} E_{s2m}(\Pi), \]

where \(E_{m2s}(\Pi)\) is the distance from the second structured data of the candidate GLoSS model to the first structured data, \(E_{s2m}(\Pi)\) is the distance from the first structured data to the second structured data of the candidate GLoSS model, \(S\) represents the vertices of the first structured data, and the function \(\rho\) is the Geman-McClure robust error function. \(E_{kp}(\Pi)\) matches key points on the candidate GLoSS model to key points on the first structured data and is calculated as the sum of squared distances between corresponding key points.
Through the optimization of the energy equation, the candidate GLoSS model can be reconstructed, and an initial three-dimensional model matched with the first structured data is obtained.
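As a minimal numerical sketch of such an energy optimization, the following toy example assembles only the data term from stand-in components and minimizes it over a simple rigid translation; the weights, the 2D toy data, and the parameterization are assumptions, since the application does not disclose the full term definitions:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import cKDTree

k_kp, k_m2s, k_s2m = 1.0, 0.5, 0.5            # stand-in preset weights

def geman_mcclure(d, sigma=1.0):
    """Geman-McClure robust error applied to distances d."""
    d2 = d ** 2
    return np.sum(d2 / (d2 + sigma ** 2))

def e_data(model_pts, skel_pts, model_kp, skel_kp):
    """E_data = k_kp E_kp + k_m2s E_m2s + k_s2m E_s2m, with nearest-neighbour
    distances standing in for the model/skeleton distance terms."""
    e_m2s = geman_mcclure(cKDTree(skel_pts).query(model_pts)[0])
    e_s2m = geman_mcclure(cKDTree(model_pts).query(skel_pts)[0])
    e_kp = np.sum((model_kp - skel_kp) ** 2)  # squared key-point distances
    return k_kp * e_kp + k_m2s * e_m2s + k_s2m * e_s2m

def objective(theta, model, model_kp, skel, skel_kp):
    t = theta.reshape(1, 2)                   # toy parameterization: 2D translation
    return e_data(model + t, skel, model_kp + t, skel_kp)

rng = np.random.default_rng(1)
skel = rng.standard_normal((200, 2))          # stand-in first structured data
model = skel + np.array([0.3, -0.2])          # stand-in partition points, offset
kp = np.array([0, 50, 100])                   # indices of corresponding key points
res = minimize(objective, x0=np.zeros(2), args=(model, model[kp], skel, skel[kp]))
print(res.x)                                  # approximately [-0.3, 0.2]
```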
Optionally, in this embodiment of the present application, the candidate three-dimensional models in the candidate three-dimensional model library may include a global three-dimensional model and/or a local three-dimensional model. Optionally, in an embodiment of the present application, the candidate three-dimensional models in the candidate three-dimensional model library include a quadruped three-dimensional model and/or a biped three-dimensional model.
It should be noted that, in the embodiment of the present application, the initial three-dimensional model is obtained by reconstructing a candidate three-dimensional model, and therefore, optionally, the initial three-dimensional model may also include a global three-dimensional model and/or a local three-dimensional model; optionally, the initial three-dimensional model may also comprise a quadruped three-dimensional model and/or a biped three-dimensional model.
Specifically, in the embodiments of the application, the candidate three-dimensional models and the initial three-dimensional model are not limited to quadruped three-dimensional models, but may also include biped three-dimensional models, or extend to more types of three-dimensional models. Further, they are not limited to global three-dimensional models and may also include local three-dimensional models, for example a local three-dimensional model of only the head, which improves the universality of the three-dimensional model generation method in the embodiments of the application.
Compared with three-dimensional models in the related art, the candidate three-dimensional models and the initial three-dimensional model in the embodiments of the application may include at least one of the following topology structures: the topology of the vertex mesh, the topology of the texture mapping, and the topology of the joint hierarchy. Specifically, the topology of the vertex mesh includes the arrangement and/or connection relations of the vertices in the mesh (Mesh); the topology of the texture mapping includes the texture mapping coordinates of each vertex in the mesh; and the topology of the joint hierarchy includes the skeleton lines.
Specifically, in the embodiment of the present application, besides representing the contour information of the candidate and initial three-dimensional models through their vertex-mesh topologies, the joint-level topologies of the candidate and initial three-dimensional models support transformation processing of the three-dimensional model at the joint level, forming joint-level three-dimensional animation, which makes the three-dimensional model animation more vivid and improves the user experience. In addition, the texture-map topologies of the candidate and initial three-dimensional models support texture mapping of the three-dimensional model, improving its attractiveness.
Therefore, based on the technical scheme of the embodiment of the application, the initial three-dimensional model is obtained by establishing a candidate three-dimensional model library and performing matching reconstruction on a candidate three-dimensional model in that library according to the first structured data determined from the user input information, so that the structured information of the initial three-dimensional model matches the user input information. Furthermore, the candidate three-dimensional model and the reconstructed initial three-dimensional model may include multiple topologies, based on which processing such as joint-level transformation and texture mapping can be supported, improving the editability and overall aesthetics of the three-dimensional model.
Specifically, after the initial three-dimensional model is obtained in step S130, its structural information is consistent with, or close to, the user input information. However, the initial three-dimensional model cannot yet be output to the user as the final three-dimensional model, because the matching so far preserves the structural constraints without fully reflecting the personalized features. Therefore, the initial three-dimensional model needs further deformation processing to embody these personalized features and obtain the target three-dimensional model that is finally output to the user.
Fig. 8 shows a schematic flow chart of another method 100 for generating a three-dimensional model provided by the embodiment of the present application.
As shown in fig. 8, the above step S140 may include the following steps.
S141: and carrying out model conversion on the initial three-dimensional model to obtain an intermediate three-dimensional model.
S142: and deforming the middle three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model.
Specifically, in the embodiment of the present application, to facilitate deforming the initial three-dimensional model according to the personalized features in the two-dimensional image, for example pose features, and thereby blend the initial three-dimensional model with those features to obtain the target three-dimensional model, the initial three-dimensional model may first undergo model conversion to obtain an intermediate three-dimensional model. Specifically, the intermediate three-dimensional model can be driven by a low-dimensional feature vector to generate deformation.
Alternatively, the intermediate three-dimensional model may be a Skinned Multi-Animal Linear (SMAL) model, or the intermediate three-dimensional model may take the SMAL representation form.
As an example, if the initial three-dimensional model is a GLoSS model and the intermediate three-dimensional model is a SMAL model, the following method may be adopted to convert the GLoSS model into the SMAL model:
first, the above GLoSS model is subjected to pose normalization to eliminate the influence of nonlinear factors at each vertex within the blocks of the GLoSS model; for example, a Linear Blend Skinning (LBS) algorithm may be used to perform the pose normalization on the GLoSS model. Once the pose is eliminated, the shape differences between different models can be measured in Euclidean space.
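For illustration only, the LBS formulation can be sketched as below; whether exactly this variant is applied is an assumption, and pose normalization would apply the inverse joint transforms to bring a posed mesh back to the rest pose:

```python
import numpy as np

def linear_blend_skinning(rest_verts, weights, joint_transforms):
    # rest_verts:       (V, 3) rest-pose vertex positions
    # weights:          (V, J) skinning weights, each row sums to 1
    # joint_transforms: (J, 4, 4) rigid transform per joint
    V = rest_verts.shape[0]
    homo = np.hstack([rest_verts, np.ones((V, 1))])            # (V, 4)
    # each vertex blends the transforms of the joints influencing it
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)
    posed = np.einsum('vab,vb->va', blended, homo)
    return posed[:, :3]
```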
Then, a principal component analysis algorithm is applied to the pose-normalized GLoSS model to obtain an average three-dimensional model and a corresponding shape offset matrix, and energy-equation optimization is performed on the SMAL model based on the shape offset matrix and the relevant parameters of the GLoSS model, so as to obtain the SMAL model corresponding to the GLoSS model.
Specifically, the SMAL model may be expressed as a function M(β, θ, γ) of a shape parameter β, a pose parameter θ and a displacement parameter γ, where the shape parameter β corresponds to a low-dimensional feature vector of the shape offset matrix, the pose parameter θ records the rotation angle of each block in the GLoSS model, and the displacement parameter γ records the global displacement of each node in the GLoSS model relative to the root node.
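A minimal sketch of this PCA step, assuming the pose-normalized meshes share one vertex ordering (the component count below is an illustrative choice):

```python
import numpy as np

def build_shape_space(normalized_verts, n_components=20):
    # normalized_verts: (N, V, 3) pose-normalized meshes of N models
    N = normalized_verts.shape[0]
    flat = normalized_verts.reshape(N, -1)                    # (N, 3V)
    mean = flat.mean(axis=0)                                  # average model
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    basis = vt[:n_components]                                 # shape offset matrix
    return mean, basis

def shape_from_beta(mean, basis, beta):
    # the low-dimensional beta reconstructs a full mesh
    return (mean + beta @ basis).reshape(-1, 3)
```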
The energy equation for registration of the GLoSS model with the SMAL model can be expressed as follows:
E(β, θ, γ) = E_pose(θ) + E_s(β) + E_data(β, θ, γ);
where E_pose(θ) and E_s(β) are the squared Mahalanobis distances between the SMAL model and the GLoSS model, computed over the pose parameter θ and the shape parameter β respectively; E_data(β, θ, γ) is calculated in the same manner as E_data(Π) above.
By minimizing this energy equation, a secondary reconstruction of the GLoSS model, namely the SMAL model, is obtained; the SMAL model can then be conveniently deformed to obtain the target three-dimensional model expected by the user.
After the intermediate three-dimensional model, namely the SMAL model, is obtained, the SMAL model is further matched and reconstructed against the two-dimensional image; deformation of the SMAL model is realized through energy optimization over a specific parameter set, yielding the target three-dimensional model expected by the user.
Specifically, to match the SMAL model to a two-dimensional image, a minimization calculation may be performed on the following equation:
E(Θ) = E_kp(Θ, x) + E_silh(Θ, S) + E_β(β) + E_θ(θ) + E_lim(θ);
The parameter Θ denotes the parameter set to be optimized and consists of {β, θ, γ, f}, where f represents the focal length. To fit the SMAL model to the object in the two-dimensional image, energy optimization is performed over the parameters in Θ to obtain the best match from the SMAL model to the two-dimensional image.
The first term of the energy equation, E_kp, performs energy matching of the keypoints.
the parameter rho represents a Geman-McClure error equation, the label x represents the position of a key point (also called a feature point) calibrated on a two-dimensional image, and VkjRepresenting the corresponding feature point position on the SMAL model, wherein in order to prevent the calibration inaccuracy, for each feature point in the two-dimensional image, the corresponding relation between the feature point and the feature point on the SMAL model is not one-to-one, but one-to-many, that is, one x point corresponds to one three-dimensional point set, kmThe number of key points in a three-dimensional point set. And averaging the sets to reduce the influence of the calibration error of the key points.
The second term of the energy equation, E_silh, performs contour (silhouette) optimization.
The parameter S is the ground-truth contour and D_S is its distance field; if x lies within the contour, D_S(x) is 0.
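As a sketch under the stated convention that D_S vanishes inside the true contour (the symmetric second term is an assumption added for bidirectional coverage):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def silhouette_energy(rendered_mask, true_mask):
    # both masks are boolean (H, W) silhouette images
    d_true = distance_transform_edt(~true_mask)       # 0 inside true contour
    d_rend = distance_transform_edt(~rendered_mask)   # 0 inside rendered contour
    e_model_to_true = d_true[rendered_mask].sum()     # render spilling outside truth
    e_true_to_model = d_rend[true_mask].sum()         # truth left uncovered
    return e_model_to_true + e_true_to_model
```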
The third and fourth terms of the energy equation correspond to shape and pose optimization, consistent with previous SMAL model set-up.
To prevent pose anomalies, a pose-limit energy is introduced into the above energy equation, namely the fifth term E_lim:
E_lim(θ) = max(θ − θ_max, 0) + max(θ_min − θ, 0);
The parameters θ_max and θ_min are preset rotation-angle limits giving the maximum and minimum values for each dimension of θ.
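This term is simple enough to transcribe directly; a literal rendering in Python:

```python
import numpy as np

def pose_limit_energy(theta, theta_min, theta_max):
    # E_lim: zero inside the allowed range, linear penalty outside it
    over = np.maximum(theta - theta_max, 0.0)
    under = np.maximum(theta_min - theta, 0.0)
    return np.sum(over + under)
```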
Next, the optimization of the above energy equation proceeds as follows. First, the depth coordinate of γ is initialized from the keypoints; then the rotation angle θ_i of each block, together with γ, is determined through energy optimization of E_kp. This completes the initialization, so that the parameters of the entire Θ set are determined before the E_silh term is optimized.
Then, a hierarchical method may be used to determine the shape and pose parameters so as to avoid falling into a local optimum; for the specific method, reference may be made to methods for determining shape and pose parameters in the related art, which are not detailed here.
Finally, the energy term E_silh is introduced. Determining the focal length is important; by including f in all the optimization terms, the algorithm forces γ to remain close to its original estimate.
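The staged schedule above might be compressed into a sketch like the following; the solver choice, the packing of Θ into one vector, and the e_* callables (each taking the packed parameter vector) are assumptions, and in the text γ's depth is seeded from the keypoints before the per-block rotations are fit:

```python
from scipy.optimize import minimize

def fit_smal(theta0, e_kp, e_silh, e_beta, e_theta, e_lim):
    # theta0: packed initial parameter vector for Θ = {β, θ, γ, f}
    # Stage 1: initialize γ and the per-block rotations from keypoints only
    s1 = minimize(e_kp, theta0, method="L-BFGS-B")
    # Stage 2: hierarchical refinement of shape and pose, no silhouette yet
    s2 = minimize(lambda p: e_kp(p) + e_beta(p) + e_theta(p) + e_lim(p),
                  s1.x, method="L-BFGS-B")
    # Stage 3: add the silhouette term for the final fit
    s3 = minimize(lambda p: e_kp(p) + e_silh(p) + e_beta(p)
                            + e_theta(p) + e_lim(p),
                  s2.x, method="L-BFGS-B")
    return s3.x
```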
Through this optimization of the energy equation, the deformation of the SMAL model based on the feature points in the two-dimensional image is completed, thereby obtaining the target three-dimensional model expected by the user.
According to the technical scheme of the embodiment of the application, the three-dimensional model can be further deformed according to the two-dimensional image derived from the user input information, producing a three-dimensional model with more personalized characteristics and further improving its vividness. In the embodiment of the application, the initial GLoSS model is converted into the intermediate SMAL model; since the SMAL model matches well against two-dimensional images, performing the deformation processing on the SMAL model according to the two-dimensional image makes the deformation both efficient and high-performing.
Fig. 9 shows a schematic flow chart of another method 100 for generating a three-dimensional model provided by the embodiment of the present application.
As shown in fig. 9, before the step S140, the method 100 for generating a three-dimensional model according to the embodiment of the present application may further include the following steps.
S150: and acquiring the color and/or texture of the target object according to the user input information.
S160: color and/or texture is mapped to the initial three-dimensional model.
Referring to the description of step S110 in fig. 1 above, in some examples, if the user input information is voice or text, the color information and/or texture information of the target object may be obtained by extracting the second semantic information, that is, the information containing the feature description of the target object, and the corresponding image segment may then be matched from the image library based on the color information and/or texture information. Alternatively, if a two-dimensional image including color and/or texture is obtained by direct matching in the image library according to the first and second semantic information, the image segment including color and/or texture may be obtained directly from that two-dimensional image.
In other examples, if the user input information is an image, for example, if the user input information is a color picture, the color and/or texture in the image may be directly extracted, or the image may be subjected to image recognition, color information and/or texture information of a target object in the picture is obtained, and a corresponding image segment is obtained by matching in the image library based on the color information and/or texture information.
In other examples, if the user input information is a line drawing, a two-dimensional image of the target object is obtained according to the line drawing, and color information and/or texture information of the target object may be obtained based on related information in the two-dimensional image, so that a corresponding image segment is obtained by matching in the image library based on the color information and/or the texture information.
Further, the image segment including the color and/or texture of the target object is fused with the initial three-dimensional model to obtain an initial three-dimensional model carrying the color and/or texture. Optionally, the initial three-dimensional model includes a texture-mapping topology containing the texture mapping coordinates of the vertices in the mesh; the image segment including the color and/or texture of the target object is mapped into the initial three-dimensional model according to these texture mapping coordinates, and the colored and/or textured initial three-dimensional model is then deformed to obtain the final, more vivid target three-dimensional model with personalized features and color and/or texture, further improving the user experience.
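A minimal sketch of this mapping step, assuming per-vertex UV coordinates and a nearest-pixel lookup (bilinear filtering would be the usual refinement):

```python
import numpy as np

def sample_vertex_colors(texture, uv_coords):
    # texture:   (H, W, 3) image segment containing the color/texture
    # uv_coords: (V, 2) per-vertex texture coordinates in [0, 1]
    h, w = texture.shape[:2]
    px = np.clip((uv_coords[:, 0] * (w - 1)).astype(int), 0, w - 1)
    # the vertical flip is a convention assumption (image origin at top-left)
    py = np.clip(((1.0 - uv_coords[:, 1]) * (h - 1)).astype(int), 0, h - 1)
    return texture[py, px]    # (V, 3) per-vertex colors
```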
By way of example, FIG. 10 illustrates a schematic diagram of an initial three-dimensional model fused with color/texture image segments to form a target three-dimensional model with a better visual effect.
In the embodiment of the present application, the initial three-dimensional model has a uniform representation form; for example, the initial three-dimensional model takes the GLoSS representation, that is, the initial three-dimensional model is a GLoSS model, and the GLoSS model includes a texture-mapping topology. Color and/or texture can therefore be conveniently fused into the GLoSS model, and the subsequent deformation processing based on the color- and/or texture-fused GLoSS model is likewise convenient; on the premise of ensuring the quality of the target three-dimensional model, the generation efficiency of the target three-dimensional model can be improved.
Optionally, in some embodiments of the application, the deformed target three-dimensional model may be added to the candidate three-dimensional model library as one of the candidate three-dimensional models. The deformed target three-dimensional model may represent a new species type compared with the existing candidate or initial three-dimensional models, so adding it to the library gradually enriches the species types in the candidate three-dimensional model library. In a subsequent three-dimensional model generation process, if the species type of the target object in the user input information is the same as that of a candidate three-dimensional model in the library, that candidate three-dimensional model may be output to the user directly as the target three-dimensional model, omitting the subsequent deformation process and improving the generation efficiency of the target three-dimensional model.
In the above, embodiments of the method for generating a three-dimensional model provided by the present application are described with reference to fig. 2 to fig. 10. Embodiments of the apparatus for generating a three-dimensional model provided by the present application are described below with reference to fig. 11 and fig. 12. It can be understood that the apparatus in the following embodiments may correspond to the execution subject of the method in the above embodiments, and the operations and/or functions of each unit in the apparatus implement the corresponding flows of the respective methods; for brevity, these are not described in detail below.
Fig. 11 shows a schematic structural block diagram of an apparatus 200 for generating a three-dimensional model according to an embodiment of the present application.
As shown in fig. 11, the generating apparatus 200 of the three-dimensional model may include:
an obtaining unit 210 for obtaining user input information;
the processing unit 220 is configured to obtain a two-dimensional image of the target object according to the user input information, and perform feature extraction on the two-dimensional image to obtain first structured data of the two-dimensional image, where the first structured data includes a skeleton line of the target object;
and the generating unit 230 is configured to match the first structured data in the candidate three-dimensional model library to obtain an initial three-dimensional model, and perform deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
In some embodiments, the generation unit 230 is configured to: acquiring second structured data corresponding to the candidate three-dimensional models in the candidate three-dimensional model library; and matching to obtain an initial three-dimensional model according to the optimization result of the energy equation corresponding to the first structured data and the second structured data.
In some embodiments, the generation unit 230 is configured to: obtaining the structured data corresponding to each of the plurality of blocks to obtain second structured data, wherein the structured data corresponding to each block includes at least one of the following parameters: position parameters, rotation parameters, shape parameters, and pose parameters.
Optionally, the candidate three-dimensional models in the candidate three-dimensional model library include: the topology of the joint level comprises skeleton lines.
Optionally, the candidate three-dimensional model further includes: a texture mapped topology comprising texture mapped coordinates of vertices in a mesh of the candidate three-dimensional model.
Optionally, the candidate three-dimensional models in the candidate three-dimensional model library comprise quadruped three-dimensional models and/or biped three-dimensional models; and/or, the candidate three-dimensional models in the candidate three-dimensional model library comprise global three-dimensional models and/or local three-dimensional models.
In some embodiments, the generation unit 230 is configured to: carrying out model conversion on the initial three-dimensional model to obtain an intermediate three-dimensional model; and deforming the middle three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model.
In some embodiments, the generation unit 230 is configured to: carrying out attitude normalization on the initial three-dimensional model; and processing the initial three-dimensional model after the attitude normalization by adopting a principal component analysis algorithm to obtain an intermediate three-dimensional model.
In some embodiments, the generation unit 230 is configured to: and performing energy optimization on the energy equation corresponding to the intermediate three-dimensional model and the two-dimensional image to deform the intermediate three-dimensional model to obtain a target three-dimensional model, wherein the energy equation comprises key points in the two-dimensional image and energy equation terms of feature points in the intermediate three-dimensional model corresponding to the key points.
In some embodiments, the generation unit 230 is further configured to: acquiring the color and/or texture of a target object according to user input information; color and/or texture is mapped to the initial three-dimensional model.
In some embodiments, the generation unit 230 is configured to: the color and/or texture is mapped to the initial three-dimensional model according to the topology of the texture map of the initial three-dimensional model.
In some embodiments, the user input information includes speech and/or text; the processing unit 220 is configured to: performing semantic recognition on user input information to acquire first semantic information in the user input information, wherein the first semantic information comprises information of a target object; and matching in the image library to obtain a two-dimensional image of the target object according to the first semantic information.
In some embodiments, the user input information comprises: an image; the processing unit 220 is configured to: acquiring contour information in an image; and matching in an image library to obtain a two-dimensional image of the target object according to the contour information.
In some embodiments, the generation unit 230 is further configured to: and adding the target three-dimensional model into the candidate three-dimensional model library to serve as a candidate three-dimensional model in the candidate three-dimensional model library.
In some embodiments, the generation apparatus 200 is applied in an educational scene of augmented reality AR and/or virtual reality VR.
Fig. 12 shows a schematic block diagram of another three-dimensional model generation apparatus 200 provided in the embodiment of the present application.
As shown in fig. 12, the generating apparatus 200 may include a processor 201, and further may include a memory 202.
It should be understood that the memory 202 is used to store computer-executable instructions.
The memory 202 may be any of various types of memory; it may include a high-speed Random Access Memory (RAM), and may further include a non-volatile memory, such as at least one disk memory, which is not limited in this embodiment of the present application.
The processor 201 is configured to access the memory 202 and execute the computer-executable instructions to perform the operations of the method for generating a three-dimensional model according to the embodiment of the present application. The processor 201 may include a microprocessor, a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and the like, which is not limited in the embodiments of the present application.
Optionally, in some embodiments, the processor 201 may include the obtaining unit 210, the processing unit 220, and the generating unit 230 in fig. 11.
Fig. 13 shows a schematic structural block diagram of an electronic device provided in the present application.
As shown in fig. 13, the electronic device 20 may include: the three-dimensional model generation apparatus 200 described above.
Optionally, the electronic device 20 may further include: input device 300, by way of example, input device 300 includes but is not limited to: a voice input device, a character input device and an image input device.
The embodiment of the application also provides a computer storage medium, on which a computer program is stored, and the computer program is used for executing the method of the embodiment of the method.
The embodiment of the present application further provides a computer program product containing a computer program, where the computer program is used to execute the method of the above method embodiment.
In particular, in the above two application embodiments, the computer program, when executed by a processor of the electronic device, causes the electronic device to perform the method of the above method embodiment.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware or any other combination. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a Digital Video Disk (DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (18)
1. A method of generating a three-dimensional model, comprising:
acquiring a two-dimensional image of a target object according to user input information;
performing feature extraction on the two-dimensional image to obtain first structured data of the two-dimensional image, wherein the first structured data comprises a skeleton line of the target object;
matching in a candidate three-dimensional model library according to the first structured data to obtain an initial three-dimensional model;
and carrying out deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object.
2. The method of generating according to claim 1, wherein said matching in a library of candidate three-dimensional models from said first structured data to obtain an initial three-dimensional model comprises:
acquiring second structured data corresponding to the candidate three-dimensional models in the candidate three-dimensional model library;
and matching to obtain the initial three-dimensional model according to the optimization result of the energy equation corresponding to the first structured data and the second structured data.
3. The method according to claim 2, wherein the candidate three-dimensional model comprises a plurality of blocks, and the obtaining of the second structured data corresponding to the candidate three-dimensional model in the candidate three-dimensional model library comprises:
obtaining the structured data corresponding to each of the plurality of blocks to obtain the second structured data, wherein the structured data corresponding to each block includes at least one of the following parameters: position parameters, rotation parameters, shape parameters, and pose parameters.
4. The method of generating as claimed in claim 1, wherein the candidate three-dimensional models in the library of candidate three-dimensional models comprise: a topology of a joint level, the topology of the joint level comprising skeleton lines.
5. The method of generating as claimed in claim 4, wherein said candidate three-dimensional model further comprises: a texture mapped topology comprising texture mapped coordinates of vertices in a mesh of the candidate three-dimensional model.
6. The generation method according to any one of claims 1 to 5, characterized in that the candidate three-dimensional models in the library of candidate three-dimensional models comprise quadruped three-dimensional models and/or biped three-dimensional models; and/or,
the candidate three-dimensional models in the candidate three-dimensional model library comprise global three-dimensional models and/or local three-dimensional models.
7. The generation method according to any one of claims 1 to 5, wherein the performing deformation processing on the initial three-dimensional model to obtain a target three-dimensional model of the target object includes:
carrying out model conversion on the initial three-dimensional model to obtain an intermediate three-dimensional model;
and deforming the intermediate three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model.
8. The method of generating as claimed in claim 7, wherein said model converting said initial three-dimensional model to an intermediate three-dimensional model comprises:
carrying out attitude normalization on the initial three-dimensional model;
and processing the initial three-dimensional model after the attitude normalization by adopting a principal component analysis algorithm to obtain the intermediate three-dimensional model.
9. The method according to claim 7, wherein the deforming the intermediate three-dimensional model according to the key points in the two-dimensional image to obtain the target three-dimensional model comprises:
and performing energy optimization on the intermediate three-dimensional model and an energy equation corresponding to the two-dimensional image to deform the intermediate three-dimensional model to obtain the target three-dimensional model, wherein the energy equation comprises key points in the two-dimensional image and energy equation terms of feature points in the intermediate three-dimensional model corresponding to the key points.
10. The generation method according to any one of claims 1 to 5, wherein before the deforming the initial three-dimensional model to obtain the target three-dimensional model, the generation method further comprises:
acquiring the color and/or texture of the target object according to the user input information;
mapping the color and/or texture of the target object to the initial three-dimensional model.
11. The generation method according to claim 10, wherein said mapping the color and/or texture of the target object to the initial three-dimensional model comprises:
mapping the color and/or texture of the target object to the initial three-dimensional model according to the topology of the texture mapping of the initial three-dimensional model.
12. The generation method according to any one of claims 1 to 5, characterized in that the user input information includes speech and/or text;
the acquiring of the two-dimensional image of the target object according to the user input information includes:
performing semantic recognition on the user input information to acquire first semantic information in the user input information, wherein the first semantic information comprises information of the target object;
and matching in an image library according to the first semantic information to obtain the two-dimensional image of the target object.
13. The generation method according to any one of claims 1 to 5, characterized in that the user input information includes: an image;
the acquiring of the two-dimensional image of the target object according to the user input information includes:
acquiring contour information in the image;
and matching in an image library according to the contour information to obtain the two-dimensional image of the target object.
14. The generation method according to any one of claims 1 to 5, characterized in that the generation method further includes:
and adding the target three-dimensional model into the candidate three-dimensional model library to serve as a candidate three-dimensional model in the candidate three-dimensional model library.
15. The generation method according to any one of claims 1 to 5, applied in an educational setting for Augmented Reality (AR) and/or Virtual Reality (VR).
16. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of generating a three-dimensional model according to any one of claims 1 to 15.
17. An apparatus for generating a three-dimensional model, comprising a processor and a memory for storing a computer program, the processor being configured to invoke the computer program to perform the method of generating a three-dimensional model according to any one of claims 1 to 15.
18. An electronic device, comprising:
means for generating a three-dimensional model according to claim 17; and
an input device comprising at least one of: a voice input device, a character input device and an image input device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110099561.1A CN112785712B (en) | 2021-01-25 | 2021-01-25 | Three-dimensional model generation method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110099561.1A CN112785712B (en) | 2021-01-25 | 2021-01-25 | Three-dimensional model generation method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112785712A true CN112785712A (en) | 2021-05-11 |
CN112785712B CN112785712B (en) | 2024-09-17 |
Family
ID=75759038
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110099561.1A Active CN112785712B (en) | 2021-01-25 | 2021-01-25 | Three-dimensional model generation method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112785712B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114327718A (en) * | 2021-12-27 | 2022-04-12 | 北京百度网讯科技有限公司 | Interface display method and device, equipment and medium |
CN114333477A (en) * | 2021-12-28 | 2022-04-12 | 南京财经大学 | Virtual simulation teaching training system based on AR technology |
CN114546562A (en) * | 2022-02-23 | 2022-05-27 | 中国第一汽车股份有限公司 | 3D vehicle model silent upgrade display method, device, equipment and storage medium |
CN116503508A (en) * | 2023-06-26 | 2023-07-28 | 南昌航空大学 | Personalized model construction method, system, computer and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100111370A1 (en) * | 2008-08-15 | 2010-05-06 | Black Michael J | Method and apparatus for estimating body shape |
CN105096387A (en) * | 2015-07-16 | 2015-11-25 | 青岛科技大学 | Intelligent three-dimensional processing method of two-dimensional sketch |
CN110288696A (en) * | 2019-06-13 | 2019-09-27 | 南京航空航天大学 | A kind of method for building up of complete consistent organism three-dimensional feature characterization model |
CN110619681A (en) * | 2019-07-05 | 2019-12-27 | 杭州同绘科技有限公司 | Human body geometric reconstruction method based on Euler field deformation constraint |
CN111862299A (en) * | 2020-06-15 | 2020-10-30 | 上海非夕机器人科技有限公司 | Human body three-dimensional model construction method and device, robot and storage medium |
- 2021-01-25: CN application CN202110099561.1A granted as CN112785712B (status: active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100111370A1 (en) * | 2008-08-15 | 2010-05-06 | Black Michael J | Method and apparatus for estimating body shape |
CN105096387A (en) * | 2015-07-16 | 2015-11-25 | 青岛科技大学 | Intelligent three-dimensional processing method of two-dimensional sketch |
CN110288696A (en) * | 2019-06-13 | 2019-09-27 | 南京航空航天大学 | A kind of method for building up of complete consistent organism three-dimensional feature characterization model |
CN110619681A (en) * | 2019-07-05 | 2019-12-27 | 杭州同绘科技有限公司 | Human body geometric reconstruction method based on Euler field deformation constraint |
CN111862299A (en) * | 2020-06-15 | 2020-10-30 | 上海非夕机器人科技有限公司 | Human body three-dimensional model construction method and device, robot and storage medium |
Non-Patent Citations (1)
Title |
---|
ZUFFI et al.: "3D Menagerie: Modeling the 3D Shape and Pose of Animals", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 12 April 2017 (2017-04-12), pages 1 - 9 *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114327718A (en) * | 2021-12-27 | 2022-04-12 | 北京百度网讯科技有限公司 | Interface display method and device, equipment and medium |
CN114333477A (en) * | 2021-12-28 | 2022-04-12 | 南京财经大学 | Virtual simulation teaching training system based on AR technology |
CN114546562A (en) * | 2022-02-23 | 2022-05-27 | 中国第一汽车股份有限公司 | 3D vehicle model silent upgrade display method, device, equipment and storage medium |
CN116503508A (en) * | 2023-06-26 | 2023-07-28 | 南昌航空大学 | Personalized model construction method, system, computer and readable storage medium |
CN116503508B (en) * | 2023-06-26 | 2023-09-08 | 南昌航空大学 | Personalized model construction method, system, computer and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112785712B (en) | 2024-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3944200B1 (en) | Facial image generation method and apparatus, device and storage medium | |
US11748934B2 (en) | Three-dimensional expression base generation method and apparatus, speech interaction method and apparatus, and medium | |
CN112785712B (en) | Three-dimensional model generation method and device and electronic equipment | |
Han et al. | Reconstructing 3D shapes from multiple sketches using direct shape optimization | |
EP3992919B1 (en) | Three-dimensional facial model generation method and apparatus, device, and medium | |
Tahara et al. | Retargetable AR: Context-aware augmented reality in indoor scenes based on 3D scene graph | |
EP3674984A1 (en) | Set of neural networks | |
CN112819947A (en) | Three-dimensional face reconstruction method and device, electronic equipment and storage medium | |
US11380061B2 (en) | Method and apparatus for three-dimensional (3D) object and surface reconstruction | |
CN111583408B (en) | Human body three-dimensional modeling system based on hand-drawn sketch | |
CN111401234B (en) | Three-dimensional character model construction method and device and storage medium | |
Bhattacharjee et al. | A survey on sketch based content creation: from the desktop to virtual and augmented reality | |
JP7294788B2 (en) | Classification of 2D images according to the type of 3D placement | |
EP3855386B1 (en) | Method, apparatus, device and storage medium for transforming hairstyle and computer program product | |
JP2002288687A (en) | Device and method for calculating feature amount | |
CN116385667B (en) | Reconstruction method of three-dimensional model, training method and device of texture reconstruction model | |
Goldenstein et al. | Statistical cue integration in dag deformable models | |
CN115578393A (en) | Key point detection method, key point training method, key point detection device, key point training device, key point detection equipment, key point detection medium and key point detection medium | |
Wei et al. | GeoDualCNN: Geometry-supporting dual convolutional neural network for noisy point clouds | |
Orvalho et al. | Transferring the rig and animations from a character to different face models | |
CN113129362A (en) | Method and device for acquiring three-dimensional coordinate data | |
CN115775300B (en) | Human body model reconstruction method, human body model reconstruction training method and device | |
Kazmi et al. | Efficient sketch‐based creation of detailed character models through data‐driven mesh deformations | |
US20230079478A1 (en) | Face mesh deformation with detailed wrinkles | |
CN115049764A (en) | Training method, device, equipment and medium for SMPL parameter prediction model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||