CN117974939A

CN117974939A - Grid model simplification method and device and related equipment

Info

Publication number: CN117974939A
Application number: CN202410141066.6A
Authority: CN
Inventors: 魏榕; 王博远; 刘祥德; 赵飞飞; 严旭; 于金波; 李东
Original assignee: Beijing Digital City Research Center
Current assignee: Beijing Digital City Research Center
Priority date: 2024-01-31
Filing date: 2024-01-31
Publication date: 2024-05-03

Abstract

The disclosure provides a grid model simplification method, a grid model simplification device and related equipment, and relates to the technical field of computer graphics and computer vision. The method comprises: acquiring N first rendered images of an initial grid model under N camera poses, wherein N is an integer greater than or equal to 1; and performing multiple iterative simplifications on the initial grid model according to the N first rendered images to obtain a target grid model, wherein the target grid model is a grid model obtained from the effective iterative simplifications among the multiple iterative simplifications, the visual difference between the N second rendered images of an effective iterative simplification and the N first rendered images is smaller than or equal to a first preset threshold, and the visual difference is obtained based on a pre-trained neural network model. The disclosure can reduce or avoid unnecessary topology errors during grid model simplification and improve the visual effect of the resulting target grid model after rendering.

Description

Grid model simplification method and device and related equipment

Technical Field

The present disclosure relates to the technical field of computer vision and computer graphics, and in particular, to a method and an apparatus for simplifying a grid model, and related devices.

Background

For a complex grid model comprising a large number of triangular patches, after the number of vertexes and the number of polygons in the grid model are reduced by a model simplification technology, the storage space and rendering burden of the grid model can be obviously reduced.

In practice, it has been found that simplifying a grid model with the related technology introduces unnecessary topology errors into the grid model. On the one hand, these topology errors may break the continuity of the simplified model, for example by producing holes or broken joints; on the other hand, they may cause the simplified model to lose detail, for example sharp edges or corners may disappear.

That is, a mesh model simplified based on the related art has a poor visual effect after rendering.

Disclosure of Invention

The invention aims to provide a grid model simplification method, a grid model simplification device and related equipment, which are used for solving the technical problem that the visual effect of a grid model obtained by performing model simplification processing in the related technology is poor after model rendering.

In a first aspect, the present disclosure provides a mesh model simplification method, the method comprising:

Acquiring N first rendering images of the initial grid model under N camera poses, wherein N is an integer greater than or equal to 1;

Performing multiple iterative simplifications on the initial grid model according to the N first rendered images to obtain a target grid model, wherein the target grid model is a grid model obtained from the effective iterative simplifications among the multiple iterative simplifications, the visual difference between the N second rendered images of an effective iterative simplification and the N first rendered images is smaller than or equal to a first preset threshold, and the N second rendered images of an effective iterative simplification correspond one-to-one to the N camera poses.

In a second aspect, the present disclosure provides a mesh model simplification apparatus, the apparatus comprising:

The acquisition module is used for acquiring N first rendering images of the initial grid model under N camera poses, wherein N is an integer greater than or equal to 1;

The model simplification module is used for performing multiple iterative simplifications on the initial grid model according to the N first rendered images to obtain a target grid model, wherein the target grid model is a grid model obtained from the effective iterative simplifications among the multiple iterative simplifications, the visual difference between the N second rendered images of an effective iterative simplification and the N first rendered images is smaller than or equal to a first preset threshold, the visual difference is obtained based on a pre-trained neural network model, and the N second rendered images of an effective iterative simplification correspond one-to-one to the N camera poses.

In a third aspect, the present disclosure provides an electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method according to the first aspect.

In a fourth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method according to the first aspect.

In a fifth aspect, the present disclosure provides a computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method as described in the first aspect.

In the present disclosure, the initial grid model is iteratively simplified multiple times according to rendered images of the initial grid model under different camera poses, and the final target grid model is obtained from the effective iterative simplifications among them. An iteration is defined as an effective iterative simplification when the visual difference between its N second rendered images and the N first rendered images is smaller than or equal to a first preset threshold. By iteratively comparing the visual difference of the grid model before and after simplification in this way, unnecessary topology errors in the simplification process are reduced or avoided, and the visual effect of the rendered images of the final target grid model is improved.

Drawings

FIG. 1 is a flow diagram of a grid model simplification method provided by an embodiment of the present disclosure;

FIG. 2 is a flow diagram of another grid model simplification method provided by an embodiment of the present disclosure;

FIG. 3 is a schematic diagram of a mesh model simplification apparatus according to an embodiment of the present disclosure;

FIG. 4 is a schematic diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

The following description of the technical solutions in the embodiments of the present disclosure will be made clearly and completely with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.

An embodiment of the present disclosure provides a model simplifying method, as shown in fig. 1, including:

Step 101, acquiring N first rendering images of the initial grid model under N camera poses.

Wherein N is an integer greater than or equal to 1.

The initial mesh model is a three-dimensional model including a plurality of patches (also referred to as triangular patches), for example: a mesh model (mesh model) composed of several triangular patches.

The N first rendered images correspond one-to-one to the N camera poses. A first rendered image may be a two-dimensional image obtained by rendering the initial grid model based on the pose parameters of the corresponding camera pose, or a two-dimensional image of a target object captured by an image capturing device (such as a camera) at the corresponding camera pose. The initial grid model is a three-dimensional data representation of the target object; for example, the target object may be a church, a bedroom, a statue, and the like.

Step 102, performing multiple iterative simplifications on the initial grid model according to the N first rendered images to obtain a target grid model.

The target grid model is a grid model obtained from the effective iterative simplifications among the multiple iterative simplifications, the visual difference between the N second rendered images of an effective iterative simplification and the N first rendered images is smaller than or equal to a first preset threshold, the visual difference is obtained based on a pre-trained neural network model, and the N second rendered images of an effective iterative simplification correspond one-to-one to the N camera poses.

In the multiple iteration simplification, if the visual difference between the N second rendered images and the N first rendered images of one iteration is smaller than or equal to a first preset threshold value, the iteration is considered to be one effective iteration simplification; and if the visual difference between the N second rendering images and the N first rendering images of one iteration is larger than a first preset threshold value, the iteration is considered to be an invalid iteration simplification.

In the present disclosure, one iteration action includes: performing model simplification processing on the grid model to be simplified of the iteration to obtain a simplified model of the iteration, and rendering the simplified model of the iteration under each of the N camera poses to obtain the N second rendered images of the iteration.

When an iteration is judged to be an effective iterative simplification, the simplified model of the iteration is used as the grid model to be simplified of the next iteration or, if it is the last iteration, as the target grid model; when an iteration is judged to be an invalid iterative simplification, the iteration action is rolled back, that is, the grid model to be simplified of the iteration is used as the grid model to be simplified of the next iteration or, if it is the last iteration, as the target grid model.
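The accept-or-roll-back rule described above can be sketched as a small loop. This is only an illustrative sketch, not the disclosed implementation: the function name `simplify_with_rollback` and the callables `simplify_step`, `render_all` and `visual_difference` are hypothetical placeholders for the simplification operation, the renderer and the neural-network-based difference measure.

```python
from typing import Callable, Sequence, TypeVar

Mesh = TypeVar("Mesh")
Image = TypeVar("Image")


def simplify_with_rollback(
    initial_mesh: Mesh,
    first_images: Sequence[Image],
    num_iterations: int,
    simplify_step: Callable[[Mesh, int], Mesh],
    render_all: Callable[[Mesh], Sequence[Image]],
    visual_difference: Callable[[Sequence[Image], Sequence[Image]], float],
    threshold: float,
) -> Mesh:
    """Keep a simplification only if the renders stay visually close.

    All callables are hypothetical stand-ins:
    - simplify_step(mesh, k): returns a simplified copy of `mesh`
    - render_all(mesh): renders `mesh` under the same N camera poses
    - visual_difference(a, b): scalar difference between two image sets
    """
    current = initial_mesh  # mesh to be simplified in the first iteration
    for k in range(num_iterations):
        candidate = simplify_step(current, k)
        second_images = render_all(candidate)
        if visual_difference(first_images, second_images) <= threshold:
            # Effective iteration: the simplified model is carried forward.
            current = candidate
        # Otherwise the iteration is invalid and is rolled back, i.e.
        # `current` is left unchanged for the next iteration.
    return current  # target mesh after the last iteration
```

Because `current` is only replaced on an effective iteration, the returned mesh is always the result of the last accepted simplification (or the initial mesh if none was accepted).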

It should be understood that the specific value of the first preset threshold may be adaptively selected according to actual requirements, and the specific value of the first preset threshold is not limited in the disclosure.

For example, the disclosed target mesh model may be used within a game scene or within a virtual reality scene.

In one embodiment, the visual difference is used to indicate a human visually perceived image difference.

In this embodiment, whether the model simplification operation of the current iteration is retained or discarded is determined by the image difference perceived by human vision, so that, as seen by a user, the rendered image of the target mesh model after simplification approaches the rendered image of the initial mesh model before simplification.

In one embodiment, the step 101 includes:

Acquiring an initial grid model;

and respectively performing image rendering on the initial grid model based on the preset N camera pose parameters to obtain N first rendered images.

When the initial grid model has been obtained in advance, it is rendered with each preset camera pose parameter to generate the first rendered image corresponding to that camera pose parameter, which completes the acquisition of rendered images from an existing three-dimensional model.

By way of example, the aforementioned image rendering operations may be performed by a renderer such as PyRender or Open3D, or by another renderer.

In one embodiment, the acquiring N first rendered images of the initial mesh model at N camera poses includes:

Acquiring N first rendering images of a target object under N camera poses;

And geometrically reconstructing the N first rendering images to obtain an initial grid model for indicating the target object.

In this embodiment, a first rendered image is understood to be a real (live-action) image of the target object captured by the image capturing device at the corresponding camera pose. In this case, the N first rendered images are acquired first, and a three-dimensional model of the target object (i.e., the initial mesh model) is then obtained from the two-dimensional images of the target object under each camera pose by geometric reconstruction. This completes the acquisition of a three-dimensional model from existing two-dimensional images and fits the model simplification workflow of a user who is modeling a physical object. The geometric reconstruction scheme may be based on NeRF Mesh, in which a network is trained on the N input first rendered images to carry out the geometric reconstruction and finally obtain the initial mesh model.

In one embodiment, the step 102 includes:

Model simplification is carried out on the mesh model to be simplified of the nth iteration based on the nth first rendered image, so that a simplified model of the nth iteration is obtained, wherein n is a positive integer less than or equal to N, and the mesh model to be simplified of the first iteration is the initial mesh model;

Comparing the visual differences between the N first rendered images and the N second rendered images of the simplified model of the nth iteration to obtain visual difference values of the nth iteration;

taking the simplified model of the nth iteration as a grid model to be simplified of the (n+1) th iteration or taking the simplified model of the nth iteration as the target grid model under the condition that the visual difference value of the nth iteration is smaller than or equal to the first preset threshold;

And under the condition that the visual difference value of the nth iteration is larger than the first preset threshold, taking the mesh model to be simplified of the nth iteration as the mesh model to be simplified of the (n+1) th iteration, or taking the mesh model to be simplified of the nth iteration as the target mesh model.

The N second rendering images of the simplified model of the nth iteration are in one-to-one correspondence with the N camera poses.

In one example, when the N first rendered images are obtained by respectively performing image rendering on the initial mesh model based on the N camera pose parameters, in order to ensure the camera pose consistency between the first rendered images and the second rendered images, after obtaining the simplified model of the nth iteration, continuing to perform image rendering on the simplified model of the nth iteration by using the N camera pose parameters, so as to obtain N second rendered images of the simplified model of the nth iteration.

In another example, when the N first rendered images are the live-action images, in order to ensure the consistency of the camera pose between the first rendered images and the second rendered images, after the simplified model of the nth iteration is obtained, camera pose estimation may be performed on the N first rendered images respectively to obtain camera pose parameters corresponding to each first rendered image, and then image rendering is performed on the simplified model of the nth iteration respectively based on the camera pose parameters corresponding to each first rendered image to obtain the N second rendered images of the simplified model of the nth iteration.

Illustratively, the visual difference value of the nth iteration may be obtained by:

Based on the correspondence between the first rendered images and the camera poses and the correspondence between the second rendered images and the camera poses, regrouping the N first rendered images and the N second rendered images of the simplified model of the nth iteration into N image groups, wherein the N image groups correspond one-to-one to the N camera poses, and each image group comprises the first rendered image and the second rendered image that correspond to the same camera pose;

calculating the visual difference between the first rendered image and the second rendered image in each image group to obtain a group difference value of each image group;

taking the sum of the group difference values of the N image groups as the visual difference value of the nth iteration.
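The grouping and summation above can be sketched as follows. The names are hypothetical, images are represented as flat lists of pixel values keyed by a camera pose identifier, and `mean_abs_difference` is only a simple placeholder metric standing in for the neural-network-based visual difference (it is not a perceptual measure).

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Image = Sequence[float]  # assumption: an image is a flat list of pixel values


def group_by_pose(
    first: Dict[Hashable, Image], second: Dict[Hashable, Image]
) -> List[Tuple[Image, Image]]:
    """Pair the first and second rendered image that share a camera pose."""
    if first.keys() != second.keys():
        raise ValueError("both image sets must cover the same camera poses")
    return [(first[pose], second[pose]) for pose in first]


def mean_abs_difference(a: Image, b: Image) -> float:
    """Placeholder pairwise metric; the source uses a learned model instead."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def iteration_visual_difference(
    first: Dict[Hashable, Image],
    second: Dict[Hashable, Image],
    pair_difference: Callable[[Image, Image], float] = mean_abs_difference,
) -> float:
    """Sum the group difference values over all N image groups."""
    return sum(pair_difference(a, b) for a, b in group_by_pose(first, second))
```

Keying both image sets by pose makes the one-to-one correspondence explicit, so the order in which the images were rendered does not matter.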

It should be noted that, when n is smaller than N and the visual difference value of the nth iteration is smaller than or equal to the first preset threshold, the simplified model of the nth iteration is used as the mesh model to be simplified of the (n+1)th iteration;

when n = N and the visual difference value of the nth iteration is smaller than or equal to the first preset threshold, the simplified model of the nth iteration is used as the target mesh model;

similarly, when n is smaller than N and the visual difference value of the nth iteration is greater than the first preset threshold, the mesh model to be simplified of the nth iteration is used as the mesh model to be simplified of the (n+1)th iteration;

when n = N and the visual difference value of the nth iteration is greater than the first preset threshold, the mesh model to be simplified of the nth iteration is used as the target mesh model.

In one embodiment, the performing model simplification on the mesh model to be simplified of the nth iteration based on the nth first rendered image to obtain a simplified model of the nth iteration includes:

Clustering the normal maps of the plurality of patches included in the nth first rendered image to obtain M_n groups of normal clusters of the nth first rendered image, wherein M_n is a positive integer;

And performing iterative edge deletion on the grid model to be simplified of the nth iteration according to the M_n groups of normal clusters to obtain the simplified model of the nth iteration.

In the nth rendered image, one of the patches corresponds to a normal map, and the normal map is a result of projection of a normal vector of the corresponding patch onto the 2D camera plane.

In this embodiment, the normal maps are clustered into multiple groups of normal clusters, that is, multiple sets of similar patches. The similar patches of each normal cluster are then merged through iterative edge deletion. This exploits the prior knowledge that an artificial (man-made) plane described by many patches can instead be represented by fewer patches, so that artificial planes are simplified efficiently and the number of patches of the target grid model is further reduced.

For example, the clustering operation of the normal maps of the plurality of patches included in the nth rendered image may be performed by means of mean shift.
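As an illustration of mean shift applied to normals, the following minimal sketch clusters unit normal vectors with a flat kernel. It is an assumption-laden toy version: the function names, the `bandwidth` value, the re-normalization of each mean onto the unit sphere and the mode-merging rule are all choices made here for illustration and are not specified by the source.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


def _normalize(v: Sequence[float]) -> Vec:
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length normal vector")
    return tuple(c / norm for c in v)


def _dist(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_shift_normals(
    normals: Sequence[Sequence[float]],
    bandwidth: float = 0.3,   # hypothetical kernel radius
    max_iter: int = 50,
    tol: float = 1e-6,
) -> Tuple[List[int], List[Vec]]:
    """Flat-kernel mean shift; returns (label per normal, cluster centers)."""
    points = [_normalize(n) for n in normals]
    modes: List[Vec] = []
    for p in points:
        mode = p
        for _ in range(max_iter):
            neighbours = [q for q in points if _dist(q, mode) <= bandwidth]
            if not neighbours:
                break
            mean = [sum(q[i] for q in neighbours) / len(neighbours)
                    for i in range(3)]
            new_mode = _normalize(mean)  # keep the mode on the unit sphere
            shift = _dist(new_mode, mode)
            mode = new_mode
            if shift < tol:
                break
        modes.append(mode)

    # Merge modes that converged to (almost) the same direction.
    centers: List[Vec] = []
    labels: List[int] = []
    for mode in modes:
        for idx, center in enumerate(centers):
            if _dist(mode, center) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels, centers
```

Mean shift does not need the number of clusters in advance, which matches the source's description of M_n being a result of the clustering rather than an input.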

In one embodiment, the method further includes, before performing multiple iterative simplification on the initial mesh model according to the N first rendered images to obtain a target mesh model:

Respectively carrying out normal map rendering on the initial grid model based on preset N camera pose parameters to obtain a normal map corresponding to each first rendered image, wherein the N camera pose parameters are in one-to-one correspondence with the N camera poses;

Or alternatively

And respectively carrying out normal prediction on the N first rendering images to obtain a normal map corresponding to each first rendering image.

When the N first rendering images are the live-action images, and the initial grid model is obtained by performing geometric reconstruction based on the N first rendering images, performing normal prediction on the N first rendering images respectively so as to skip the rendering operation of the renderer and directly obtain a plurality of normal maps of a plurality of patches included in each first rendering image, thereby accelerating model simplification efficiency.

The normal prediction operation described above may be accomplished, for example, based on a preset large vision model (Large Vision Model, LVM), such as the Omnidata model.

When the N first rendering images are obtained by respectively performing image rendering on the initial grid model based on the preset N camera pose parameters, the renderer can be controlled to respectively perform normal map rendering on the initial grid model based on the preset N camera pose parameters, so as to obtain a plurality of normal maps of a plurality of patches included in each first rendering image.

In one embodiment, the performing iterative edge deletion on the mesh model to be simplified of the nth iteration according to the M_n groups of normal clusters to obtain a simplified model of the nth iteration includes:

Replacing the remaining normal vector values of the m-th group of normal clusters with the normal vector value of the cluster center of the m-th group of normal clusters to obtain a normal cluster map of the m-th group of normal clusters, wherein m is an integer less than or equal to M_n;

Performing edge detection on the normal cluster map of the m-th group of normal clusters to determine the edge contour of the m-th hyperplane;

Deleting, from the to-be-deleted edge grid model of the m-th edge deletion, the patch edges located within the edge contour of the m-th hyperplane to obtain the edge deletion model of the m-th edge deletion, wherein the to-be-deleted edge grid model of the first edge deletion in the nth iteration is the grid model to be simplified of the nth iteration;

Comparing the visual differences between the N first rendered images and the N second rendered images of the edge deletion model of the m-th edge deletion to obtain a visual difference value of the m-th edge deletion;

taking the edge deletion model of the m-th edge deletion as the to-be-deleted edge grid model of the (m+1)-th edge deletion, or as the simplified model of the nth iteration, when the visual difference value of the m-th edge deletion is smaller than or equal to the first preset threshold;

and taking the to-be-deleted edge grid model of the m-th edge deletion as the to-be-deleted edge grid model of the (m+1)-th edge deletion, or as the simplified model of the nth iteration, when the visual difference value of the m-th edge deletion is greater than the first preset threshold.

The N second rendering images of the edge deletion model of the mth edge deletion correspond to the N camera poses one by one.

It should be noted that, based on the normal vector of the cluster center of the m-th group normal cluster, after replacing the normal vectors of the normal maps of the m-th group normal cluster, the m-th group normal cluster can be understood as a hyperplane, and the normal direction of the hyperplane is the cluster center of the m-th group normal cluster, at this time, edge detection is performed on the m-th group normal cluster by using an edge detection operator (such as a Sobel operator), and the edge profile of the m-th hyperplane can be determined.
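The edge detection step can be illustrated with a plain Sobel filter. In this sketch, which rests on stated assumptions, the normal cluster map is reduced to a per-pixel membership mask for cluster m (after every member normal is replaced by the cluster center, the region is constant, so a binary mask is used here as a simplified stand-in); the function names and the `threshold` parameter are hypothetical.

```python
from typing import List, Sequence

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def cluster_mask(labels: Sequence[Sequence[int]], m: int) -> List[List[float]]:
    """1.0 where a pixel belongs to normal cluster m, 0.0 elsewhere."""
    return [[1.0 if value == m else 0.0 for value in row] for row in labels]


def sobel_edges(
    image: Sequence[Sequence[float]], threshold: float = 0.0
) -> List[List[bool]]:
    """Mark pixels whose Sobel gradient magnitude exceeds `threshold`.

    Border pixels are left as False because the 3x3 window does not fit.
    """
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            edges[y][x] = math_hypot(gx, gy) > threshold
    return edges


def math_hypot(gx: float, gy: float) -> float:
    """Gradient magnitude (kept explicit to avoid extra imports)."""
    return (gx * gx + gy * gy) ** 0.5
```

Pixels strictly inside a uniform region have zero gradient, so only the boundary of the region is marked, which is what makes the result usable as an edge contour of the hyperplane.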

The process of obtaining the visual difference value of the mth edge deletion can be referred to the description of the process of obtaining the visual difference value corresponding to the nth iteration in the foregoing embodiment, and in order to avoid repetition, the description is omitted here.

It should be noted that, when m is less than M_n and the visual difference value of the m-th edge deletion is less than or equal to the first preset threshold, the edge deletion model of the m-th edge deletion is used as the to-be-deleted edge grid model of the (m+1)-th edge deletion;

when m = M_n and the visual difference value of the m-th edge deletion is less than or equal to the first preset threshold, the edge deletion model of the m-th edge deletion is used as the simplified model of the nth iteration;

when m is less than M_n and the visual difference value of the m-th edge deletion is greater than the first preset threshold, the to-be-deleted edge grid model of the m-th edge deletion is used as the to-be-deleted edge grid model of the (m+1)-th edge deletion;

and when m = M_n and the visual difference value of the m-th edge deletion is greater than the first preset threshold, the to-be-deleted edge grid model of the m-th edge deletion is used as the simplified model of the nth iteration.
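To make the idea of deleting patch edges inside a hyperplane more concrete, the sketch below selects candidate edges directly on the mesh. Note the substitution: the source locates these edges through the image-space edge contour of the hyperplane, whereas this sketch assumes that each triangular patch has already been assigned a normal cluster label and treats an edge as interior when both patches sharing it carry the label m. The function name and data layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def interior_edges(
    faces: Sequence[Tuple[int, int, int]],
    face_labels: Sequence[int],
    m: int,
) -> List[Edge]:
    """Edges shared by exactly two triangles that both belong to cluster m.

    `faces` holds vertex-index triples; `face_labels[i]` is the (assumed,
    precomputed) normal cluster label of `faces[i]`.
    """
    edge_to_faces: Dict[Edge, List[int]] = defaultdict(list)
    for face_index, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge with sorted endpoints so (u, v) == (v, u).
            edge_to_faces[(min(u, v), max(u, v))].append(face_index)
    return sorted(
        edge
        for edge, owners in edge_to_faces.items()
        if len(owners) == 2 and all(face_labels[f] == m for f in owners)
    )
```

Edges on the boundary between two different clusters are never returned, so the outline of each planar region is preserved while its interior is a candidate for merging.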

The visual differences between the N second rendered images and the N first rendered images of each iterative simplification are obtained by computing the Learned Perceptual Image Patch Similarity (LPIPS) with a preset visual large model.

The group difference value or the visual difference value is the LPIPS value output by the visual large model after the corresponding first rendered image and the corresponding second rendered image are input into the visual large model.

In some optional embodiments, after the visual large model is obtained, model training samples may be purpose-built and used to fine-tune the model, so as to improve its LPIPS computation in the model simplification application, wherein the positive samples in the model training samples are: a rendered image before simplifying an artificial plane and a rendered image after simplifying the artificial plane; and the negative samples in the model training samples are: a rendered image before simplifying a non-artificial plane and a rendered image after simplifying the artificial plane.

For ease of understanding, examples are illustrated below:

FIG. 2 is a flow chart of a model simplification method provided by the present disclosure. As shown in FIG. 2, the method includes:

Step one, acquiring an initial grid model, and denoting the initial grid model as M.

Step two, rendering M under N camera poses with a renderer, using the texture map of M, to obtain the rendering results (namely, the operation of obtaining the rendering result before simplification in FIG. 2), that is, N rendered images (which can be understood as the N first rendered images). The N rendered images form a set I = {I_1, I_2, I_3, ..., I_N}, and the N camera pose parameters corresponding to the N camera poses form a set T = {T_1, T_2, T_3, ..., T_N}.

Step three, rendering with the renderer to obtain N normal map sets corresponding to the N rendered images, wherein the normal map set of the nth rendered image is denoted as K_n; clustering K_n by the mean shift method to obtain the M_n clusters corresponding to K_n; and determining the hyperplane indicated by each of the M_n clusters corresponding to K_n (namely, the operation of acquiring a plane area in FIG. 2) based on the normal map of the cluster center of each cluster.

Step four, deleting the edges within the m-th hyperplane of K_n (namely, the edge deletion operation in FIG. 2), and updating the data structure of the to-be-deleted edge grid model of the m-th edge deletion corresponding to K_n after the edges are deleted, to obtain the edge deletion model of the m-th edge deletion corresponding to K_n.

Step five, using the same renderer and the same rendering parameters as in step two, rendering the edge deletion model of the m-th edge deletion corresponding to K_n under the camera parameters T to obtain the rendered image set of the m-th edge deletion corresponding to K_n (namely, the operation of obtaining the simplified rendering result in FIG. 2).

Step six, calculating, with the visual large model, the LPIPS loss between the rendered image set of the m-th edge deletion corresponding to K_n and I. If the m-th LPIPS loss corresponding to K_n is smaller than or equal to a threshold, the edge deletion model of the m-th edge deletion corresponding to K_n is taken as the to-be-deleted edge grid model of the next iteration or as the target grid model; if the m-th LPIPS loss corresponding to K_n is greater than the threshold, the to-be-deleted edge grid model of the m-th edge deletion corresponding to K_n is taken as the to-be-deleted edge grid model of the next iteration or as the target grid model. Each hyperplane in the N normal map sets is traversed, and the actions of steps three to five are executed in each traversal until the traversal is finished.

It should be appreciated that the termination condition shown in fig. 2 is understood to be that a set number of traversals is reached, which is the sum of the number of hyperplanes in the N normal map sets.
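Written as a formula, with M_n denoting the number of hyperplanes (clusters) found in the nth normal map set, the set number of traversals is the following sum. The symbol C is introduced here only for illustration and does not appear in the source.

```latex
C = \sum_{n=1}^{N} M_n
```

For instance, with N = 3 normal map sets containing M_1 = 4, M_2 = 2 and M_3 = 5 hyperplanes, the traversal would end after C = 4 + 2 + 5 = 11 edge deletion attempts.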

It should be noted that, in some embodiments, the actions of the first step and the second step may be skipped, a plurality of live-action images surrounding a certain object (i.e., the target object) at a plurality of viewing angles are taken as I, and then geometric reconstruction is performed based on the I to obtain an initial mesh model; the camera pose estimation is respectively carried out on a plurality of live-action images, so that a set T can be obtained; at this time, N normal map sets in the third step are obtained by respectively performing normal prediction on a plurality of live-action images by using a large visual model.

According to the grid model simplification method, the visual large model is used to compute the simplification loss of each simplification operation during model simplification, and only edge/point contractions with a small loss under LVM evaluation are applied. Contraction operations whose simplification loss is too large are thus adaptively skipped, unnecessary topology changes are avoided, and the simplified model conforms to the visual preferences and priors of the user. In addition, hyperplanes are determined by normal clustering and then processed by edge deletion, so that artificial planes (such as a desktop, a wall surface and a ground) are found more reliably and edge/point contraction is applied to those specific areas in a targeted manner, which achieves efficient simplification of artificial planes.

Referring to fig. 3, fig. 3 is a mesh model simplifying apparatus provided in an embodiment of the present disclosure, and as shown in fig. 3, the mesh model simplifying apparatus 300 includes:

An acquiring module 301, configured to acquire N first rendered images of the initial mesh model under N camera poses, where N is an integer greater than or equal to 1;

The model simplification module 302 is configured to perform multiple iterative simplifications on the initial mesh model according to the N first rendered images to obtain a target mesh model, where the target mesh model is a mesh model obtained from the effective iterative simplifications among the multiple iterative simplifications, the visual difference between the N second rendered images of an effective iterative simplification and the N first rendered images is less than or equal to a first preset threshold, the visual difference is obtained based on a pre-trained neural network model, and the N second rendered images of an effective iterative simplification correspond one-to-one to the N camera poses.

In one embodiment, the model simplification module 302 includes:

The model simplifying sub-module is used for carrying out model simplification on the mesh model to be simplified of the nth iteration based on the nth first rendered image to obtain a simplified model of the nth iteration, wherein n is a positive integer less than or equal to N, and the mesh model to be simplified of the first iteration is the initial mesh model;

The difference comparison sub-module is used for comparing the visual differences between the N first rendered images and the N second rendered images of the simplified model of the nth iteration to obtain a visual difference value of the nth iteration;

A first model determining submodule, configured to use, when the visual difference value of the nth iteration is less than or equal to the first preset threshold, the simplified model of the nth iteration as a mesh model to be simplified of the (n+1) th iteration, or use the simplified model of the nth iteration as the target mesh model;

And the second model simplifying sub-module is used for taking the mesh model to be simplified of the nth iteration as the mesh model to be simplified of the (n+1) th iteration or taking the mesh model to be simplified of the nth iteration as the target mesh model under the condition that the visual difference value of the nth iteration is larger than the first preset threshold value.
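As a non-limiting illustration, the cooperation of the sub-modules described above may be sketched as the following accept/reject loop. The function name `simplify_with_visual_check` and the callables `simplify_step`, `render_views` and `visual_difference` are hypothetical placeholders introduced only for this sketch and are not defined by the present disclosure; the sketch assumes that one simplification is attempted per first rendered image and that a rejected candidate is simply discarded.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Mesh = TypeVar("Mesh")
Image = TypeVar("Image")


def simplify_with_visual_check(
    initial_mesh: Mesh,
    first_images: Sequence[Image],
    simplify_step: Callable[[Mesh, Image], Mesh],
    render_views: Callable[[Mesh], Sequence[Image]],
    visual_difference: Callable[[Sequence[Image], Sequence[Image]], float],
    threshold: float,
) -> Tuple[Mesh, int]:
    """Run one iteration per first rendered image; keep only effective ones.

    All callables are hypothetical stand-ins supplied by the caller.
    """
    current = initial_mesh  # mesh to be simplified in the first iteration
    accepted = 0
    for nth_image in first_images:  # iteration n uses the nth first rendered image
        candidate = simplify_step(current, nth_image)
        second_images = render_views(candidate)  # one image per camera pose
        diff = visual_difference(first_images, second_images)
        if diff <= threshold:
            # Effective iteration: the candidate becomes the next input.
            current = candidate
            accepted += 1
        # Otherwise the candidate is discarded and `current` is carried over.
    return current, accepted
```

In this sketch the mesh returned after the last iteration plays the role of the target mesh model; stopping earlier and returning `current` at that point would equally match the alternatives listed for the two model determining branches.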

In one embodiment, the model simplification sub-module comprises:

The normal clustering unit is used for clustering a plurality of normal images of a plurality of patches included in the nth first rendered image to obtain M_n groups of normal class clusters of the nth first rendered image, wherein M_n is a positive integer;

and the iterative edge deleting unit is used for carrying out iterative edge deleting on the grid model to be simplified of the nth iteration according to the M_n groups of normal class clusters to obtain a simplified model of the nth iteration.
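As a non-limiting illustration of the normal clustering unit, the following sketch groups normal vectors by direction. The clustering algorithm is not specified in this passage; the sketch assumes a simple spherical k-means with a farthest-point initialization, and the names `cluster_normals`, `k` and `iterations` are hypothetical. The number of clusters `k` stands in for M_n and is assumed to be supplied by the caller.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _unit(v: Vec3) -> Vec3:
    length = math.sqrt(_dot(v, v))
    if length == 0.0:
        raise ValueError("zero-length normal")
    return (v[0] / length, v[1] / length, v[2] / length)


def _assign(units: Sequence[Vec3], centers: Sequence[Vec3]) -> List[int]:
    # Each normal joins the center it is most aligned with (largest cosine).
    return [max(range(len(centers)), key=lambda j: _dot(u, centers[j]))
            for u in units]


def cluster_normals(normals: Sequence[Vec3], k: int,
                    iterations: int = 20) -> Tuple[List[int], List[Vec3]]:
    """Assumed stand-in: spherical k-means over unit normals."""
    if k < 1 or not normals:
        raise ValueError("need at least one cluster and one normal")
    units = [_unit(v) for v in normals]
    # Farthest-point initialization: start from the first normal, then add
    # the normal that is least aligned with the centers chosen so far.
    centers = [units[0]]
    while len(centers) < k:
        centers.append(min(units, key=lambda u: max(_dot(u, c) for c in centers)))
    for _ in range(iterations):
        labels = _assign(units, centers)
        for j in range(k):
            members = [u for u, lab in zip(units, labels) if lab == j]
            total = (sum(m[0] for m in members),
                     sum(m[1] for m in members),
                     sum(m[2] for m in members))
            if members and _dot(total, total) > 0.0:
                centers[j] = _unit(total)  # re-normalized mean direction
    return _assign(units, centers), centers
```

Patches on the same flat surface (for example a desktop) share nearly identical normals and therefore tend to fall into one cluster, which is what makes the clusters useful as candidates for planar regions.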

In one embodiment, the iterative edge deleting unit is specifically configured to:

In one embodiment, the model simplification sub-module includes a normal acquisition unit, where the normal acquisition unit is specifically configured to:

Or alternatively

In one embodiment, the acquiring module 301 includes a first acquiring unit, where the first acquiring unit is specifically configured to:

Acquiring an initial grid model;

In one embodiment, the acquiring module 301 includes a second acquiring unit, where the second acquiring unit is specifically configured to:

Acquiring N first rendering images of a target object under N camera poses;

In one embodiment, the visual difference between the N second rendered images and the N first rendered images for each of the multiple iterative simplifications is obtained by computing the learned perceptual image patch similarity (LPIPS) based on a preset large vision model.
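As a non-limiting illustration, the per-view comparison may be sketched as below. How the N per-view scores are combined into one visual difference value is not stated in this passage, so the mean (with a maximum as an alternative) is an assumption of the sketch. The function names are hypothetical, and `mean_abs_diff` is only a toy stand-in so that the sketch runs without external dependencies; it is not LPIPS, and an LPIPS scorer would be passed as `metric` instead.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # toy grayscale image: rows of pixel intensities


def mean_abs_diff(a: Image, b: Image) -> float:
    """Toy per-image distance used as a placeholder; NOT LPIPS."""
    pixels_a = [p for row in a for p in row]
    pixels_b = [p for row in b for p in row]
    if len(pixels_a) != len(pixels_b) or not pixels_a:
        raise ValueError("images must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(pixels_a, pixels_b)) / len(pixels_a)


def visual_difference(
    first_images: Sequence[Image],
    second_images: Sequence[Image],
    metric: Callable[[Image, Image], float] = mean_abs_diff,
    reduce: str = "mean",
) -> float:
    """Compare images pose by pose and combine the N per-view scores."""
    if len(first_images) != len(second_images) or not first_images:
        raise ValueError("expected one second image per first image")
    # The k-th first image and the k-th second image share a camera pose.
    per_view = [metric(a, b) for a, b in zip(first_images, second_images)]
    if reduce == "max":
        return max(per_view)  # stricter: the worst view decides
    return sum(per_view) / len(per_view)
```

Using the maximum instead of the mean would reject a simplification as soon as any single view degrades noticeably, at the cost of accepting fewer iterations.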

The mesh model simplifying apparatus 300 provided in the embodiments of the present disclosure can implement each process in the embodiments of the mesh model simplifying method, and in order to avoid repetition, a description thereof is omitted here.

According to an embodiment of the disclosure, the disclosure further provides an electronic device and a readable storage medium.

Fig. 4 illustrates a schematic block diagram of an example electronic device 400 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.

As shown in fig. 4, the apparatus 400 includes a computing unit 401 that can perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a random access Memory (Random Access Memory, RAM) 403. In RAM 403, various programs and data required for the operation of device 400 may also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.

Various components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, etc.; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408, such as a magnetic disk, optical disk, etc.; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The computing unit 401 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 401 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various specialized artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital signal processors (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 401 performs the respective methods and processes described above, such as the mesh model simplification method. For example, in some embodiments, the mesh model simplification method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into the RAM 403 and executed by the computing unit 401, one or more steps of the mesh model simplification method described above may be performed. Alternatively, in other embodiments, the computing unit 401 may be configured to perform the mesh model simplification method by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described above can be implemented in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems-on-a-chip (SOC), complex programmable logic devices (CPLD), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The term "machine-readable medium" as used herein refers to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.

The embodiment of the present application further provides a computer program product, which includes computer instructions, where the computer instructions, when executed by a processor, implement each process of the method embodiment shown in fig. 1 or fig. 2 and achieve the same technical effects, and are not repeated herein.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, which is not limited herein.

The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims

1. A method of mesh model simplification, the method comprising:

and performing multiple iterative simplifications on the initial grid model according to the N first rendered images to obtain a target grid model, wherein the target grid model is a grid model obtained from an effective iterative simplification among the multiple iterative simplifications, the visual difference between the N second rendered images of the effective iterative simplification and the N first rendered images is less than or equal to a first preset threshold, the visual difference is obtained based on a pre-trained neural network model, and the N second rendered images of the effective iterative simplification are in one-to-one correspondence with the N camera poses.

2. The method of claim 1, wherein the visual difference is used to indicate a human visually perceived image difference.

3. The method according to claim 1, wherein performing iterative simplification on the initial mesh model according to the N first rendered images for a plurality of times to obtain a target mesh model includes:

4. A method according to claim 3, wherein model simplifying the mesh model to be simplified for the nth iteration based on the nth first rendering image to obtain a simplified model for the nth iteration comprises:

clustering a plurality of normal images corresponding to the nth first rendered image to obtain M_n groups of normal class clusters corresponding to the nth first rendered image, wherein M_n is a positive integer;

5. The method of claim 4, wherein the performing iterative edge deletion on the mesh model to be simplified for the nth iteration according to the M_n groups of normal class clusters to obtain a simplified model for the nth iteration comprises:

replacing the remaining normal vector values of the m-th group of normal class clusters with the normal vector value of the cluster center of the m-th group of normal class clusters to obtain a normal cluster map of the m-th group of normal class clusters, wherein m is a positive integer less than or equal to M_n;

6. The method of claim 4, wherein before the performing multiple iterative simplifications on the initial mesh model based on the N first rendered images to obtain the target mesh model, the method further comprises:

Or alternatively

7. The method of any of claims 1-6, wherein the acquiring N first rendered images of the initial mesh model at N camera poses comprises:

Acquiring an initial grid model;

And rendering the initial grid model based on preset N camera pose parameters respectively to obtain N first rendered images, wherein the N camera pose parameters and the N camera poses are in one-to-one correspondence.

8. The method of any of claims 1-6, wherein the acquiring N first rendered images of the initial mesh model at N camera poses comprises:

Acquiring N first rendering images of a target object under N camera poses;

9. A mesh model simplification apparatus, the apparatus comprising:

10. An electronic device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, which when executed by the processor performs the steps of the method according to any one of claims 1 to 8.

11. A computer readable storage medium, characterized in that it has stored thereon a computer program which, when executed by a processor, implements the steps of the method according to any of claims 1 to 8.

12. A computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1 to 8.