CN115439616B - Heterogeneous object characterization method based on multi-object image alpha superposition - Google Patents


Info

Publication number
CN115439616B
CN115439616B
Authority
CN
China
Prior art keywords
alpha
model
camera
axis
under
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211383316.4A
Other languages
Chinese (zh)
Other versions
CN115439616A (en)
Inventor
王炜
谢超平
姚仕元
张琪浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sobey Digital Technology Co Ltd
Original Assignee
Chengdu Sobey Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Sobey Digital Technology Co Ltd filed Critical Chengdu Sobey Digital Technology Co Ltd
Priority to CN202211383316.4A
Publication of CN115439616A
Application granted
Publication of CN115439616B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Generation (AREA)

Abstract

The invention discloses a heterogeneous object characterization method based on multi-object image alpha superposition, belonging to the field of computer graphics and comprising the following steps: S1, establishing a virtual modeling environment coordinate system, calibrating and calculating the camera parameters, including the intrinsic parameters Kc and extrinsic parameters Rc, according to the position of the virtual camera, and recording the light source information Lin and time Tc of the current environment; S2, performing layered rendering on the object models in the current virtual modeling environment, and outputting the rendering layer of each model under the camera parameters determined in step S1; S3, superimposing each type of object within the visual range of the virtual camera in the alpha channel, according to the distance between each layer and the virtual camera, to complete the characterization of the heterogeneous objects. The invention avoids much of the information acquisition (complex light and shadow, reflections, multiple view angles, and so on) required by multi-object rendering and greatly reduces computing and rendering resources.

Description

Heterogeneous object characterization method based on multi-object image alpha superposition
Technical Field
The invention relates to the field of computer graphics, in particular to a heterogeneous object characterization method based on multi-object image alpha superposition.
Background
Emerging video technologies such as free-viewpoint video, interactive video, and immersive video have become hot topics. Existing object modeling methods mainly include camera arrays, mesh modeling, point clouds, and neural rendering (NERF), each of which has advantages and disadvantages in different scenes.
Scene reconstruction typically models the entire scene and all objects within it using a rendering engine, which becomes expensive when there are many complex objects, shadows, reflections, and so on. In some settings, such as stage performances or reconstruction for a few specific viewing angles, full information about the whole scene or all objects is not needed; the scene only has to look correct from the given shooting angles. A flexible and highly compatible scene representation is therefore needed, one that delivers a realistic, lively picture combining virtual and real elements while keeping production cost sufficiently low.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a heterogeneous object characterization method based on multi-object image alpha superposition, which avoids much of the information acquisition (complex light and shadow, reflections, multiple view angles, and so on) required by multi-object rendering and greatly reduces computing and rendering resources.
The purpose of the invention is realized by the following scheme:
A heterogeneous object characterization method based on multi-object image alpha superposition comprises the following steps:
S1, establishing a virtual modeling environment coordinate system, calibrating and calculating the camera parameters, including the intrinsic parameters Kc and extrinsic parameters Rc, according to the position of the virtual camera, and recording the light source information Lin and time Tc of the current environment (a sketch of this camera setup follows the list);
S2, performing layered rendering on the object models in the current virtual modeling environment, and outputting the rendering layer of each model under the camera parameters determined in step S1;
S3, superimposing each type of object within the visual range of the virtual camera in the alpha channel, according to the distance between each layer and the virtual camera, to complete the characterization of the heterogeneous objects.
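For concreteness, here is a minimal Python sketch of the camera setup in step S1 under a pinhole-camera assumption. It is an illustrative reading of this step, not part of the patent text; the function names, the look-at construction, and the dictionary form of the light source information Lin are assumptions.

```python
import numpy as np

def intrinsics(fx, fy, cx, cy):
    """Pinhole intrinsic matrix Kc: focal lengths in pixels, principal point (cx, cy)."""
    return np.array([[fx, 0., cx],
                     [0., fy, cy],
                     [0., 0., 1.]])

def look_at(cam_pos, target, up=(0., 1., 0.)):
    """Extrinsics (Rc, tc), world -> camera, with camera z pointing at the target."""
    cam_pos = np.asarray(cam_pos, float)
    z = np.asarray(target, float) - cam_pos
    z /= np.linalg.norm(z)                   # camera z axis: viewing direction
    x = np.cross(-np.asarray(up, float), z)  # camera x axis, orthogonal to z and world up
    x /= np.linalg.norm(x)
    y = np.cross(z, x)                       # camera y axis, completes the right-handed frame
    Rc = np.stack([x, y, z])                 # rows are the camera axes in world coordinates
    tc = -Rc @ cam_pos                       # so that X_cam = Rc @ X_world + tc
    return Rc, tc

# Hypothetical scene setup for step S1.
Kc = intrinsics(1400., 1400., 960., 540.)         # a 1920x1080 virtual sensor
Rc, tc = look_at(cam_pos=(0., 1.6, 3.), target=(0., 0., 0.))
Lin = {"pos": (2., 3., 1.), "rgb": (1., 1., 1.)}  # light source information Lin
Tc = 0.0                                          # scene time Tc
```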
Further, step S2 comprises the following sub-steps:
S21, inputting the calibrated Kc and Rc and the recorded light source information Lin and time Tc;
S22, performing partial rendering of the different model objects under the determined camera parameters;
S23, under the determined camera parameters, outputting RGBA images of the i objects in the virtual environment according to the current camera view angle and imaging size, each image recording its distance D[i] from the focal plane (see the interface sketch after this list).
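A minimal sketch of the S21-S23 interface follows, assuming each heterogeneous object exposes a model-specific renderer; the RenderLayer record and the obj.render / obj.distance_to_focal_plane names are hypothetical, introduced only to make the data flow concrete.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class RenderLayer:
    """One S23 output: an RGBA image plus its distance to the focal plane."""
    rgba: np.ndarray  # (H, W, 4), float in [0, 1]; alpha is 1 on opaque pixels
    dist: float       # D[i], distance from this object's layer to the focal plane

def render_layers(objects, Kc, Rc, tc, Lin, Tc, size=(1080, 1920)):
    """Step S2 sketch: render every object separately under the same camera and light."""
    layers = []
    for obj in objects:
        # Model-specific partial rendering (NERF / Mesh / point cloud), step S22.
        rgba = obj.render(Kc, Rc, tc, Lin, Tc, size)
        layers.append(RenderLayer(rgba, obj.distance_to_focal_plane(Rc, tc)))
    return layers
```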
Further, in step S2, the object models include NERF models, MESH models, and point cloud models.
Further, in step S3, the objects of each type are completely opaque static objects and dynamic objects.
Further, in step S3, performing the multi-layer superposition in the alpha channel includes: the alpha output under the NERF model, the alpha output under the Mesh model, and the alpha output under the point cloud model.
Further, the alpha output under the NERF model includes the following steps:
S301, NERF model scene expression, model input, model output, model rendering, and view rendering. The NERF scene is expressed as the mapping
F: (x, d) → (c, σ)
where F is the mapping function of the NERF scene expression, x is the three-dimensional position, d is the viewing direction, x and d are known quantities, c = (r, g, b) is the view-dependent color of the 3D point, and σ is the voxel density. The model input is x and d; the output is c and σ. In model rendering, the camera ray is expressed as
r(t) = o + t·d
where r(t) is the camera ray expression function, o is the ray origin, t is the distance along the ray, and the near and far bounds of t are t_n and t_f. The ray color integral is
C(r) = ∫_{t_n}^{t_f} T(t) σ(r(t)) c(r(t), d) dt
where T(t) is the accumulated transparency function, σ(r(t)) is the voxel density along the camera ray, c(r(t), d) is the color of the camera ray in direction d, and
T(t) = exp(-∫_{t_n}^{t} σ(r(s)) ds)
with s the integration variable over the discrete points along the ray; the value range of T(t) is (0, 1). For a completely opaque object, T(t_w) = 0, where t_w is the point at which the ray meets the object surface. In view rendering, once the virtual camera's shooting view angle and position are determined, i.e. d is determined, the rendered image at that view angle is output as the collection of ray colors C(r) over all pixel rays.
S302, alpha channel output: the image rendered at the selected view angle is synthesized with the transparency channel alpha, and the transparency channel α1 of the object is output.
Further, in step S3, the alpha output under the Mesh model includes the following sub-steps:
S311, model expression of the Mesh: define M = {(T_i, C_i)}, i = 1..n, where n is a positive integer and M is the data set of the n triangles composing the object; T_i is the spatial coordinate of the i-th triangle, T_i = (x_i1, x_i2, x_i3, y_i1, y_i2, y_i3, z_i1, z_i2, z_i3), where x_i, y_i, z_i are the spatial coordinates of the three vertices of the i-th triangle on the x-, y-, and z-axes; x_i1, y_i1, z_i1 are the x-, y-, and z-axis coordinates of the 1st vertex of the i-th triangle, x_i2, y_i2, z_i2 those of the 2nd vertex, and x_i3, y_i3, z_i3 those of the 3rd vertex; and C_i is the color of the triangle;
S312, alpha channel output: with the camera view angle and orientation determined, capture the Mesh two-dimensional image information I_m at that view angle, I_m = (T_id, C_id), where T_id and C_id are the position information and color information of the triangles visible at the current view angle; since the object is considered non-transparent, occluded triangle information is not considered. The two-dimensional image of the Mesh at the specific view angle is synthesized with the alpha channel to form the transparency channel α2 of the object.
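A naive rasterization sketch of S311 and S312 follows, assuming each triangle T_i arrives as a (3, 3) vertex array with one flat color C_i. The z-buffer test is what discards the occluded triangles mentioned above; the screen-space depth interpolation is a sketch-level simplification, not the patent's renderer.

```python
import numpy as np

def project(Kc, Rc, tc, verts):
    """Project world-space vertices (N, 3) to pixel coordinates (N, 2) plus depth."""
    cam = verts @ Rc.T + tc                        # world -> camera
    uvw = cam @ Kc.T                               # camera -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]

def mesh_alpha_image(Kc, Rc, tc, triangles, colors, size=(1080, 1920)):
    """Rasterize visible triangles to RGBA: alpha 1 where covered, 0 elsewhere."""
    H, W = size
    img = np.zeros((H, W, 4), np.float32)
    depth = np.full((H, W), np.inf)
    ys, xs = np.mgrid[0:H, 0:W]
    for tri, col in zip(triangles, colors):        # tri: (3, 3) vertices T_i, col: C_i
        uv, z = project(Kc, Rc, tc, tri)
        (x0, y0), (x1, y1), (x2, y2) = uv
        den = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
        if abs(den) < 1e-9:
            continue                               # degenerate after projection
        w0 = ((y1 - y2) * (xs - x2) + (x2 - x1) * (ys - y2)) / den
        w1 = ((y2 - y0) * (xs - x2) + (x0 - x2) * (ys - y2)) / den
        w2 = 1.0 - w0 - w1
        inside = (w0 >= 0) & (w1 >= 0) & (w2 >= 0)  # barycentric coverage test
        zpix = w0 * z[0] + w1 * z[1] + w2 * z[2]    # screen-space depth interpolation
        vis = inside & (zpix < depth)               # occluded triangles are discarded
        depth[vis] = zpix[vis]
        img[vis, :3] = col
        img[vis, 3] = 1.0                           # fully opaque object -> alpha 1
    return img
```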
Further, in step S3, the alpha output under the point cloud model includes the following sub-steps:
S321, alpha output under the point cloud model: the point cloud is a set of discrete points sampled in space, with model D = {(x_i, y_i, z_i)}, i = 1..n, where n is the number of sampled points; colors are defined for the point cloud model in the x, y, and z directions to generate a point cloud model with color information, D_c = (x_i, y_i, z_i, C), where C is the (r, g, b) value at the coordinate position;
S322, alpha channel output: output the two-dimensional image of the colored point cloud model D_c at the specific view angle. Since the object model is considered to have no transparency, the image can be expressed as D_c = (x_id, y_id, z_id, C_d), where x_id, y_id, z_id are the position information at that view angle and C_d is the color information; the 2D picture is obtained from the final composited output.
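A minimal sketch of S321 and S322: project the colored point set D_c through Kc and Rc and write one opaque pixel per point. Splatting and hole filling, which a production point-cloud renderer would add, are deliberately omitted.

```python
import numpy as np

def pointcloud_alpha_image(Kc, Rc, tc, pts, rgb, size=(1080, 1920)):
    """Project colored points D_c = (x_i, y_i, z_i, C) to a 2D RGBA image."""
    H, W = size
    img = np.zeros((H, W, 4), np.float32)
    cam = pts @ Rc.T + tc                       # world -> camera
    front = cam[:, 2] > 1e-6                    # keep points in front of the camera
    cam, col = cam[front], rgb[front]
    uvw = cam @ Kc.T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    ok = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    order = np.argsort(-cam[ok, 2])             # draw far points first
    u, v, col = u[ok][order], v[ok][order], col[ok][order]
    img[v, u, :3] = col                         # nearer points overwrite farther ones
    img[v, u, 3] = 1.0                          # opaque where a point lands
    return img
```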
Further, performing the multi-layer superposition in the alpha channel includes superimposing the background, the NERF model, the Mesh model, and the point cloud model in the alpha channel.
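A minimal sketch of this superposition, assuming each layer is an (rgba, dist) pair from step S2, where dist is the distance D[i] used to order the layers from farthest to nearest before the standard "over" operation:

```python
import numpy as np

def alpha_superpose(layers, background):
    """Back-to-front 'over' compositing of per-object RGBA layers onto a background."""
    out = background.astype(np.float32).copy()           # (H, W, 3) background image
    for rgba, _ in sorted(layers, key=lambda l: -l[1]):  # far -> near by distance D[i]
        a = rgba[..., 3:4]
        out = rgba[..., :3] * a + out * (1.0 - a)        # standard alpha-over operator
    return out

# Hypothetical usage: background (4), NERF (1), Mesh (2), and point cloud (3) layers.
# frame = alpha_superpose([(nerf_rgba, d1), (mesh_rgba, d2), (cloud_rgba, d3)], bg)
```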
Further, in step S22, the different model objects undergo partial rendering under the determined camera parameters; that is, only the model parts within the camera's shooting range are rendered.
The beneficial effects of the invention include:
according to the method, the whole scene and all objects in the scene do not need to be modeled by the scene object, and only the 2D image is formed at the view angle of the audience, so that a large amount of information acquisition such as complex light and shadow, reflection, multi-view angle and the like caused by multi-object rendering can be reduced, and the computational rendering resources are greatly reduced. Meanwhile, the method supports various heterogeneous objects such as Mesh modeling, voxel, point cloud, NERF deep learning and the like, does not need to carry out object design and modeling again for adapting to the method, can support various heterogeneous objects existing in the prior art, and has good compatibility and usability.
The technical scheme of the embodiments of the invention is compatible with various heterogeneous objects, such as Mesh modeling, voxels, point clouds, and NERF deep learning, and these objects can be designed, collected, reconstructed, characterized, and rendered. Different objects can be represented by different methods, such as surfaces, voxels, point clouds, and deep learning; under the unified pose and illumination requirements of the scene, each object is rendered by its own representation method, and a 2D image with an alpha channel is output.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a diagram illustrating a unified abstraction of object hierarchy rendering in an embodiment of the present invention;
FIG. 2 is a schematic representation of a scene representation of alpha output of NERF in an embodiment of the present invention;
FIG. 3 is a schematic view rendering of alpha output of NERF in an embodiment of the present invention;
FIG. 4 is a diagram illustrating a model representation of Mesh for alpha output of Mesh in an embodiment of the present invention;
FIG. 5 is a diagram illustrating an α channel output of an α output of Mesh in an embodiment of the present invention;
FIG. 6 is a schematic diagram of an α channel output of the α output under the point cloud model according to the embodiment of the present invention;
FIG. 7 is a schematic diagram of obtaining a 2d picture according to a final synthesized output according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating steps of a method according to an embodiment of the present invention.
Detailed Description
All features disclosed in all embodiments in this specification, or all methods or process steps implicitly disclosed, may be combined and/or expanded, or substituted, in any way, except for mutually exclusive features and/or steps.
The technical scheme of the embodiments of the invention is compatible with various heterogeneous objects, such as Mesh modeling, voxels, point clouds, and NERF deep learning, and these objects can be designed, collected, reconstructed, characterized, and rendered. Different objects can be represented by different methods, such as surfaces, voxels, point clouds, and deep learning; under the unified pose and illumination requirements of the scene, each object is rendered by its own representation method, and a 2D image with an alpha channel is output.
The embodiment of the invention provides a heterogeneous object characterization method based on multi-object image alpha superposition, which comprises three steps, viewpoint determination, distributed rendering, and alpha superposition, as shown in fig. 8.
Step 1, establishing a virtual modeling environment coordinate system, calibrating and calculating camera parameters according to the position of a virtual camera, determining a viewpoint, determining three values of camera internal parameters Kc, external parameters Rc, time Tc, light source information Lin and internal parameters Kc to be input aiming at each space point x, y and z and horizontal theta and vertical phi angles, -1,0,1 to be output, and accumulating to reach five dimensions of x, y, z, theta and phi;
step 2, performing layered rendering on the object model in the current virtual modeling environment, and outputting corresponding rendering layers of the respective models determined by the camera parameters in step 1, as shown in fig. 1, specifically including:
Step 1), input the camera parameters (intrinsics Kc, extrinsics Rc, time Tc), the light source information Lin, and the time t (valid for dynamic objects).
And step 2), rendering.
And step 3), output an array of i RGBA pictures G[m][n] (in the camera pixel coordinate system), each picture recording its distance d[i] from the focal plane, together with a valid-view-angle flag marking whether "see-through" (exposure of out-of-scene content) exists. In a free shooting area there is theoretically no such see-through; if the angle is restricted, its presence or absence needs to be determined.
Step 3, perform multi-layer superposition of the scene, each type of object (static and dynamic), shadows, and maps within the visual range, taking NERF, Mesh, and point cloud as examples:
1) Alpha output of NERF: the scene is expressed as the mapping F: (x, d) → (c, σ), where F is the mapping function of the NERF scene expression, x is the three-dimensional position, d is the viewing direction, x and d are known quantities, c = (r, g, b) is the view-dependent color of the 3D point, and σ is the voxel density; the model input is x and d, and the output is c and σ, as shown in fig. 2.
In model rendering, the camera ray is expressed as r(t) = o + t·d, where o is the ray origin, t is the distance along the ray, and the near and far bounds of t are t_n and t_f. The ray color integral is C(r) = ∫_{t_n}^{t_f} T(t) σ(r(t)) c(r(t), d) dt, where T(t) is the accumulated transparency function, T(t) = exp(-∫_{t_n}^{t} σ(r(s)) ds), with value range (0, 1) and s the integration variable over the discrete points along the ray. The embodiment of the invention treats objects as completely opaque, i.e. T(t_w) = 0, where t_w is the point at which the ray meets the object surface.
In view rendering, when the virtual camera's shooting view angle and position are determined, d is determined, and the rendered image at that view angle is output as the collection of ray colors C(r) over all pixel rays.
The image rendered at the specific view angle is synthesized with the transparency channel alpha, and the transparency channel α1 of the object is output, as shown in fig. 3.
2) Alpha output of Mesh: model expression of the Mesh, as shown in fig. 4. A Mesh is a data structure used in computer graphics to model various irregular objects. Define M = {(T_i, C_i)}, i = 1..n, where M is the data set of the n triangles composing the object; T_i = (x_i1, x_i2, x_i3, y_i1, y_i2, y_i3, z_i1, z_i2, z_i3), where x_i, y_i, z_i are the spatial coordinates of the three vertices of each triangle; x_i1, y_i1, z_i1 are the x-, y-, and z-axis coordinates of the 1st vertex of the i-th triangle, x_i2, y_i2, z_i2 those of the 2nd vertex, and x_i3, y_i3, z_i3 those of the 3rd vertex; C_i is the color of the triangle.
Alpha channel output, as shown in fig. 5: with the camera view angle and orientation determined, capture the Mesh two-dimensional image information I_m at that view angle, I_m = (T_id, C_id), where T_id and C_id are the position information and color information of the triangles visible at the current view angle; the embodiment of the invention treats the object as non-transparent, so occluded triangle information is not considered. The two-dimensional image of the Mesh at the specific view angle is synthesized with the alpha channel to form the transparency channel α2 of the object.
3) Alpha output under the point cloud model: as shown in fig. 6 and fig. 7, the point cloud is a set of discrete points sampled in space, with model D = {(x_i, y_i, z_i)}, i = 1..n, where n is the number of sampled points. Since the raw data model generated by the point cloud records only the positions of the sampled points, colors must be defined for the point cloud model in the x, y, and z directions to generate a point cloud model with color information, D_c = (x_i, y_i, z_i, C), where C is the (r, g, b) value at the coordinate position.
Alpha channel output: output the two-dimensional image of the colored point cloud model D_c at the specific view angle. Since the object model is considered to have no transparency, the image can be expressed as D_c = (x_id, y_id, z_id, C_d), i.e. the position and color information at that view angle; the 2D picture is obtained from the final composited output.
As shown in fig. 7, the objects (1), (2), and (3) on the right represent different model types, namely the NERF model, the MESH model, and the point cloud model (including but not limited to these models), respectively, and (4) represents the background; the four objects are superimposed in the alpha channel to form the composite view on the left.
Example 1
A heterogeneous object characterization method based on multi-object image alpha superposition comprises the following steps:
S1, establishing a virtual modeling environment coordinate system, calibrating and calculating the camera parameters, including the intrinsic parameters Kc and extrinsic parameters Rc, according to the position of the virtual camera, and recording the light source information Lin and time Tc of the current environment;
S2, performing layered rendering on the object models in the current virtual modeling environment, and outputting the rendering layer of each model under the camera parameters determined in step S1;
S3, superimposing each type of object within the visual range of the virtual camera in the alpha channel, according to the distance between each layer and the virtual camera, to complete the characterization of the heterogeneous objects.
Example 2
On the basis of embodiment 1, step S2 comprises the following sub-steps:
S21, inputting the calibrated Kc and Rc and the recorded light source information Lin and time Tc;
S22, performing partial rendering of the different model objects under the determined camera parameters;
S23, under the determined camera parameters, outputting RGBA images of the i objects in the virtual environment according to the current camera view angle and imaging size, each image recording its distance D[i] from the focal plane.
Example 3
On the basis of embodiment 1, in step S2, the object models include NERF models, MESH models, and point cloud models.
Example 4
On the basis of embodiment 1, in step S3, the objects of each type are completely opaque static objects and dynamic objects.
Example 5
On the basis of embodiments 1 to 4, in step S3, performing the multi-layer superposition in the alpha channel includes: the alpha output under the NERF model, the alpha output under the Mesh model, and the alpha output under the point cloud model.
Example 6
On the basis of embodiment 5, the alpha output under the NERF model includes the following steps:
S301, NERF model scene expression, model input, model output, model rendering, and view rendering. The NERF scene is expressed as the mapping
F: (x, d) → (c, σ)
where F is the mapping function of the NERF scene expression, x is the three-dimensional position, d is the viewing direction, x and d are known quantities, c = (r, g, b) is the view-dependent color of the 3D point, and σ is the voxel density. The model input is x and d; the output is c and σ. In model rendering, the camera ray is expressed as
r(t) = o + t·d
where r(t) is the camera ray expression function, o is the ray origin, t is the distance along the ray, and the near and far bounds of t are t_n and t_f. The ray color integral is
C(r) = ∫_{t_n}^{t_f} T(t) σ(r(t)) c(r(t), d) dt
where T(t) is the accumulated transparency function, σ(r(t)) is the voxel density along the camera ray, c(r(t), d) is the color of the camera ray in direction d, and
T(t) = exp(-∫_{t_n}^{t} σ(r(s)) ds)
with s the integration variable over the discrete points along the ray; the value range of T(t) is (0, 1). For a completely opaque object, T(t_w) = 0, where t_w is the point at which the ray meets the object surface. In view rendering, once the virtual camera's shooting view angle and position are determined, i.e. d is determined, the rendered image at that view angle is output as the collection of ray colors C(r) over all pixel rays.
S302, alpha channel output: the image rendered at the selected view angle is synthesized with the transparency channel alpha, and the transparency channel α1 of the object is output.
Example 7
On the basis of embodiment 5, in step S3, the alpha output under the Mesh model includes the following sub-steps:
S311, model expression of the Mesh: define M = {(T_i, C_i)}, where i ranges from 1 to n, n is a positive integer, M is the data set of the n triangles composing the object, and T_i is the spatial coordinate of the i-th triangle, T_i = (x_i1, x_i2, x_i3, y_i1, y_i2, y_i3, z_i1, z_i2, z_i3), where x_i, y_i, z_i are the spatial coordinates of the three vertices of the i-th triangle on the x-, y-, and z-axes; x_i1, y_i1, z_i1 are the x-, y-, and z-axis coordinates of the 1st vertex of the i-th triangle, x_i2, y_i2, z_i2 those of the 2nd vertex, and x_i3, y_i3, z_i3 those of the 3rd vertex; and C_i is the color of the triangle;
S312, alpha channel output: with the camera view angle and orientation determined, capture the Mesh two-dimensional image information I_m at that view angle, I_m = (T_id, C_id), where T_id and C_id are the position information and color information of the triangles visible at the current view angle; since the object is considered non-transparent, occluded triangle information is not considered. The two-dimensional image of the Mesh at the specific view angle is synthesized with the alpha channel to form the transparency channel α2 of the object.
Example 8
On the basis of embodiment 5, in step S3, the alpha output under the point cloud model includes the following sub-steps:
S321, alpha output under the point cloud model: the point cloud is a set of discrete points sampled in space, with model D = {(x_i, y_i, z_i)}, i = 1..n, where n is the number of sampled points; colors are defined for the point cloud model in the x, y, and z directions to generate a point cloud model with color information, D_c = (x_i, y_i, z_i, C), where C is the (r, g, b) value at the coordinate position;
S322, alpha channel output: output the two-dimensional image of the colored point cloud model D_c at the specific view angle. Since the object model is considered to have no transparency, the image can be expressed as D_c = (x_id, y_id, z_id, C_d), where x_id, y_id, z_id are the position information at that view angle and C_d is the color information; the 2D picture is obtained from the final composited output.
Example 9
On the basis of embodiment 5, performing the multi-layer superposition in the alpha channel includes the step of superimposing the background, the NERF model, the Mesh model, and the point cloud model in the alpha channel.
Example 10
On the basis of embodiment 2, the different model objects undergo partial rendering under the determined camera parameters; that is, only the model parts within the camera's shooting range are rendered.
The units described in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, and the described units may also be disposed in a processor. Wherein the names of the elements do not in some way constitute a limitation on the elements themselves.
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations described above.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiment; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs, which when executed by one of the electronic devices, cause the electronic device to implement the method described in the above embodiments.
The parts not involved in the present invention are the same as or can be implemented using the prior art.
The above-described embodiment is only one embodiment of the present invention, and it will be apparent to those skilled in the art that various modifications and variations can be easily made based on the application and principle of the present invention disclosed in the present application, and the present invention is not limited to the method described in the above-described embodiment of the present invention, so that the above-described embodiment is only preferred, and not restrictive.
Other embodiments than the above examples may be devised by those skilled in the art based on the foregoing disclosure, or by adapting and using knowledge or techniques of the relevant art, and features of various embodiments may be interchanged or substituted and such modifications and variations that may be made by those skilled in the art without departing from the spirit and scope of the present invention are intended to be within the scope of the following claims.

Claims (10)

1. A heterogeneous object characterization method based on multi-object image alpha superposition is characterized by comprising the following steps:
S1, establishing a virtual modeling environment coordinate system, calibrating and calculating the camera parameters, including the intrinsic parameters Kc and extrinsic parameters Rc, according to the position of the virtual camera, and recording the light source information Lin and time Tc of the current environment;
S2, performing layered rendering on the object models in the current virtual modeling environment, and outputting the rendering layer of each model under the camera parameters determined in step S1;
S3, superimposing each type of object within the visual range of the virtual camera in the alpha channel, according to the distance between each layer and the virtual camera, to complete the characterization of the heterogeneous objects.
2. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 1, wherein step S2 comprises the sub-steps of:
S21, inputting the calibrated Kc and Rc and the recorded light source information Lin and time Tc;
S22, performing partial rendering of the different model objects under the determined camera parameters;
S23, under the determined camera parameters, outputting RGBA images of the i objects in the virtual environment according to the current camera view angle and imaging size, each image recording its distance D[i] from the focal plane.
3. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 1, wherein in step S2, the object models include NERF models, MESH models, and point cloud models.
4. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 1, wherein in step S3, the objects of each type are completely opaque static objects and dynamic objects.
5. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to any one of claims 1 to 4, wherein in step S3, performing the multi-layer superposition in the alpha channel comprises: the alpha output under the NERF model, the alpha output under the Mesh model, and the alpha output under the point cloud model.
6. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 5, wherein the alpha output under the NERF model comprises the steps of:
S301, NERF model scene expression, model input, model output, model rendering, and view rendering; wherein the NERF scene is expressed as the mapping F: (x, d) → (c, σ), where F is the mapping function of the NERF scene expression, x is the three-dimensional position, d is the viewing direction, x and d are known quantities, c = (r, g, b) is the view-dependent color of the 3D point, and σ is the voxel density; the model input is x and d, and the output is c and σ; in model rendering, the camera ray is expressed as r(t) = o + t·d, where o is the ray origin, t is the distance along the ray, and the near and far bounds of t are t_n and t_f; the ray color integral is C(r) = ∫_{t_n}^{t_f} T(t) σ(r(t)) c(r(t), d) dt, where T(t) is the accumulated transparency function, σ(r(t)) is the voxel density along the camera ray, c(r(t), d) is the color of the camera ray in direction d, and T(t) = exp(-∫_{t_n}^{t} σ(r(s)) ds), with s the integration variable over the discrete points along the ray; the value range of T(t) is (0, 1); for a completely opaque object, T(t_w) = 0, where t_w is the point at which the ray meets the object surface; in view rendering, when the virtual camera's shooting view angle and position are determined, i.e. d is determined, the rendered image at that view angle is output as the collection of ray colors C(r) over all pixel rays;
S302, alpha channel output: synthesizing the image rendered at the selected view angle with the transparency channel alpha, and outputting the transparency channel α1 of the object.
7. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 5, wherein in step S3, the alpha output under the Mesh model comprises the sub-steps of:
S311, model expression of the Mesh: defining M = {(T_i, C_i)}, where i ranges from 1 to n, n is a positive integer, M is the data set of the n triangles composing the object, and T_i is the spatial coordinate of the i-th triangle, T_i = (x_i1, x_i2, x_i3, y_i1, y_i2, y_i3, z_i1, z_i2, z_i3), where x_i, y_i, z_i are the spatial coordinates of the three vertices of the i-th triangle on the x-, y-, and z-axes; x_i1, y_i1, z_i1 are the x-, y-, and z-axis coordinates of the 1st vertex of the i-th triangle, x_i2, y_i2, z_i2 those of the 2nd vertex, and x_i3, y_i3, z_i3 those of the 3rd vertex; and C_i is the color of the triangle;
S312, alpha channel output: with the camera view angle and orientation determined, capturing the Mesh two-dimensional image information I_m at that view angle, I_m = (T_id, C_id), where T_id and C_id are the position information and color information of the triangles visible at the current view angle; since the object is considered non-transparent, occluded triangle information is not considered; the two-dimensional image of the Mesh at the specific view angle is synthesized with the alpha channel to form the transparency channel α2 of the object.
8. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 5, wherein in step S3, the alpha output under the point cloud model comprises the sub-steps of:
S321, alpha output under the point cloud model: the point cloud is a set of discrete points sampled in space, with model D = {(x_i, y_i, z_i)}, i = 1..n, where n is the number of sampled points and x_i, y_i, z_i are the spatial coordinates of the i-th sampled point on the x-, y-, and z-axes; colors are defined for the point cloud model in the x, y, and z directions to generate a point cloud model with color information, D_c = (x_i, y_i, z_i, C), where C is the (r, g, b) value at the coordinate position;
S322, alpha channel output: outputting the two-dimensional image of the colored point cloud model D_c at the specific view angle; since the object model is considered to have no transparency, the image can be expressed as D_c = (x_id, y_id, z_id, C_d), where x_id, y_id, z_id are the position information at that view angle and C_d is the color information; the 2D picture is obtained from the final composited output.
9. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 5, wherein the multi-layer superposition in the alpha channel comprises the step of superimposing the background, the NERF model, the Mesh model, and the point cloud model in the alpha channel.
10. The method for characterizing heterogeneous objects based on multi-object image alpha superposition according to claim 2, wherein in step S22, the different model objects undergo partial rendering under the determined camera parameters; that is, only the model parts within the camera's shooting range are rendered.
CN202211383316.4A 2022-11-07 2022-11-07 Heterogeneous object characterization method based on multi-object image alpha superposition Active CN115439616B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211383316.4A CN115439616B (en) 2022-11-07 2022-11-07 Heterogeneous object characterization method based on multi-object image alpha superposition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211383316.4A CN115439616B (en) 2022-11-07 2022-11-07 Heterogeneous object characterization method based on multi-object image alpha superposition

Publications (2)

Publication Number Publication Date
CN115439616A CN115439616A (en) 2022-12-06
CN115439616B true CN115439616B (en) 2023-02-14

Family

ID=84252680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211383316.4A Active CN115439616B (en) 2022-11-07 2022-11-07 Heterogeneous object characterization method based on multi-object image alpha superposition

Country Status (1)

Country Link
CN (1) CN115439616B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151777B (en) * 2023-04-20 2023-07-14 深圳奥雅设计股份有限公司 Intelligent automatic rendering method and system for landscape garden plan
CN117270721B (en) * 2023-11-21 2024-02-13 虚拟现实(深圳)智能科技有限公司 Digital image rendering method and device based on multi-user interaction XR scene

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064393A (en) * 1995-08-04 2000-05-16 Microsoft Corporation Method for measuring the fidelity of warped image layer approximations in a real-time graphics rendering pipeline
WO2020102978A1 (en) * 2018-11-20 2020-05-28 华为技术有限公司 Image processing method and electronic device
WO2022018454A1 (en) * 2020-07-24 2022-01-27 Sony Interactive Entertainment Europe Limited Method and system for generating a target image from plural multi-plane images
CN114529650A (en) * 2022-02-24 2022-05-24 北京鲸甲科技有限公司 Rendering method and device of game scene
CN114549719A (en) * 2022-02-23 2022-05-27 北京大甜绵白糖科技有限公司 Rendering method, rendering device, computer equipment and storage medium
WO2022227996A1 (en) * 2021-04-28 2022-11-03 北京字跳网络技术有限公司 Image processing method and apparatus, electronic device, and readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3090301A1 (en) * 2018-03-08 2019-09-12 Simile Inc. Methods and systems for producing content in multiple reality environments

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064393A (en) * 1995-08-04 2000-05-16 Microsoft Corporation Method for measuring the fidelity of warped image layer approximations in a real-time graphics rendering pipeline
WO2020102978A1 (en) * 2018-11-20 2020-05-28 华为技术有限公司 Image processing method and electronic device
WO2022018454A1 (en) * 2020-07-24 2022-01-27 Sony Interactive Entertainment Europe Limited Method and system for generating a target image from plural multi-plane images
WO2022227996A1 (en) * 2021-04-28 2022-11-03 北京字跳网络技术有限公司 Image processing method and apparatus, electronic device, and readable storage medium
CN114549719A (en) * 2022-02-23 2022-05-27 北京大甜绵白糖科技有限公司 Rendering method, rendering device, computer equipment and storage medium
CN114529650A (en) * 2022-02-24 2022-05-24 北京鲸甲科技有限公司 Rendering method and device of game scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Color 3D reconstruction based on Kinect (基于Kinect的彩色三维重建); Lei Baoquan (雷宝全) et al.; Cable TV Technology (《有线电视技术》); 2019-12-15 (No. 12); pp. 41-45 *

Also Published As

Publication number Publication date
CN115439616A (en) 2022-12-06

Similar Documents

Publication Publication Date Title
US11727587B2 (en) Method and system for scene image modification
CN115439616B (en) Heterogeneous object characterization method based on multi-object image alpha superposition
US5694533A (en) 3-Dimensional model composed against textured midground image and perspective enhancing hemispherically mapped backdrop image for visual realism
US9282321B2 (en) 3D model multi-reviewer system
US20180293774A1 (en) Three dimensional acquisition and rendering
CN111968215B (en) Volume light rendering method and device, electronic equipment and storage medium
JP6201476B2 (en) Free viewpoint image capturing apparatus and method
JP2006053694A (en) Space simulator, space simulation method, space simulation program and recording medium
JP2016537901A (en) Light field processing method
JP6683307B2 (en) Optimal spherical image acquisition method using multiple cameras
TWI810818B (en) A computer-implemented method and system of providing a three-dimensional model and related storage medium
Unger et al. Spatially varying image based lighting using HDR-video
CN115861508A (en) Image rendering method, device, equipment, storage medium and product
CN113132708B (en) Method and apparatus for acquiring three-dimensional scene image using fisheye camera, device and medium
WO2023004559A1 (en) Editable free-viewpoint video using a layered neural representation
Chang et al. A review on image-based rendering
Evers‐Senne et al. Image based interactive rendering with view dependent geometry
Waschbüsch et al. 3d video billboard clouds
JP4710081B2 (en) Image creating system and image creating method
JP3387856B2 (en) Image processing method, image processing device, and storage medium
KR100490885B1 (en) Image-based rendering method using orthogonal cross cylinder
JP3392078B2 (en) Image processing method, image processing device, and storage medium
CN113139992A (en) Multi-resolution voxel gridding
CN115439587B (en) 2.5D rendering method based on object visual range
Patel Survey on 3D Interactive Walkthrough

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant