WO2023272531A1

WO2023272531A1 - Image processing method and apparatus, device, and storage medium

Info

Publication number: WO2023272531A1
Application number: PCT/CN2021/103290
Authority: WO
Inventors: 杨铀; 蒋小广; 刘琼
Original assignee: Oppo广东移动通信有限公司
Priority date: 2021-06-29
Filing date: 2021-06-29
Publication date: 2023-01-05
Also published as: CN117730530A

Abstract

Provided are an image processing method and apparatus, a device, and a storage medium. The method comprises: according to the depth of a first pixel point in a depth map under a first reference viewpoint, performing region division on the depth map, and obtaining at least one region (101); inversely transforming the coordinates of an m-th second pixel point of a view to be rendered under a target viewpoint into at least one target region among the at least one region, and obtaining a presence position point of the m-th second pixel point in the at least one target region (102), M being greater than 0 and less than or equal to the total number of pixel points in the view to be rendered; and rendering the color of the m-th second pixel point according to the presence position point of the m-th second pixel point in the at least one target region (103).

Description

Image processing method and device, equipment, storage medium

technical field

The embodiments of the present application relate to image technologies, and relate to but are not limited to image processing methods, devices, equipment, and storage media.

Background technique

In applications such as virtual reality, virtual simulation, and immersive remote video conferencing, it is often necessary to synthesize views at arbitrary viewpoints based on known images, that is, synthetic views. The quality of the composite view directly affects the user's experience with the application.

Contents of the invention

The image processing method, device, equipment, and storage medium provided in the embodiments of the present application are implemented as follows:

The image processing method provided by the embodiment of the present application includes: according to the depth of the first pixel in the depth map (Depth Map) under the first reference viewpoint, performing region division on the depth map to obtain at least one region; The coordinates of the mth second pixel point of the view to be rendered are inversely transformed into at least one target area in the at least one area, and the coordinates of the mth second pixel point in the at least one target area are obtained. There is a position point; m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered; according to the existence position point of the mth second pixel point in the at least one target area, render the mth second pixel point The color of the second pixel.

The image processing device provided in the embodiment of the present application includes: a region division module, configured to perform region division on the depth map according to the depth of the first pixel point in the depth map under the first reference viewpoint to obtain at least one region; coordinates An inverse transformation module, configured to inversely transform the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, to obtain the mth second pixel point Existing position points in the at least one target area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered; the rendering module is configured to place the mth second pixel in the For at least one position point in the target area, render the color of the mth second pixel point.

The electronic device provided by the embodiment of the present application includes a memory and a processor, the memory stores a computer program that can run on the processor, and the processor implements the steps in the image processing method when executing the program.

The computer-readable storage medium provided by the embodiment of the present application stores a computer program thereon, and when the computer program is executed by a processor, the steps in the image processing method are implemented.

In the embodiment of the present application, the depth map is divided into regions according to the depth of the first pixel in the depth map under the first reference viewpoint, instead of dividing the pixels under the first reference viewpoint based on a predetermined plane distribution law. The view (Viewport) is divided into planes; in this way, because the area division combines the depth of each point in the actual scene, the color of the second pixel in the final rendering is more accurate, so that after the view to be rendered is rendered (that is, the synthetic view ) has more image detail.

Description of drawings

The accompanying drawings here are incorporated into the specification and constitute a part of the specification. These drawings show embodiments consistent with the application, and are used together with the description to describe the technical solution of the application.

Fig. 1 is a schematic diagram of the implementation flow of the image processing method of the embodiment of the present application;

FIG. 2 is a schematic flow diagram of another implementation of the image processing method of the embodiment of the present application;

FIG. 3 is a schematic flow diagram of another implementation of the image processing method of the embodiment of the present application;

Fig. 4 is the schematic diagram of the multiplane image (Multiplane Image, MPI) representation of the embodiment of the present application that is composed of 4 plane layers combined with basis functions;

Fig. 5 is the workflow of the synthesis model NeX of scene new angle of view of the embodiment of the present application (the process of obtaining H _n (v) is omitted);

FIG. 6 is a schematic diagram of a standard inverse homography transformation taking plane number D=3 as an example in the embodiment of the present application;

Fig. 7 is a schematic diagram of an example of an MPI plane layer;

Fig. 8 is the schematic workflow diagram of the PMIP (Patch Multiplane Image) model of the embodiment of the present application;

Fig. 9 is the schematic flow sheet that the embodiment of the present application obtains PMPI shape;

Fig. 10 is a schematic diagram of PMPI rendering in which the area number A=2 and the depth number is 4 in the embodiment of the present application;

FIG. 11 is a schematic diagram of a calculation flow chart of a depth map according to an embodiment of the present application;

Figure 12 is a schematic diagram of the comparison of the synthesis effect in the fern scene;

Figure 13 Schematic diagram of the comparison of the synthesis effect in the trex scene;

FIG. 14 is a schematic structural diagram of an image processing device according to an embodiment of the present application;

FIG. 15 is a schematic diagram of a hardware entity of an electronic device according to an embodiment of the present application;

FIG. 16 is a schematic diagram of another hardware entity of the electronic device according to the embodiment of the present application.

detailed description

In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the specific technical solutions of the present application will be further described in detail below in conjunction with the drawings in the embodiments of the present application. The following examples are used to illustrate the present application, but not to limit the scope of the present application.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application, and are not intended to limit the present application.

In the following description, references to "some embodiments" describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or a different subset of all possible embodiments, and Can be combined with each other without conflict.

It should be pointed out that the term "first\second\third" involved in the embodiment of this application is to distinguish similar or different objects, and does not represent a specific ordering of objects. Understandably, "first\second\third The specific order or sequence of "three" can be interchanged where permitted so that the embodiments of the application described herein can be practiced in other orders than those illustrated or described herein.

An embodiment of the present application provides an image processing method, which is applied to an electronic device, and the electronic device may be any device capable of data processing, for example, the electronic device is a notebook computer, a mobile phone, a server, a TV, or a projector.

Fig. 1 is a schematic diagram of the implementation flow of the image processing method of the embodiment of the present application. As shown in Fig. 1, the method may include the following steps 101 to 103:

Step 101, according to the depth of the first pixel in the depth map under the first reference viewpoint, perform region division on the depth map to obtain at least one region.

It should be noted that there is no limit to the area division range of the depth map. In some embodiments, a certain block or several blocks of the depth map can be divided into the area; in other embodiments, The entire depth map can be divided into regions.

The number of divided regions may or may not be a specific number. If it is not a specific number, the number of divided regions is related to the actual scene, that is, related to the depth distribution of the first pixel in the depth map.

In the embodiment of the present application, no limitation is imposed on the method for obtaining the depth map. For example, it can be based on binocular stereo vision, that is, two images of the same scene are simultaneously acquired through two cameras carried by an electronic device at a certain distance, and the corresponding pixels in the two images are found through a stereo matching algorithm, and then calculated according to the triangulation principle The disparity information can be obtained, and the disparity information can be used to represent the depth information of objects in the scene through conversion. As another example, the acquisition of the depth information of the scene is realized through the active ranging sensor carried by the electronic device; wherein, the active ranging sensor may be, for example, a time of flight (Time of flight, TOF) camera, a structured light device, or a laser radar. For another example, the electronic device may also obtain the depth map under the first reference viewpoint through steps 201 to 203 of the following embodiment, that is, obtain the depth under the first reference viewpoint based on at least one view under the second reference viewpoint Figure; where the first reference viewpoint is different from the second reference viewpoint. Compared with the method of binocular stereo vision and active ranging sensor, this method does not require the electronic device to have a binocular camera, nor does it need to have an active ranging sensor to obtain a depth map, so that the image processing provided by the embodiment of the present application The method can be applied to more electronic devices, and its universality is stronger.

In some embodiments, the code stream may also be decoded by receiving the code stream sent by the encoding end, so as to obtain the depth map under the first reference viewpoint.

In the embodiment of the present application, there is no limitation on the area division method. The depth map or a block or blocks in the depth map may be divided into regions according to the depth relationship between the first pixels. For example, divide pixels with equal depth into the same area. In another example, pixels with depth differences within a specific range are divided into the same area.

Step 102: Inversely transform the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, and obtain the mth second pixel point in the Existing position points in at least one target area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered.

It should be noted that the at least one target area may be all areas in the at least one area, or one or more areas; when it is all areas, there is no need to screen the at least one area, and the area The at least one divided area may be used as the target area.

When the at least one target area is one or more of them, in some embodiments, any area in the at least one area can be used as the target area, for example, a specific number of areas are randomly selected as the target area; In some other embodiments, an area satisfying a specific condition may also be selected from the at least one area as the target area. For example, an area whose number of pixels is larger than a specific number is used as a target area; another example is an area whose depth is smaller than a specific depth is used as a target area.

In some embodiments, the homogeneous coordinates (u _t ,v _t ,1) of the second pixel point of the view to be rendered under the target viewpoint can be inversely transformed into the target area by the following formula (1), so as to obtain the pixel The homogeneous coordinates (u _s ,v _s ,1) of the point in the target area under the first reference viewpoint:

In the formula, ~ means equal in a certain proportion, and R and t are the rotation matrix and translation vector from the camera coordinate system of the first reference viewpoint to the camera coordinate system of the target viewpoint. a is the negative value of the region depth of the target region. If the depths of the pixels in the same target region are not equal, the average or median depth of the pixels in the region can be used as the depth of the region. n=(0,0,1) is the unit normal vector of the MPI plane in the camera coordinate system of the first reference viewpoint. k _s and k _t are camera internal parameters corresponding to the first reference viewpoint and the target viewpoint respectively.

Step 103: Render the color of the mth second pixel according to the position of the mth second pixel in the at least one target area.

In some embodiments, step 103 may be implemented through

steps

210 and 211 of the following embodiments.

In the embodiment of the present application, the depth map is divided into regions according to the depth of the first pixel in the depth map under the first reference viewpoint, instead of dividing the pixels under the first reference viewpoint based on a predetermined plane distribution law. The view is plane-divided; in this way, since the area division combines the depth of each point in the actual scene, the color of the second pixel in the final rendering is more accurate, and then the image of the view to be rendered after being rendered (that is, the synthetic view) More details.

The embodiment of the present application further provides an image processing method. FIG. 2 is a schematic diagram of the implementation process of the image processing method in the embodiment of the present application. As shown in FIG. 2, the method may include the following steps 201 to 211:

Step 201, perform three-dimensional reconstruction on the included scene according to at least one view under the second reference viewpoint, and obtain point cloud data of the scene under the camera coordinate system of the first reference viewpoint.

In some embodiments, the sparse view of the scene can be used as the input of the colmap tool to perform camera parameter estimation and multi-dimensional stereo reconstruction (MVS) from motion to structure (SFM), thereby obtaining the point cloud data; wherein, the point cloud data The coordinates of the midpoint are denoted as (x,y,d), where d represents the depth of the point relative to the camera of the first reference viewpoint.

It should be noted that the sparse views of the scene are views under different second reference viewpoints.

Step 202, determining the disparity map of the scene.

In the embodiment of the present application, there may be various methods for determining the disparity map. For example, the method based on binocular stereo vision mentioned above. As another example, in some embodiments, the electronic device may implement step 202 in the following way: Obtain a transparency map of at least one plane of the scene according to at least one view under the second reference viewpoint; and the corresponding plane depth to synthesize the disparity map of the scene; thus, compared to the method based on binocular stereo vision, electronic devices without binocular cameras can still implement the image processing method, so its universality Stronger, and save the hardware cost of electronic equipment.

For example, using the at least one view under the second reference viewpoint as input, the MPI representation of the scene is synthesized through the NeX model, and based on this, the disparity map is synthesized according to the following formula (2)

Among them, d _i represents the depth of the i-th MPI plane (sorted from far to near), and α _i represents the transparency of the i-th flat MPI plane. D represents the number of transparency maps.

Step 203, obtain a depth map under the first reference viewpoint according to the disparity map and the point cloud data.

In some embodiments, the electronic device can implement step 203 in this way: according to the disparity map and the point cloud data, obtain the inverse proportional coefficient between the disparity map and the depth map; according to the inverse proportional coefficient and the disparity map , to obtain the depth map under the first reference viewpoint.

It can be understood that the disparity is inversely proportional to the depth. Therefore, the inverse proportion coefficient can be determined first, and then the disparity map can be converted into a depth map based on the coefficient.

For the determination method of the inverse proportional coefficient, for example, it can be calculated according to the following formula (3):

Among them, the inverse proportional coefficient is recorded as σ, P _s is the point cloud data, and (x, y, d) are the coordinates of the point in the camera coordinate system of the first reference viewpoint.

Step 204, according to the depth of the first pixel in the depth map, determine the depth relationship between the first pixels;

Step 205, according to the depth relationship, perform region division on the depth map to obtain at least one region.

In some embodiments, the first pixel points with the same depth or a depth difference within a specific range are divided into the same area. The region division can be realized by using the OTSU algorithm or the superpixel segmentation algorithm.

Step 206, determining the transformation relationship between the camera coordinate system where the first reference viewpoint is located and the camera coordinate system where the target viewpoint is located.

In some embodiments, the transformation relationship includes a rotation matrix and a translation vector.

Step 207, acquiring the internal camera parameters corresponding to the first reference viewpoint and the internal camera parameters corresponding to the target viewpoint;

Step 208, determine the area depth of at least one target area in the at least one area.

In some embodiments, when the depths of pixels in the same target area are different, the mean or median depth of the pixels in the area can be used as the area depth of the area; for the same depth of pixels in the same target area In the case of , the depth of any pixel in the area can be used as the area depth of the area.

Step 209: Perform a reverse homography on the homogeneous coordinates of the mth second pixel point according to the transformation relationship, the camera intrinsic parameters corresponding to the first reference viewpoint and the target viewpoint, and the region depth transform to obtain the existing location point of the mth second pixel point in the at least one target area.

Step 210, from the existing position points of the mth second pixel point in the at least one target area, select the position points satisfying the condition as valid position points.

In some embodiments, if the existing location point is in the corresponding area, the location point is regarded as a valid location point; otherwise, if the existing location point is not in the corresponding area, it is regarded as an invalid location point and discarded.

Step 211: Render the color of the mth second pixel according to the effective position point.

In some embodiments, step 211 can be implemented as follows: determine the color coefficient, transparency, basic color value and basis function of the effective position point; wherein, the independent variable of the basis function is the effective position point and the target The relative direction of the viewpoint; according to the color coefficient, basic color value and basis function of the effective position point, the observed color value of the effective position point from the relative direction is obtained; the transparency of each effective position point is Combining with the observed color value to obtain a composite color value; using the composite color value to render the color of the mth second pixel. Further, in some embodiments, the transparency and color coefficient of the effective location point can be determined through step 304 of the following embodiment; the basic color value of the effective location point can be determined through step 305 of the following embodiment; through the following embodiment Step 306 of determining the basis functions of the effective location points.

In some embodiments, the relative direction may be a unit direction vector of the target viewpoint relative to the effective location point, or may be a unit direction vector of the effective location point relative to the target viewpoint.

It can be understood that, in the embodiment of the present application, effective position points are first selected from the existing position points of the mth second pixel point in the at least one target area, and then based on the effective position points instead of each Once there is a position point, render the color of the mth second pixel point; in this way, the amount of calculation can be saved, thereby improving rendering efficiency, and further improving the synthesis efficiency of the synthesized view.

The embodiment of the present application further provides an image processing method. FIG. 3 is a schematic diagram of the implementation process of the image processing method in the embodiment of the present application. As shown in FIG. 3 , the method may include the following steps 301 to 309:

Step 301, divide the depth map into regions according to the depth of the first pixel in the depth map under the first reference viewpoint, and obtain at least one region;

Step 302: Inversely transform the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, and obtain the mth second pixel point in the at least one target area. Existing position points in at least one target area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered;

Step 303, from among the existing position points of the mth second pixel point in the at least one target area, select the position points satisfying the conditions as valid position points;

Step 304: Obtain the transparency and color coefficient of the effective location point according to the coordinates of the effective location point and the trained first multi-layer perceptron.

In some embodiments, the coordinates of the effective position point can be mapped to a vector with the first dimension; the vector with the first dimension is input into the first multi-layer perceptron to obtain the effective position The transparency and color factor of the point.

In the embodiment of the present application, there is no limitation on the size of the first dimension, which may be 56 dimensions or any dimension.

Further, the mapping to the space coordinates (x, y, d) of the effective position point can be realized by the following formula (4):

Step 305, according to the coordinates of the effective location point, obtain the basic color value of the effective location point;

Step 306 , according to the relative direction and the trained second multi-layer perceptron, obtain the basis function of the effective location point.

In some embodiments, the relative direction is mapped to a vector with a second dimension; the vector with the second dimension is input into the second multi-layer perceptron to obtain the basis function of the effective position point .

For example, substituting v _x and v _y in the unit direction vector v=(v _x , v _y , v _z ) of the effective position point relative to the target viewpoint into the above formula (4), so as to obtain a vector with the second dimension . The size of the second dimension can be arbitrary, and the value of h can also be set arbitrarily.

Step 307: Obtain the color value of the effective location point viewed from the relative direction according to the color coefficient, the basic color value and the basis function of the effective location point.

In some embodiments, the color value C ^P (v) of the effective position point P observed from the relative direction v can be obtained according to the following formula (5):

Wherein, v represents the unit direction vector of the point P relative to the target viewpoint, H _n (v) represents the basis function related to v, for example, the number of basis functions N=8. k ₀ ^P represents the basic color value of point P (such as RGB value, of course, it is not limited to this color format, and can also be expressed in other color formats), [k ₁ ^P ,...,k _N ^P ] represents the color of point P coefficient. [k ₀ ^P ,k ₁ ^P ,...,k _N ^P ] is only related to the coordinates of point P, and has nothing to do with v.

Step 308 , combining the transparency of each effective position point with the observed color value to obtain the synthesized color value of the mth second pixel point.

For example, the composite color value C _t of the mth second pixel can be calculated according to the following formula (6):

Among them, i represents the i-th effective position point, D represents the total number of effective position points, C _i represents the color value of the effective position point observed from the v direction, and α _i represents the transparency of the effective position point.

Step 309, using the composite color value to render the color of the mth second pixel.

In some embodiments, the method further includes: obtaining a synthetic view under the target viewpoint after the color of each second pixel of the view to be rendered is rendered; obtaining a real view under the target viewpoint ; According to the synthetic view and the real view, a synthetic loss is obtained; according to the synthetic loss, update the parameter values of the first multi-layer perceptron and the second multi-layer perceptron; thus, the first multi-layer perceptron The results obtained by the layer perceptron and the second multi-layer perceptron are more accurate, so that the next time a similar scene is synthesized from a new view, a synthetic view with better image quality can be obtained.

In some embodiments, the composite loss can be calculated according to the following formula (7):

L=L _rec +γTV(k ₀ ) (7);

Wherein, L _rec can be calculated according to the following formula (8); TV(k ₀ ) is the total variation loss of the regularization term, and γ represents the coefficient of the regularization term.

in,

refers to the synthetic view, I refers to the real view, and ω is the balancing weight.

The above image processing method can be applied to the online use stage, and can also be applied to the offline training stage. For the offline training phase, in some embodiments, the method further includes: using the updated first multi-layer perceptron and the updated second multi-layer perceptron to re-render the mth second pixel color, until the obtained composite loss meets the condition or the number of updates meets the condition, and the first multilayer perceptron and the second multilayer perceptron that can be used in the online use stage are obtained.

As mentioned above, the depth map under the first reference viewpoint can be obtained through various methods. For example, the depth map can be obtained by decoding the code stream sent by the encoding end. Correspondingly, for the encoding method of the encoding end, in some embodiments, the encoding device can, according to the depth of the first pixel in the depth map under the first reference viewpoint, The depth map is divided into regions to obtain at least one region; then, the total number of regions obtained by division and the depth map are encoded to generate a code stream; thus, at the decoding end, the decoding device can obtain the total number of regions and the depth map obtained by decoding the code stream. The depth map, and then transmit these information to the image processing device, and the image processing device divides the depth map into regions according to the total number of regions, so as to obtain the at least one region, and then perform other content as in the above image processing method, And then obtain the synthesized view; and, transmit the synthesized view to the display device for image display or play.

In other embodiments, for the encoding method at the encoding end, in other embodiments, the encoding device can encode the depth map to generate a code stream; thus, at the decoding end, the decoding device can obtain the depth map by decoding the code stream, Then the depth map is transmitted to the image processing device, and the image processing device performs region division on the depth map according to a specific region division algorithm, so as to obtain the at least one region, and then execute other content as in the above image processing method, and then obtain synthetic view; and, transmitting the synthetic view to a display device for image display or playback.

An exemplary application of the embodiment of the present application in an actual application scenario will be described below.

The scene new perspective synthesis model NeX is based on MPI and basis function (Basis function) to obtain better scene new perspective rendering results. The NeX model modifies the color frame of MPI to add an effect that changes with the viewing angle for the color frame of MPI. The characterization of MPI combined with basis functions is shown in Figure 4, where the basis functions are combined on the RGB values of the color map, and the combination method is shown in the following formula (9):

Among them, P represents the point coordinate position in MPI under the space coordinate system with the first reference viewpoint as the origin (hereinafter referred to as point P), v represents the unit direction vector of point P relative to the target viewpoint, and C ^P (v) represents point P is the color value (RGB format) observed from the v direction. H _n (v) represents the basis function related to v, and the number of basis functions is N=8. k ₀ ^P represents the underlying RGB value of point P, which is equivalent to the color value in the original MPI. [k ₁ ^P ,...,k _N ^P ] represents the RGB coefficients of point P. [k ₀ ^P ,k ₁ ^P ,...,k _N ^P ] is only related to the coordinates of point P, and has nothing to do with v.

The NeX model takes as input a sparse view of the scene and can output new views around the input viewpoint. The overall flow of the NeX model is shown in Figure 5:

First, for the spatial coordinates (x, y, d) and the unit direction vector v=(v _x , v _y , v _z ) of point P, use the following formula (10) to perform position encoding on them respectively to obtain the corresponding position encoding vector:

Normalize x,y,d to [-1,1] range respectively. Among them, x and y are respectively mapped to 20-dimensional vectors (h is set to 10) through formula 2, and d is mapped through formula 2

is a 16-dimensional vector (h is set to 8). The three vectors are sequentially spliced into a 56-dimensional vector, which is used as the real input of the first multi-layer perceptron F _θ .

Map v _x and v _y in (v _x , v _y , v _d ) to 6-dimensional vectors (h is set to 3) respectively through formula 2, and sequentially splice them into 12-dimensional vectors as the second multi-layer perceptron G _φ input of.

a. Use the first multi-layer perceptron F _θ to take the position encoding vector of the spatial coordinates (x, y, d) of point P as input, and learn the alpha value of the transparency map of point P in the corresponding MPI and the corresponding color RGB coefficients in the graph [k ₁ ^P ,...,k _N ^P ];

b. Use the second multi-layer perceptron G _φ to input the position encoding vectors of v _x and v _y in the unit direction vector v of point P relative to the target viewpoint (ie, the observation point), and learn the color map of point P in MPI basis function H _n (v);

c. Learn the basic RGB value k ₀ ^P of point P in the MPI color map by means of explicit storage training;

D. adopt the method shown in above formula (9) to calculate and obtain the RGB value of P point in MPI color map;

e. A new view under the target viewpoint is obtained by rendering the MPI, the camera parameters corresponding to the first reference viewpoint, and the camera parameters corresponding to the target viewpoint. The rendering method adopts the standard inverse homography (Standard inverse homography), as shown in the following formula (11):

Among them, R and t are the rotation matrix and translation vector from the camera coordinate system of the first reference viewpoint to the camera coordinate system of the target viewpoint in the world coordinate system. a is the negative of the plane depth value in MPI. n=(0,0,1) is the unit normal vector of the MPI plane in the camera coordinate system of the first reference viewpoint. k _s and k _t are camera internal parameters corresponding to the first reference viewpoint and the target viewpoint respectively. (u _t ,v _t ,1) are the homogeneous coordinates of the pixels in the image (that is, the view to be rendered) under the target viewpoint. (u _s ,v _s ,1) are the homogeneous coordinates of point P in the corresponding plane under the first reference viewpoint.

For each pixel point (u _t ,v _t ,1) in the view to be rendered, there is a corresponding point (u _si ,v _si ,1) in the i-th plane (sorted from far to near) in MPI . As shown in FIG. 6 , assuming that the number of planes of the MPI is D, each pixel point (u _t , v _t ,1) has D corresponding pixel points in the MPI.

f. A series of P points (u _si ,v _si ,1) corresponding to the pixel points (u _t ,v _t ,1) in the view to be rendered are obtained from step e. From the RGB value C _i and the α value α _i at point P, the RGB value C _t of the pixel point (u _t ,v _t ,1) is calculated according to the following formula 12:

g. During the training process, the output image (that is, the synthetic view) is compared with the real view under the target viewpoint, and the difference between the two is measured by reconstruction loss. The reconstruction error L _rec is calculated according to the following formula (13):

in,

is the synthetic view of the NeX model, I is the real view under the target viewpoint, and the balance weight ω=0.05.

h. In the training process, in order to ensure the smoothness of the output image, a regular term total variation loss TV(k ₀ ) is introduced. The two together constitute the loss function L in the training process, as shown in the following formula (14):

L=L _rec +γ TV(k ₀ ) (14);

Among them, the coefficient of the regular term γ=0.03.

The scene representation adopted by the NeX model is the MPI representation combined with basis functions. Therefore, the NeX model incorporates the flaws of MPI. In real-world scenarios, most regions of space have no visible surfaces. Intuitively, most areas of the color map and transparency map in MPI are invalid values, that is, they do not contain visible information. For example, the MPI plane layer example shown in Figure 7 shows the 40th plane layer (a) to the 45th plane Layer (f). The first row is a color map, and the second row is a transparency map (black is the invalid area).

In the final learning results, most of the MPI regions are invalid values. This is because the depth range of MPI and the distribution of planes are given in advance, ignoring the position information of the visible planes in the scene. From the sampling point of view, there is a disconnect between the sampling position of MPI and the effective information position in the scene (the position with visible surfaces), which leads to the low sampling rate of MPI, which is manifested as the lack of details of NeX's synthetic view.

Further, in some embodiments, a scene perspective rendering model based on PMPI and basis functions (hereinafter referred to as PMPI model) is provided. The block multi-plane image introduces the depth information of the scene on the basis of MPI, and thus obtains the shape (region division and depth range of each region) that changes adaptively with the depth of the scene. Using PMPI and basis functions as scene representation, a complete new view rendering model is built around the characteristics of PMPI. The model takes a sparse view of the scene as input (no need to input a depth map of the scene), and outputs a new view of a given viewpoint (i.e., a synthetic view under the target viewpoint).

The workflow of the model is shown in Figure 8, including the following steps d to g:

d. determine the shape of the PMPI (region boundary and the depth range of each region) by the depth map, as shown in Figure 9 for example, wherein, 901 is under the situation that area number A is less than 10, carries out region by algorithm Otsu algorithm to depth map The result of the division, for example, the final division into a foreground mask and a background mask. 902 is the result obtained by performing region division on the depth map through the superpixel segmentation algorithm when the region number A is greater than or equal to 10. The number A of PMPI areas and the maximum depth dmax need to be given in advance according to the complexity and maximum depth of the scene. In some embodiments the number of regions A=2;

e. Adopt the first multi-layer perceptron MLP1 to use the 56-dimensional position encoding of the spatial coordinates (x, y, d) (hereinafter also referred to as P point) of the PMPI midpoint as input, and learn the transparency map of the P point in the PMPI α values and RGB coefficients in the colormap [k ₁ ^P ,...,k _N ^P ];

f. learn the basic RGB value k ₀ ^P of point P in the PMPI color map by means of explicit storage training;

g. The second multi-layer perceptron MLP2 is used to input the position encoding vector of v _x and v _y in the unit direction vector v of point P relative to the target viewpoint (observation point), and learn the basis of point P in the color map in PMPI. Function H _n (v), the number of basis functions N=8;

h. adopt the method shown in above-mentioned formula (9) to calculate and obtain the RGB value of P point in PMPI color map;

i. A new view under the target viewpoint is obtained by rendering the PMPI, the camera parameters corresponding to the first reference viewpoint, and the camera parameters corresponding to the target viewpoint. The rendering method is still the standard inverse homography shown in the above formula (11).

For each pixel of the view to be rendered, the standard inverse homography shown in the above formula (11) is used to obtain its corresponding point P coordinates at each depth. Taking PMPI with area number A=2 and depth number as 4 as an example, 7 PMPI corresponding points can be obtained (PMPI has 7 different depths because the farthest depth of the area is the same). However, as shown in FIG. 10 , the number of P points corresponding to the pixels of the view to be rendered is uncertain (less than or equal to 7), and the depth thereof will also change with the pixels of the view to be rendered. In some embodiments, using the foreground mask and the background mask obtained by the region segmentation of the depth map, the coordinates of the 7 points are screened to obtain an effective P point;

j. Using Formula 4 to synthesize the RGB values of the pixels of the view to be rendered;

k. Calculate the reconstruction loss according to the synthetic view under the target viewpoint and the real view under the target viewpoint, and use Equation 6 as the loss function in the training process.

Among them, the determination process of the depth map is shown in Figure 11, wherein, ①: obtain the MPI through the simplified version (remove the basis function) NeX model; ②: synthesize the disparity map from the transparency map (α map) of the MPI; ③: use the disparity map And the sparse point cloud computing depth map obtained by colmap:

a. For the sparse view of the scene, use the colmap tool for camera parameter estimation and multidimensional stereo reconstruction (MVS) from motion to structure (SfM) to obtain a sparse point cloud. The coordinates of the sparse point cloud are (x,y,d), where d is the depth relative to the reference camera;

b. Take the sparse view of the scene as input, as shown in Figure 5, use the NeX model to synthesize the MPI representation of the scene, but the number of basis functions is set to 0. That is, the non-Lambertian surface reflection effect in the scene is not considered;

c. The disparity map of the scene is synthesized by the transparency layer of MPI. The sparse point cloud computing depth map obtained from the disparity map and colmap is shown in ③ in Figure 11.

Synthesize the disparity map according to the following formula (15)

Among them, d _i is the depth of the ith plane of MPI (ordered from far to near). Parallax and depth are inversely proportional, and the inverse proportionality coefficient is determined. The inverse proportional coefficient is denoted as σ. In some embodiments, the optimal value σ' of σ is obtained by minimizing the L2 loss as shown in the following equation (16):

Among them, P _s is the sparse point cloud, and (x, y, d) are the coordinates of the point in the camera coordinate system of the first reference viewpoint. The depth map can be calculated from the disparity map and the inverse proportional coefficient.

The PMPI model is compared with the NeX model on the real scene fern and trex. The output image size is fixed at 1008×756. The PMPI model is trained for 400 rounds in the process of synthesizing MPI, 4000 rounds of synthetic PMPI training, and 4000 rounds of NeX training. The rest of the settings are the same. The training time of the two is almost the same. The data comparison between the two on the test set is shown in Table 1 below:

Table 1 Comparison of test performance between PMPI model (referred to as PMPI) and NeX model

The intuitive results of the synthesis are shown in Figures 12 and 13, where Figure 12 is the comparison of the synthesis effects in the fern scene, the

red boxes

121 and 122 are the synthesis results of the NeX model, and the

red boxes

123 and 124 are the synthesis results of the PMPI model. Figure 13 is a comparison of the synthesis effect in the trex scene, the

red boxes

131 and 132 are the synthesis results of the NeX model, and the

red boxes

133 and 134 are the synthesis results of the PMPI model.

It can be seen that the synthesis result of PMPI model is better than that of NeX. And benefiting from the advantages of PMPI in the background area, the synthetic results of the PMPI model are more prominent than the details of the NeX model in the background of the scene.

Based on the aforementioned embodiments, the image processing device provided by the embodiments of the present application, including the included modules and the units included in each module, can be implemented by various types of processors; of course, it can also be implemented by specific logic circuit implementation.

FIG. 14 is a schematic structural diagram of an image processing device according to an embodiment of the present application. As shown in FIG. 14 , the image processing device 14 includes:

A region division module 141, configured to perform region division on the depth map according to the depth of the first pixel in the depth map under the first reference viewpoint, to obtain at least one region;

A coordinate inverse transformation module 142, configured to inversely transform the coordinates of the m th second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, to obtain the m th second pixel point Existing position points of pixels in the at least one target area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered;

The rendering module 143 is configured to render the color of the mth second pixel according to the position of the mth second pixel in the at least one target area.

In some embodiments, the area division module 141 is configured to: determine the depth relationship between the first pixels according to the depth of the first pixel in the depth map; and determine the depth relationship between the first pixels according to the depth relationship. The depth map is divided into regions to obtain at least one region.

In some embodiments, the area division module 141 is configured to: divide the first pixel points with the same depth or a depth difference within a specific range into the same area.

In some embodiments, the coordinate inverse transformation module 142 is configured to: determine the transformation relationship between the camera coordinate system where the first reference viewpoint is located and the camera coordinates where the target viewpoint is located; obtain the camera corresponding to the first reference viewpoint The internal reference and the camera internal reference corresponding to the target viewpoint; determining the region depth of the at least one target region; according to the transformation relationship, the camera internal reference corresponding to the first reference viewpoint and the target viewpoint, and the region depth, Inverse homography transformation is performed on the coordinates of the m th second pixel to obtain the existing position of the m th second pixel in the at least one target area.

In some embodiments, the rendering module 143 is configured to: select the position points satisfying the conditions from the position points of the mth second pixel point in the at least one target area as valid position points; and According to the effective position point, render the color of the mth second pixel point.

In some embodiments, the rendering module 143 is configured to: determine the color coefficient, transparency, basic color value and basis function of the effective position point; wherein, the argument of the basis function is the effective position point and the The relative direction of the target viewpoint; according to the color coefficient, basic color value and basis function of the effective position point, the observed color value of the effective position point from the relative direction is obtained; the effective position point of each Combining the transparency and the observed color value to obtain a composite color value; using the composite color value to render the color of the mth second pixel.

In some embodiments, the rendering module 143 is configured to: obtain the transparency and color coefficient of the effective location point according to the coordinates of the effective location point and the trained first multi-layer perceptron; Coordinates to obtain the basic color value of the effective location point; according to the relative direction and the trained second multi-layer perceptron, obtain the basis function of the effective location point.

In some embodiments, the rendering module 143 is configured to: map the coordinates of the effective location point into a vector with the first dimension; input the vector with the first dimension into the first multi-layer perceptron , to obtain the transparency and color coefficient of the effective location point.

In some embodiments, the rendering module 143 is configured to: map the relative direction into a vector with a second dimension; input the vector with the second dimension into the second multi-layer perceptron to obtain the The basis function of the effective position points.

In some embodiments, the image processing device 14 further includes an update module, configured to: obtain a synthetic view under the target viewpoint after the color of each second pixel of the view to be rendered is rendered; obtain the A real view under the target viewpoint; according to the synthetic view and the real view, a composite loss is obtained; according to the composite loss, parameter values of the first multilayer perceptron and the second multilayer perceptron are updated.

In some embodiments, the rendering module 143 is further configured to: use the updated first multi-layer perceptron and the updated second multi-layer perceptron to re-render the color of the mth second pixel until The synthetic loss of satisfies the condition or the number of updates satisfies the condition.

In some embodiments, the image processing device 14 further includes a depth map obtaining module, configured to: perform three-dimensional reconstruction on the included scene according to at least one view under the second reference viewpoint, and obtain the camera of the scene at the first reference viewpoint point cloud data in a coordinate system; determining a disparity map of the scene; and obtaining a depth map at the first reference viewpoint according to the disparity map and the point cloud data.

In some embodiments, the depth map obtaining module is configured to: obtain the transparency map of at least one plane of the scene according to at least one view under the second reference viewpoint; obtain the transparency map of the at least one plane and the corresponding Depth of the plane, the disparity map of the scene is synthesized.

In some embodiments, the depth map obtaining module is configured to: obtain an inverse proportional coefficient between the disparity map and the depth map according to the disparity map and the point cloud data; , to obtain the depth map under the first reference viewpoint.

The description of the above device embodiment is similar to the description of the above method embodiment, and has similar beneficial effects as the method embodiment. For technical details not disclosed in the device embodiments of the present application, please refer to the description of the method embodiments of the present application for understanding.

It should be noted that, in the embodiment of the present application, if the above method is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solutions of the embodiments of the present application or the part that contributes to the related technologies can be embodied in the form of software products. The computer software products are stored in a storage medium and include several instructions to make The electronic device executes all or part of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: various media that can store program codes such as U disk, mobile hard disk, read-only memory (Read Only Memory, ROM), magnetic disk or optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.

Correspondingly, the embodiment of the present application provides an electronic device. FIG. 15 is a schematic diagram of hardware entities of the electronic device according to the embodiment of the present application. As shown in FIG. 15 , the electronic device 15 includes a memory 151 and a processor 152. The memory 151 stores a computer program that can run on the processor 152, and the processor 152 implements the steps in the methods provided in the above-mentioned embodiments when executing the program.

It should be noted that the memory 151 is configured to store instructions and applications executable by the processor 152, and may also cache data to be processed or processed by each module in the processor 152 and the electronic device 15 (for example, image data, audio data, etc. , voice communication data and video communication data), can be implemented by flash memory (FLASH) or random access memory (Random Access Memory, RAM).

In some embodiments, as shown in FIG. 16 , the electronic device further includes a decoder 161 and a display device 162; wherein, the decoder 161 is configured to decode the code stream sent by the encoding end to obtain the depth under the first reference viewpoint and, transmitting the depth map to the processor 152; the processor 152 is configured to execute the steps in the image processing method provided in the above-mentioned embodiments according to the depth map, so as to finally obtain a synthetic view under the target viewpoint; and , transmit the synthesized view to the display device 162; the display device 162 displays or plays according to the received synthesized view; wherein, the processor 152 can divide the depth map into regions according to a specific region division algorithm, so as to obtain at least one region .

In some embodiments, the code stream may also carry the total number of regions, so that the processor 152 may perform region division on the depth map according to the total number of regions decoded by the decoder 161 .

Correspondingly, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the method provided in the foregoing embodiments are implemented.

It should be pointed out here that: the descriptions of the above storage medium and device embodiments are similar to the descriptions of the above method embodiments, and have similar beneficial effects to those of the method embodiments. For technical details not disclosed in the storage medium and device embodiments of the present application, please refer to the description of the method embodiments of the present application for understanding.

It should be understood that reference throughout this specification to "one embodiment" or "an embodiment" or "some embodiments" means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present application . Thus, appearances of "in one embodiment" or "in an embodiment" or "in some embodiments" in various places throughout the specification are not necessarily referring to the same embodiments. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the order of execution, and the execution order of the processes should be determined by their functions and internal logic, and should not be used in the embodiments of the present application. The implementation process constitutes any limitation. The serial numbers of the above embodiments of the present application are for description only, and do not represent the advantages and disadvantages of the embodiments.

It should be noted that, in this document, the term "comprising", "comprising" or any other variation thereof is intended to cover a non-exclusive inclusion such that a process, method, article or apparatus comprising a set of elements includes not only those elements, It also includes other elements not expressly listed, or elements inherent in the process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising a ..." does not preclude the presence of additional identical elements in the process, method, article or apparatus comprising that element.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. The above-described embodiments of the touch screen system are only illustrative. For example, the division of the modules is only a logical function division. In actual implementation, there may be other division methods, such as: multiple modules or components can be combined , or can be integrated into another system, or some features can be ignored, or not implemented. In addition, the mutual coupling, or direct coupling, or communication connection between the various components shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms of.

The modules described above as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed to multiple network units; Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional module in each embodiment of the present application can be integrated into one processing unit, or each module can be used as a single unit, or two or more modules can be integrated into one unit; the above-mentioned integration The modules can be implemented in the form of hardware, or in the form of hardware plus software functional units.

Those of ordinary skill in the art can understand that all or part of the steps to realize the above method embodiments can be completed by hardware related to program instructions, and the aforementioned programs can be stored in computer-readable storage media. When the program is executed, the execution includes The steps of the foregoing method embodiments; and the foregoing storage media include: removable storage devices, read-only memory (Read Only Memory, ROM), magnetic disks or optical disks and other media that can store program codes.

Alternatively, if the above-mentioned integrated units of the present application are realized in the form of software function modules and sold or used as independent products, they can also be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solutions of the embodiments of the present application or the part that contributes to the related technologies can be embodied in the form of software products. The computer software products are stored in a storage medium and include several instructions to make The electronic device executes all or part of the methods described in the various embodiments of the present application. The aforementioned storage medium includes various media capable of storing program codes such as removable storage devices, ROMs, magnetic disks or optical disks.

The methods disclosed in several method embodiments provided in this application can be combined arbitrarily to obtain new method embodiments under the condition of no conflict.

The features disclosed in several product embodiments provided in this application can be combined arbitrarily without conflict to obtain new product embodiments.

The features disclosed in several method or device embodiments provided in this application can be combined arbitrarily without conflict to obtain new method embodiments or device embodiments.

The above is only the embodiment of the present application, but the scope of protection of the present application is not limited thereto. Anyone familiar with the technical field can easily think of changes or substitutions within the technical scope disclosed in the present application, and should covered within the scope of protection of this application. Therefore, the protection scope of the present application should be determined by the protection scope of the claims.

Claims

An image processing method, the method comprising:

performing region division on the depth map according to the depth of the first pixel in the depth map under the first reference viewpoint to obtain at least one region;

Inversely transforming the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, to obtain the mth second pixel point in the at least one target area Existing position points in the area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered;

Rendering the color of the mth second pixel according to the position of the mth second pixel in the at least one target area.
The method according to claim 1, wherein, according to the depth of the first pixel in the depth map under the first reference viewpoint, the depth map is divided into regions to obtain at least one region, including:

determining a depth relationship between the first pixels according to the depths of the first pixels in the depth map;

According to the depth relationship, region division is performed on the depth map to obtain at least one region.
The method according to claim 2, wherein, according to the depth relationship, performing region division on the depth map to obtain at least one region, comprising:

The first pixel points with the same depth or the depth difference within a specific range are divided into the same area.
The method according to any one of claims 1 to 3, wherein said inverse transforming the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint to at least one target area in the at least one area , obtaining the existence position point of the mth second pixel point in the at least one target area area, including:

determining the transformation relationship between the camera coordinate system where the first reference viewpoint is located and the camera coordinate where the target viewpoint is located;

Acquiring internal camera parameters corresponding to the first reference viewpoint and internal camera parameters corresponding to the target viewpoint;

determining a zone depth of the at least one target zone;

According to the transformation relationship, the camera intrinsic parameters corresponding to the first reference viewpoint and the target viewpoint, and the region depth, perform a reverse homography transformation on the homogeneous coordinates of the m-th second pixel point to obtain An existing position of the mth second pixel in the at least one target area.
The method according to claim 1, wherein the rendering the color of the mth second pixel according to the position of the mth second pixel in the at least one target area includes :

From the existing position points of the mth second pixel point in the at least one target area, select the position points satisfying the conditions as valid position points;

According to the effective position point, render the color of the mth second pixel point.
The method according to claim 5, wherein said rendering the color of the mth second pixel according to the effective position point comprises:

Determine the color coefficient, transparency, basic color value and basis function of the effective position point; wherein, the argument of the basis function is the relative direction between the effective position point and the target viewpoint;

Obtaining the observed color value of the effective position point from the relative direction according to the color coefficient, basic color value and basis function of the effective position point;

Combining the transparency of each effective position point with the observed color value to obtain a composite color value;

Using the composite color value, render the color of the mth second pixel.
The method according to claim 6, wherein said determining the color coefficient, transparency and basic color value and basis function of said effective location point comprises:

Obtain the transparency and the color coefficient of the effective location point according to the coordinates of the effective location point and the trained first multi-layer perceptron;

Obtaining the basic color value of the effective location point according to the coordinates of the effective location point;

According to the relative direction and the trained second multi-layer perceptron, the basis functions of the effective position points are obtained.
The method according to claim 7, wherein, obtaining the transparency and the color coefficient of the effective position point according to the coordinates of the effective position point and the trained first multi-layer perceptron, comprising:

mapping the coordinates of the effective location point into a vector with a first dimension;

The vector with the first dimension is input into the first multi-layer perceptron to obtain the transparency and color coefficient of the effective position point.
The method according to claim 7, wherein said obtaining the basis function of said effective location point according to said relative direction and the trained second multi-layer perceptron comprises:

mapping the relative direction to a vector having a second dimension;

The vector with the second dimension is input into the second multi-layer perceptron to obtain the basis function of the effective location point.
The method according to claim 7, wherein the method further comprises:

After the color of each second pixel point of the view to be rendered is rendered, a synthetic view under the target viewpoint is obtained;

Obtaining a real view under the target viewpoint;

obtaining a synthetic loss based on the synthetic view and the real view;

Updating parameter values of the first multilayer perceptron and the second multilayer perceptron based on the combined loss.
The method according to claim 10, wherein the method further comprises:

Using the updated first multi-layer perceptron and the updated second multi-layer perceptron, re-render the color of the mth second pixel until the obtained composite loss meets the condition or the number of updates meets the condition.
The method according to claim 1, wherein the obtaining process of the depth map under the first reference viewpoint comprises:

performing three-dimensional reconstruction on the included scene according to at least one view under the second reference viewpoint, and obtaining point cloud data of the scene in the camera coordinate system of the first reference viewpoint;

determining a disparity map for the scene;

Obtain a depth map under the first reference viewpoint according to the disparity map and the point cloud data.
The method according to claim 12, wherein said determining the disparity map of said scene comprises:

Obtain a transparency map of at least one plane of the scene according to at least one view under the second reference viewpoint;

A disparity map of the scene is synthesized from the transparency map of the at least one plane and the corresponding plane depths.
The method according to claim 12, wherein said obtaining the depth map under the first reference viewpoint according to the disparity map and the point cloud data comprises:

Obtaining an inverse proportionality coefficient between the disparity map and the depth map according to the disparity map and the point cloud data;

A depth map under the first reference viewpoint is obtained according to the inverse proportional coefficient and the disparity map.
An image processing device, comprising:

An area division module, configured to perform area division on the depth map according to the depth of the first pixel in the depth map under the first reference viewpoint, to obtain at least one area;

A coordinate inverse transformation module, configured to inversely transform the coordinates of the mth second pixel point of the view to be rendered under the target viewpoint into at least one target area in the at least one area, to obtain the mth second pixel Existing position points of points in the at least one target area; wherein, m is greater than 0 and less than or equal to the total number of pixels of the view to be rendered;

A rendering module, configured to render the color of the mth second pixel according to the position of the mth second pixel in the at least one target area.
The device according to claim 15, wherein the area division module is configured to:

determining a depth relationship between the first pixels according to the depths of the first pixels in the depth map; and

According to the depth relationship, region division is performed on the depth map to obtain at least one region.
The device according to claim 16, wherein the area division module is configured to: divide the first pixel points with the same depth or a depth difference within a specific range into the same area.
The device according to any one of claims 15 to 17, wherein the rendering module is configured to:

From the existing position points of the mth second pixel point in the at least one target area, select the position points satisfying the condition as valid position points; and

According to the effective position point, render the color of the mth second pixel point.
An electronic device, comprising a memory and a processor, the memory stores a computer program that can run on the processor, and the processor implements the image processing method described in any one of claims 1 to 14 when executing the program A step of.
A computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the steps in the image processing method according to any one of claims 1 to 14 are realized.