WO2024093282A1

WO2024093282A1 - Image processing method, related device, and structured light system

Info

Publication number: WO2024093282A1
Application number: PCT/CN2023/103013
Authority: WO
Inventors: 宋钊; 曹军; 刘利刚
Original assignee: 华为技术有限公司
Priority date: 2022-10-31
Filing date: 2023-06-28
Publication date: 2024-05-10
Also published as: CN117994348A

Abstract

The present application discloses an image processing method, a related device, and a structured light system, which can be applied to a structured light system. The method comprises: acquiring at least three image groups, the at least three image groups being reflection images of a surface of an object for a material coding pattern at at least three angles of view, and the at least three image groups being in one-to-one correspondence with the at least three angles of view; acquiring an initial depth of the surface of the object corresponding to any one of the at least three image groups; and generating parameter information of the object on the basis of the at least three image groups and the initial depth, the parameter information comprising a material map parameter and/or a geometric structure parameter. The parameter information is generated on the basis of the at least three image groups and the initial depth of the surface of the object by means of the three image groups, of the surface of the object, which are obtained by reflecting the material coding pattern at the at least three angles of view, and the parameter information comprises the material map parameter and/or the geometric structure parameter. Therefore, the material map parameter is acquired.

Description

Image processing method, related equipment and structured light system

This application claims priority to the Chinese patent application filed with the China Patent Office on October 31, 2022, with application number 202211349694.0 and invention name “An image processing method, related equipment and structured light system”, the entire contents of which are incorporated by reference in this application.

Technical Field

The present application relates to the field of image processing, and in particular to an image processing method, related equipment and a structured light system.

Background technique

Structured light technology is a 3D reconstruction technology based on triangulation. A typical structured light system consists of a camera and a projector. During the scanning process, the projector first projects a pattern with specific coded information onto the surface of the target scene. The industrial camera then obtains the reflected coded information, and then decodes to establish the correspondence between the projector and camera pixels. Finally, the depth information of the target scene is obtained based on the principle of triangulation.

However, with the development of the metaverse and three-dimensional digitalization industry, three-dimensional reconstruction technology and systems need to meet the requirements of high precision and high realism at the same time, that is, high-precision geometry and material mapping.

Therefore, how to obtain material maps is a technical problem that needs to be solved urgently.

Summary of the invention

The embodiments of the present application provide an image processing method, related equipment and a structured light system for acquiring material maps.

The first aspect of the embodiment of the present application provides an image processing method that can be applied to a structured light system. The method can be performed by an image processing device, or by a component of the image processing device (such as a processor, a chip, or a chip system, etc.). The method includes: obtaining at least three image groups, at least three image groups are reflection images of the object surface for a material coding pattern at at least three viewing angles; the at least three image groups correspond one-to-one to the at least three viewing angles; obtaining the initial depth of the object surface corresponding to any one of the at least three image groups; generating parameter information of the object based on the at least three image groups and the initial depth, the parameter information including material mapping parameters and/or geometric structure parameters.

In the embodiment of the present application, at least three image groups are obtained by reflecting the material coding pattern from the object surface at least three viewing angles, and parameter information is generated based on the initial depth of the at least three image groups and the object surface, and the parameter information includes material mapping parameters and/or geometric structure parameters, thereby achieving the acquisition of material mapping parameters.

Optionally, in a possible implementation of the first aspect, the method is applied to a structured light system, the structured light system comprising a camera, a projection device, a rotating device, and an object connected to the rotating device, the rotating device being used to rotate multiple times so that the object is located at at least three viewing angles; acquiring at least three image groups, comprising: triggering the projection device to project a material coding pattern onto the object at each of the at least three viewing angles; triggering the camera to collect an image group reflected by the object for the material coding pattern at each viewing angle, so as to acquire at least three image groups. The initial depth is obtained by the projection device projecting the structured light coding pattern onto the object.

In this possible implementation, compared with the existing material measurement solution based on structured light, which requires additional light sources and RGB cameras to obtain multi-view RGB images, this embodiment changes the encoding strategy of structured light, and can obtain multi-view RGB images required for material modeling without additional light sources and cameras.

Optionally, in a possible implementation manner of the first aspect, the material coding pattern includes a full black pattern and a full white pattern. Each of the at least three image groups includes two reflection images.

In this possible implementation, by adding two material coding patterns, one completely white and one completely black, the multi-view RGB images required for material modeling can be obtained without adding additional light sources and cameras.

Optionally, in a possible implementation of the first aspect, the above-mentioned step: generating parameter information of the object based on at least three images and the initial depth includes: obtaining occlusion information of the spatial point cloud of the object surface in the target image group corresponding to the spatial point cloud in the two image groups, any one of the image groups is the target image group, and the two image groups are two image groups other than the target image group in the at least three image groups; eliminating pixel values corresponding to the occlusion information in the two image groups to obtain a visualization matrix; obtaining the image matrix of each point on the surface of the object based on the pose calibration information and the initialization depth. The observation matrix under at least three viewing angles includes: the incident light direction, the reflected light direction, the pixel observation values under at least three viewing angles of the light source intensity; the posture calibration information includes: the intrinsic parameters of the projection device and the camera, the extrinsic parameters between the projection device and the camera, the projection device is used to project the material coding pattern, and the camera is used to collect at least three image groups; the parameter information is determined based on the visualization matrix and the observation matrix.

In this possible implementation, the observation matrix and the visualization matrix can be obtained through the posture calibration information and the relative position relationships, and then the parameter information of the object can be obtained according to the observation matrix and the visualization matrix.

Optionally, in a possible implementation of the first aspect, the above steps: determining parameter information based on the visualization matrix and the observation matrix, include: constructing an energy function based on the visualization matrix and the observation matrix, the energy function being used to represent the difference between the estimated value and the observed value of each point on the surface of the object, the estimated value being related to the visualization matrix, and the observed value being related to the observation matrix; minimizing the value of the energy function to obtain parameter information.

In this possible implementation, by constructing an energy function of estimated values and observed values based on the visualization matrix and the observation matrix, the material mapping parameters and/or geometric structure parameters can be optimized in the process of minimizing the energy function.

Optionally, in a possible implementation manner of the first aspect, the parameter information includes: material mapping parameters and/or geometric structure parameters, the geometric structure parameters include optimized depth or initialized depth; the energy function is shown in Formula 1:

Formula 1:

The material mapping parameters include is the diffuse reflection variable; is the specular reflection variable; is the roughness; z ^* is the geometric structure parameter; is the estimated value of any point on the surface of the object at different viewing angles, and the calculation method of the estimated value is shown in Formula 2, where ^Ii is the observation value of any point at any viewing angle obtained by the camera (it can also be understood as the pixel difference between the reflection image corresponding to the all-black pattern and the reflection image corresponding to the all-white pattern at any viewing angle); i is the number of different viewing angles; E is the regularization term;

Formula 2:

Wherein, E ⁱ is the light source intensity of any point at any viewing angle, d is the distance between any point and the projector, f() is the reflection characteristic function, and the reflection characteristic function is shown in Formula 3, n is the surface normal vector of any point, l ⁱ is the incident light direction of any point at any viewing angle, and ^vi is the reflected light direction of any point at any viewing angle;

Formula 3:

in, is the initial diffuse reflectance variable; is the initial specular reflection variable; _rs is the initial roughness; D() represents the microfacet distribution function, and G() represents the geometric attenuation coefficient.

The second aspect of the embodiment of the present application provides an image processing method, which can be applied to a structured light system, the structured light system comprising: a camera, a projection device, a rotating device and an object. The method can be performed by an image processing device, or by a component of the image processing device (such as a processor, a chip, or a chip system, etc.). The method comprises: triggering/controlling a projection device to project a material coding pattern onto an object; triggering/controlling a camera to collect a reflection image of the object at different viewing angles for the material coding pattern, the reflection image being used to generate a material map of the object; triggering/controlling a rotating device to rotate the object to achieve that the object is located at different viewing angles.

In this embodiment, compared with the existing material measurement scheme based on structured light, which requires additional light sources and RGB cameras to obtain multi-view RGB images, this embodiment changes the encoding strategy of structured light, and can obtain multi-view RGB images required for material modeling without additional light sources and cameras.

Optionally, in a possible implementation manner of the second aspect, the method further includes: generating material mapping parameters of the object based on reflection images at different viewing angles.

Optionally, in a possible implementation manner of the second aspect, the material coding pattern includes a full black pattern and a full white pattern. The number of the reflected images at each of the different viewing angles is two.

The third aspect of the embodiment of the present application provides an image processing device that can be applied to a structured light system. The image processing device includes: an acquisition unit for acquiring at least three image groups, the at least three image groups being reflection images of the object surface for the material coding pattern at at least three viewing angles; the acquisition unit is also used to acquire the initial depth of the object surface corresponding to any one of the at least three image groups; a generation unit is used to generate parameter information of the object based on the at least three image groups and the initial depth, the parameter information including material mapping parameters and/or geometric structure parameters. number.

Optionally, in a possible implementation of the third aspect, the above-mentioned image processing device is applied to a structured light system, the structured light system includes a camera, a projection device, a rotating device, and an object connected to the rotating device, the rotating device is used to rotate multiple times so that the object is located at at least three viewing angles; an acquisition unit is specifically used to trigger the projection device to project a material coding pattern to the object at each of the at least three viewing angles; the acquisition unit is specifically used to trigger the camera to collect an image group reflected by the object for the material coding pattern at each viewing angle, so as to obtain at least three image groups. The at least three image groups correspond to the at least three viewing angles one by one; the initial depth is obtained by the projection device projecting the structured light coding pattern to the object.

Optionally, in a possible implementation manner of the third aspect, the material coding pattern includes a full black pattern and a full white pattern. The number of the reflected images at each of the different viewing angles is two.

Optionally, in a possible implementation manner of the third aspect, the above-mentioned generation unit is specifically used to obtain occlusion information of the spatial point cloud on the surface of the object in the target image group corresponding to the spatial point cloud in the two image groups, any one of the image groups is the target image group, and the two image groups are two image groups other than the target image group in at least three image groups; the generation unit is specifically used to eliminate the pixel values corresponding to the occlusion information in the two image groups to obtain a visualization matrix; the generation unit is specifically used to obtain the observation matrix of each point on the surface of the object under at least three viewing angles based on the pose calibration information and the initialization depth, the observation matrix including: pixel observation values under at least three viewing angles of incident light direction, reflected light direction, and light source intensity, the pose calibration information including: intrinsic parameters of the projection device and the camera, extrinsic parameters between the projection device and the camera, the projection device is used to project material coding patterns, and the camera is used to collect at least three image groups; the generation unit is specifically used to determine parameter information based on the visualization matrix and the observation matrix.

Optionally, in a possible implementation of the third aspect, the above-mentioned generation unit is specifically used to construct an energy function based on the visualization matrix and the observation matrix, the energy function is used to represent the difference between the estimated value and the observed value of each point on the surface of the object, the estimated value is related to the visualization matrix, and the observed value is related to the observation matrix; the generation unit is specifically used to minimize the value of the energy function to obtain parameter information.

Optionally, in a possible implementation manner of the third aspect, the parameter information includes: material mapping parameters and/or geometric structure parameters, the geometric structure parameters include optimized depth or initialized depth; the energy function is shown in Formula 1:

Formula 1:

Formula 2:

Formula 3:

The fourth aspect of the embodiment of the present application provides an image processing device that can be applied to a structured light system. The image processing device includes: a control unit for triggering/controlling a projection device to project a material coding pattern onto an object; the control unit is also used to trigger/control a camera to collect reflection images of the object at different viewing angles for the material coding pattern, and the reflection images are used to generate material mapping parameters of the object; the control unit is also used to trigger/control a rotation device to rotate the object to achieve the object being located at different viewing angles.

Optionally, in a possible implementation manner of the fourth aspect, the above-mentioned image processing device also includes a generation unit, which is used to generate material mapping parameters of the object based on the reflection images at different viewing angles.

Optionally, in a possible implementation manner of the fourth aspect, the material coding pattern includes a full black pattern and a full white pattern. The number of the reflected images at each of the different viewing angles is two.

A fifth aspect of an embodiment of the present application provides a structured light system, which includes: a camera, a projection device, a rotating device and an object; the camera is used to collect reflection images of the object at different viewing angles for a material coding pattern, and the material coding pattern is used to obtain material mapping parameters of the object; the projection device is used to project the material coding pattern onto the surface of the object; the rotating device is used to rotate the object to achieve that the object is located at different viewing angles.

A sixth aspect of an embodiment of the present application provides an image processing device, comprising: a processor, the processor being coupled to a memory, the memory being used to store programs or instructions, and when the programs or instructions are executed by the processor, the image processing device is enabled to implement the method in the above-mentioned first aspect or any possible implementation of the first aspect, or the image processing device is enabled to implement the above-mentioned second aspect or any possible implementation of the second aspect.

A seventh aspect of an embodiment of the present application provides a computer-readable storage medium storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor executes the method described in the first aspect or any possible implementation of the first aspect, or executes the method described in the second aspect or any possible implementation of the second aspect.

An eighth aspect of an embodiment of the present application provides a computer program product (or computer program) storing one or more computers. When the computer program product is executed by the processor, the processor executes the method of the first aspect or any possible implementation of the first aspect, or executes the method of the second aspect or any possible implementation of the second aspect.

A ninth aspect of an embodiment of the present application provides a chip system, which includes at least one processor for supporting an image processing device to implement the functions involved in the above-mentioned first aspect or any possible implementation of the first aspect, or to implement the functions involved in the above-mentioned second aspect or any possible implementation of the second aspect.

In one possible design, the chip system may also include a memory for storing program instructions and data necessary for the first communication device. The chip system may be composed of a chip, or may include a chip and other discrete devices. Optionally, the chip system also includes an interface circuit, which provides program instructions and/or data for the at least one processor.

It can be seen from the above technical solutions that the present application has the following advantages: at least three image groups are obtained from the reflection of the material coding pattern on the object surface at at least three viewing angles, and parameter information is generated based on the at least three image groups and the initial depth of the object surface, and the parameter information includes material mapping parameters and/or geometric structure parameters. Thus, the acquisition of material mapping parameters is realized.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG1 is a schematic diagram of the structure of an application scenario provided by an embodiment of the present application;

FIG2 is a flow chart of a data processing method provided in an embodiment of the present application;

FIG3 is an example diagram of a structured light coding pattern and a material coding pattern provided in an embodiment of the present application;

FIG4 is another schematic flow chart of a data processing method provided in an embodiment of the present application;

FIG5 is another schematic flow chart of a data processing method provided in an embodiment of the present application;

FIG. 6 is an example diagram of a vase and a vase geometric structure provided in an embodiment of the present application;

FIG. 7 is an example diagram of a texture map generated by the prior art and an object reconstruction result based on the texture map;

FIG8 is an example diagram of a material map and an object reconstruction result based on the material map provided in an embodiment of the present application;

FIG9 is a schematic diagram of the structure of the system hardware provided in an embodiment of the present application;

FIG10 is a schematic diagram of a structure of an image processing device provided in an embodiment of the present application;

FIG. 11 is another schematic diagram of the structure of the image processing device provided in an embodiment of the present application.

Detailed ways

The present application provides an image processing method, related equipment and structured light system for acquiring material maps.

At present, structured light technology can be divided into three development stages. In stage one, the structured light system only supports geometric output, without texture and material stickers. The structured light system in stage two supports texture map output on the basis of stage one geometry by adding additional red, green and blue (RGB) cameras and light sources, but the texture map cannot correctly separate the diffuse and highlight components, there is obvious highlight noise, and it does not support physical base rendering (PBR). In order to meet the needs of high realism and high-precision 3D modeling, the structured light system that can support PBR material map output is the development trend of the next stage.

On the one hand, as described in the background technology, the existing structured light technology can only obtain the depth information of the target scene. It is impossible to obtain the material map. On the other hand, at present, in order to obtain the multi-view images required for material measurement, on the basis of the structured light system, the existing hardware systems that support the spatial bidirectional reflectance distribution function (svBRDF) material measurement need to add additional light sources and cameras. However, this method often consists of multiple cameras, multiple projectors, and dozens of high-power white light emitting diode (LED) surface light sources in different directions. It increases the complexity of system integration, and the equipment is bulky and inconvenient to use.

To solve the above problems, the embodiment of the present application provides an image processing method, which generates parameter information based on the initial depth of the at least three image groups and the object surface, without adding additional light sources and cameras, by obtaining at least three image groups reflected by the material coding pattern from the object surface at at least three viewing angles, and the parameter information includes material mapping parameters and/or geometric structure parameters, thereby achieving the acquisition of material mapping parameters.

To facilitate understanding, the relevant terms and concepts mainly involved in the embodiments of the present application are first introduced below.

1. Material Mapping

Material maps that support svBRDF include: diffuse map, specular map, roughness map, and normal map.

Before describing the method provided in the embodiment of the present application, the application scenario to which the method provided in the embodiment of the present application is applicable is described. The scenario to which the method provided in the embodiment of the present application is applicable may be the structured light system shown in FIG1 . The structured light system includes: a camera 101, a projection device 102, a rotating device 103, and an object 104.

The camera 101 is used to collect reflection images of the object 104 at different viewing angles with respect to the coding pattern. The coding pattern includes a material coding pattern, and the material coding pattern is used to obtain material mapping parameters.

The projection device 102 is used to project a coded pattern onto the surface of the object 104 .

The rotating device 103 is used to rotate the object 104 so that the object 104 is located at different viewing angles.

The object 104 can be understood as an object to be three-dimensionally scanned.

Optionally, the rotating device 103 is used to place the object 104. It is understandable that the rotating device 103 may also include a plurality of brackets to support or fix the object 104, so as to realize the rotation of the rotating device 103 and drive the object 104 to rotate at the same time.

Optionally, the coding pattern may further include a structured light coding pattern, which is used to obtain the machine geometry (such as depth) of the object.

During the acquisition process, the projection device 102 projects a specific coded pattern onto the object 104, and the camera 101 takes a picture to obtain the corresponding reflected image. After one scan is completed, the rotation device 103 rotates a specific angle and repeats the above image acquisition process multiple times, which is usually greater than 2 times and can be set according to actual needs. During the calculation process, the image processing device generates parameter information of the object 104 based on the reflected images obtained in the above multiple acquisition processes to achieve three-dimensional reconstruction of the object 104.

The image processing device in the embodiments of the present application can be a server, a mobile phone, a tablet computer (pad), a portable game console, a personal digital assistant (PDA), a laptop computer, an ultra mobile personal computer (UMPC), a handheld computer, a netbook, a car media player, a wearable electronic device, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, and other devices with sufficient computing power.

The following is a detailed description of the image processing method provided by the embodiment of the present application. The method can be performed by an image processing device. It can also be performed by a component of the image processing device (such as a processor, a chip, or a chip system, etc.). The method can be applied to the structured light system shown in FIG. 1. Please refer to FIG. 2, which is a flow chart of the image processing method provided by the embodiment of the present application. The method may include steps 201 to step 202. Step 207. Steps 201 to 207 are described in detail below.

Step 201, triggering the projector.

The image processing device sends a first trigger signal to the projector (ie, the projection device in FIG1 ). Correspondingly, the projector receives the first trigger signal sent by the image processing device. The first trigger signal is used by the projector to project a structured light coding pattern onto an object.

Step 202: Project structured light coding.

After the projector receives the first trigger signal, the projector projects a structured light coding pattern (or structured light coding) onto the object. The structured light coding pattern is used to obtain the depth of the object.

Exemplarily, the structured light coding pattern is shown in (a) of FIG3. It is understandable that (a) of FIG3 is only an example of 8 coding patterns (corresponding to 8 rows respectively). In practical applications, there may be fewer (e.g., 4, 5, etc.) or more (e.g., 16, 20, etc.) coding patterns. In addition, the structured light coding pattern may include multiple black and white patterns or multiple patterns corresponding to 0-255.

Step 203, triggering camera acquisition.

After the projector projects the structured light coding pattern, the image processing device sends second trigger information to the camera to trigger the camera to acquire the image reflected by the object surface for the structured light coding pattern. The second trigger information is used for the camera to collect the reflected image of the object surface.

Optionally, this image can be used as input to structured light decoding to obtain an initial depth of the object.

Step 204, projecting the material code.

After completing the structured light coding projection and acquisition, the image processing device sends a third trigger message to the projector to trigger the projector to project a material coding pattern (or material coding). The third trigger message is used for the projector to project a material coding pattern onto the object. The material coding pattern includes a full black pattern and a full white pattern.

Exemplarily, the material coding pattern is shown in (b) of FIG3 .

Step 205, triggering camera acquisition.

After the projector projects the material coding pattern, the image processing device sends fourth trigger information to the camera to trigger the camera to acquire the RGB image reflected by the surface of the object.

Optionally, this RGB image can be used as input for photometric constraint modeling.

Step 206, turntable triggering.

The image processing device triggers the turntable (i.e., the rotating device in FIG1 ) to rotate to a specific angle, and obtains RGB images of different positions and postures through the relative movement of the object and each device in the structured light acquisition.

In addition, based on the set turntable angle, the spatial point cloud stitching and fusion of the objects corresponding to the RGB images in different postures are completed to obtain the scanning results.

Step 207, end the determination.

Determine whether the system has completed the preset number of acquisitions; if so, stop triggering the projector and end the acquisition; if not, repeat steps 201 to 207 until the end.

It can be understood that the above process takes the projection of light structure coding patterns and material coding patterns at each viewing angle as an example. In actual applications, light structure coding patterns can be projected at the main viewing angle, and material coding patterns can be projected at the main viewing angle and other viewing angles, etc., which are not limited here.

In this embodiment, by providing an image acquisition method, a multi-view RGB image required for material modeling can be obtained. And the multi-view RGB image is used as the input for material modeling. Compared with the existing material measurement scheme based on structured light, it is necessary to add an additional light source and an RGB camera to obtain a multi-view RGB image. In this embodiment, by changing the coding strategy of structured light, by adding two material coding patterns of all white and all black, that is, by projecting the pattern through a projection device, the reflected image of the object for the projection pattern is obtained, and the multi-view RGB image required for material modeling can be obtained without adding an additional light source and camera.

Please refer to FIG. 4, which is a flowchart of an image processing method provided in an embodiment of the present application. The method can be performed by an image processing device. It can also be performed by a component of the image processing device (such as a processor, a chip, or a chip system, etc.). The method can be applied to the structured light system shown in FIG. 1, and the method can include steps 401 to 403. Steps 401 to 403 are described in detail below.

Step 401: Acquire at least three image groups.

The at least three image groups in the embodiment of the present application are reflection images of the object surface with respect to the material coding pattern at at least three viewing angles, wherein the at least three image groups correspond one-to-one to the at least three viewing angles.

Optionally, the material coding pattern includes an all-black pattern and an all-white pattern. In this case, at each viewing angle, a first reflection image of the object for the all-black pattern and a second reflection image of the all-white pattern are obtained. At the first viewing angle, an image group can be obtained, and the image group includes the first reflection image and the second reflection image. That is, three image groups include six reflection images.

There are many ways for the image processing device to obtain at least three image groups in the embodiment of the present application. It can be by receiving images sent by other devices, by selecting from a database, or by obtaining at least three image groups through the method of the embodiment shown in Figure 2 above, etc. The specifics are not limited here.

Step 402: Obtain an initial depth of the object surface corresponding to any one of at least three image groups.

There are many ways for the image processing device to obtain the initial depth in the embodiment of the present application. It can be by receiving it from other devices, by selecting it from a database, or by obtaining at least three image groups through the method of the embodiment shown in Figure 2 above, etc. The specifics are not limited here.

Optionally, at least one viewing angle (i.e., at least the viewing angle corresponding to the subsequent target image group), the projection device projects a structured light coding pattern onto the surface of the object, the camera collects a reflection image of the object surface for the structured light coding pattern, and obtains the initial depth of the object at the viewing angle through the image.

It is understandable that the at least three image groups in the aforementioned step 401 may also include a reflection image of the object surface for the structured light coding pattern. For example, assuming a scene in which a structured light coding pattern is projected at each viewing angle, each of the aforementioned at least three image groups includes a first reflection image and a second reflection image at a certain viewing angle. Among them, the first reflection image is a reflection image of the object surface for the structured light coding pattern, and the second reflection image is a reflection image of the object surface for the material coding pattern. In other words, the initial depth can be obtained by processing at least three image groups, or it can be obtained by processing images other than at least three image groups (i.e., reflection images of the object surface for the structured light coding pattern). It is understandable that, assuming a scene in which a structured light coding pattern is projected at one viewing angle (or referred to as a main viewing angle), the aforementioned at least three image groups may be four image groups, and the four image groups include: reflection images of the object surface for the material coding pattern at three viewing angles and reflection images of the object surface for the structured light coding pattern at the main viewing angle. In addition, there is no limitation on the number of reflection images corresponding to the structured light coding pattern (for example, the description of (a) in the aforementioned FIG. 3).

Step 403: Generate parameter information of the object based on at least three image groups and the initial depth.

After acquiring at least three image groups and an initial depth, the image processing device generates parameter information of the object based on the at least three image groups and the initial depth, where the parameter information includes material mapping parameters and/or geometric structure parameters.

The process of step 403 may refer to Fig. 5. Fig. 5 includes steps 501 to 506. Steps 501 to 506 are described in detail below.

Step 501, key frame.

The image processing device uses the image group corresponding to any one of the at least three perspectives as a key frame. The perspective corresponding to the key frame is used as the main perspective. The point cloud contained in the key frame is the optimization target (or understood as determining the range of the material map), and the pixel coordinates corresponding to the point cloud in the key frame are the image foreground.

Optionally, the viewing angle corresponding to the initial depth in the above step 402 is used as the main viewing angle, and the image group corresponding to the initial depth is the target image group.

Step 502, adjacent frames are determined to complete image registration (or image calibration).

Based on the known pose relationship between different frames, at least two frames of images adjacent to the key frame are selected as adjacent frames. The point cloud coordinates in the key frame are reprojected to the pixel positions and RGB values in each adjacent frame.

Specifically, the image processing device can obtain the relative position relationship between at least three image groups and the object. And based on the relative position relationship, the occlusion information of the spatial point cloud of the object surface in the target image group corresponding to the spatial point cloud in the two image groups is obtained. The two image groups are two image groups other than the target image group in the at least three image groups. The occlusion information can be obtained by using methods such as reprojection, which are not specifically limited here.

Step 503, generate observation and visualization matrices.

After the image processing device obtains the occlusion information, the pixel values corresponding to the occlusion information in the two image groups are removed to obtain a visualization matrix. This process can be understood as reducing the noise caused by occlusion.

In addition, the image processing device can also obtain the view of each point on the surface of the object from at least three perspectives based on the pose calibration information and the initialization depth. The measurement matrix includes: pixel observation values under at least three viewing angles: incident light direction, reflected light direction, and light source intensity.

Among them, the pose calibration information includes: internal parameters of the projection device and the camera, and external parameters between the projection device and the camera. The projection device is used to project the material coding pattern, and the camera is used to collect at least three image groups.

Alternatively, the observation matrix can be Among them, for a point P on the surface of the object, L is the incident light direction of the point at the i viewing angle (it can also be understood as the irradiation direction of the projection device, that is, the irradiation direction of the light source), V is the reflected light direction of the point at the i viewing angle (it can also be understood as the observation direction of the camera), I is the pixel observation value at the point obtained by the camera at the i viewing angle (it can also be understood as the pixel difference between the reflected image corresponding to the all-black pattern and the reflected image corresponding to the all-white pattern at the i viewing angle). E is the light source intensity at the point at the i viewing angle.

After the image processing device obtains the visualization matrix and the observation matrix, the parameter information can be determined based on the visualization matrix and the observation matrix (as shown in steps 504 to 506 below). The parameter information includes: material mapping parameters and/or geometric structure parameters, and the geometric structure parameters include optimized depth or initialized depth. Wherein, in the case where the geometric structure parameters include the initialized depth, this embodiment can be understood as obtaining material mapping parameters. In the case where the geometric structure parameters include the optimized depth, this embodiment can be understood as optimizing the geometric structure parameters of the object.

Step 504: Establish an energy function.

The image processing device constructs an energy function based on the visualization matrix and the observation matrix. The energy function is used to represent the difference between the estimated value and the observed value of each point on the surface of the object. The estimated value is related to the visualization matrix, and the observed value is related to the observation matrix.

Optionally, the energy function is as shown in Formula 1:

Formula 1:

Among them, the material mapping parameters include is the optimized diffuse reflection variable; is the optimized specular reflection variable; is the optimized roughness; z ^* is the optimized geometric structure parameter; is the estimated value of any point on the surface of the object at different viewing angles. The calculation method of the estimated value is shown in Formula 2, where I ⁱ is the observation value of any point obtained by the camera at any viewing angle; i is the number of different viewing angles; and E is the regularization term.

Optionally, the regularization term includes regularization terms corresponding to normal vectors, depths, materials, etc., which are not specifically limited here.

Formula 2:

Wherein, E ⁱ is the light source intensity at any point at any viewing angle, d is the distance between any point and the projector, f() is the reflection characteristic function, and the reflection characteristic function is shown in Formula 3, n is the surface normal vector of any point, l ⁱ is the incident light direction at any point at any viewing angle, and ^vi is the reflected light direction at any point at any viewing angle;

Formula 3:

in, is the initial diffuse reflectance variable; is the initial specular reflection variable; _rs is the initial roughness; D() represents the microfacet distribution function, which is used to express the change of microfacet slope; G() represents the geometric attenuation coefficient. The incident light on the microfacet may be blocked by the adjacent microfacets before reaching a surface or after being reflected by the surface. This blocking will cause a slight dimming of the specular reflection, and the geometric attenuation coefficient can be used to measure this effect.

It can be understood that the above formula 1, formula 2 and/or formula 3 are just examples. In practical applications, the above formula 1, formula 2 and/or formula 3 can also be other forms, which are not specifically limited here.

Step 505: Minimize the energy function.

After the image processing device constructs the energy function, the value of the energy function can be minimized to obtain parameter information of the object. The parameter information also includes: material mapping parameters and/or geometric structure parameters, and the geometric structure parameters include optimized depth or initialized depth. Among them, when the geometric structure parameters include the initialized depth, this embodiment can be understood as obtaining material mapping parameters. When the geometric structure parameters include the optimized depth, this embodiment can be understood as optimizing the geometric structure parameters of the object.

In a possible implementation, the parameter information may include the above In this case, z ^* is the initial depth Z, i.e. The initial depth is a fixed value, and the material mapping parameters are the parameters to be optimized.

In another possible implementation, the parameter information may include the above-mentioned z ^* . for r _s is the preset value, that is, the material mapping parameters are fixed values, and the geometric structure parameters are the parameters to be optimized.

In another possible implementation, the parameter information may include the above z ^* , In this case, it can be understood that both the material mapping parameters and the geometric structure parameters are parameters to be optimized.

Optionally, the initial depth/optimized depth can be used to generate a normal map, the initial or optimized diffuse variables can be used to generate a diffuse map, the initial or optimized specular variables can be used to generate a specular map, and the initial or optimized roughness can be used to generate a roughness map.

Step 506: Convergence determination.

In the process of minimizing the value of the energy function, determine whether the convergence condition is met. If so, end the minimization of the energy function and output the optimal solution. If not, repeat steps 502 to 506 until the convergence condition is met. The convergence condition includes at least one of the following: the number of repetitions is a first preset threshold, the value of the energy function is less than a second preset threshold, etc.

It is understandable that after the calculation of the parameter information of the object under the main viewing angle is completed, a new viewing angle can be selected as the main viewing angle to repeat the above process, and finally the parameter information of the object under multiple viewing angles is obtained. Then, the object material map is generated according to the fusion splicing and other processing.

In an embodiment of the present application, at least three image groups are obtained by reflecting the material coding pattern on the surface of an object at least three viewing angles, and parameter information is generated based on the initial depth of the at least three image groups and the surface of the object, and the parameter information includes material mapping parameters and/or geometric structure parameters. On the one hand, without adding additional light sources, by changing the coding strategy of the structured light, the projection device is changed into a light source, which can support PBR material mapping output. On the other hand, the material map includes diffuse reflection map, specular reflection map, roughness map and normal map, which supports PBR rendering. On the other hand, the above-mentioned material modeling and solution algorithm can be used for mobile phone material measurement, providing a material modeling solution for the existing mobile phone-based three-dimensional reconstruction algorithm, and supporting the material output function of the mobile phone.

In order to more intuitively show the beneficial effects of the image processing method provided in the embodiment of the present application, the reconstruction of a vase is taken as an example below to exemplarily describe the reconstruction results of the vase using the prior art and the image processing method provided in the present application.

Among them, the object is a physical picture of a vase as shown in Figure 6 (a), and Figure 6 (b) is a geometric structure diagram of the object. Figure 7 (a) is a texture map generated by the prior art, and Figure 7 (b) is the reconstruction result of the texture map in the prior art. Figure 8 (a) is a material map obtained by the method of the embodiment of the present application, and Figure 8 (b) is the reconstruction result of the material map obtained by the method of the embodiment of the present application. Through comparison, it is found that the existing texture-based reconstruction results cannot correctly separate the diffuse reflection and highlight components, resulting in highlight noise in the reconstruction results based on texture maps, and does not support PBR rendering. The image processing method provided in the embodiment of the present application can effectively generate PBR material maps, obtain highly realistic reconstruction results, and reduce highlight noise.

In addition, the embodiment of the present application also provides a system hardware. The system hardware is shown in Figure 9, and the system hardware includes: a turntable unit, a control unit, a lighting unit, a sensor unit, a storage unit and a computing unit. During the acquisition process, the control unit first sends a trigger signal to enable the projector to project a specific coded pattern, and the projector triggers the camera to take pictures to obtain the corresponding image, and upload it to the storage unit. After one scan is completed, the control unit controls the turntable to rotate to a specific angle, and repeats the above image acquisition process to a preset number of times; after the complete scan is completed, the computing unit completes the calculation of the object parameter information (i.e., including the geometric structure and svBRDF).

The turntable unit may include a turntable and a power supply. The control unit may include a central processing unit (CPU) and a cache. The lighting unit includes a power supply and a projector. The sensor unit includes a camera and a transmission line. The storage unit includes a cache and an external storage. The computing unit includes a CPU, a graphics processing unit (GPU), a cache and a transmission line.

The image processing method in the embodiment of the present application is described above. The image processing device in the embodiment of the present application is described below. Please refer to FIG. 10. An embodiment of the image processing device in the embodiment of the present application includes:

An acquisition unit 1001 is used to acquire at least three image groups, where the at least three image groups are reflection images of the object surface at at least three viewing angles with respect to the material coding pattern;

The acquisition unit 1001 is further used to acquire an initial depth of the object surface corresponding to any one of the at least three image groups;

A generating unit 1002 is used to generate parameter information of an object based on at least three image groups and an initial depth, wherein the parameter information includes material map parameters. number and/or geometric parameters.

Optionally, the image processing device is applied to a structured light system, the structured light system comprising a camera, a projection device, a rotating device, and an object connected to the rotating device, the rotating device being used to rotate multiple times so that the object is located at at least three viewing angles;

Optionally, the acquisition unit 1001 is specifically used to trigger the projection device to project a material coding pattern onto the object at each of at least three viewing angles; the acquisition unit 1001 is specifically used to trigger the camera to capture an image group reflected by the object for the material coding pattern at each viewing angle to obtain at least three image groups.

Optionally, the material coding pattern includes an all-black pattern and an all-white pattern.

Optionally, the generation unit 1002 is specifically used to obtain the occlusion information of the spatial point cloud on the surface of the object in the target image group corresponding to the spatial point cloud in the two image groups, any one of the image groups is the target image group, and the two image groups are two image groups other than the target image group in at least three image groups; the generation unit 1002 is specifically used to eliminate the pixel values corresponding to the occlusion information in the two image groups to obtain a visualization matrix; the generation unit 1002 is specifically used to obtain the observation matrix of each point on the surface of the object under at least three viewing angles based on the pose calibration information and the initialization depth, the observation matrix including: the pixel observation values under at least three viewing angles of the incident light direction, the reflected light direction, and the light source intensity, the pose calibration information including: the intrinsic parameters of the projection device and the camera, the extrinsic parameters between the projection device and the camera, the projection device is used to project the material coding pattern, and the camera is used to collect at least three image groups; the generation unit 1002 is specifically used to determine the parameter information based on the visualization matrix and the observation matrix.

Optionally, the generating unit 1002 is specifically used to construct an energy function based on the visualization matrix and the observation matrix, the energy function is used to represent the difference between the estimated value and the observed value of each point on the surface of the object, the estimated value is related to the visualization matrix, and the observed value is related to the observation matrix;

Optionally, the generating unit 1002 is specifically configured to minimize the value of the energy function to obtain parameter information.

In this embodiment, the operations performed by each unit in the image processing device are similar to those described in the embodiments shown in the above-mentioned Figures 1 to 5, and will not be repeated here.

In this embodiment, on the one hand, without adding additional light sources, by changing the coding strategy of structured light, the projection device is turned into a light source, which can support PBR material map output. On the other hand, the material map includes diffuse reflection map, specular reflection map, roughness map and normal map, which supports PBR rendering. On the other hand, the above-mentioned material modeling and solution algorithm can be used for mobile phone material measurement, providing a material modeling solution for the existing mobile phone-based 3D reconstruction algorithm, and supporting the material output function of the mobile phone.

Referring to FIG. 11 , a schematic diagram of the structure of another image processing device provided by the present application. The image processing device may include a processor 1101, a memory 1102, and a communication port 1103. The processor 1101, the memory 1102, and the communication port 1103 are interconnected via a line. The memory 1102 stores program instructions and data.

The memory 1102 stores program instructions and data corresponding to the steps executed by the image processing device in the corresponding implementation modes shown in the aforementioned FIGS. 1 to 5 .

The processor 1101 is used to execute the steps performed by the image processing device shown in any of the embodiments shown in Figures 1 to 5 above.

The communication port 1103 can be used to receive and send data, and to execute the steps related to acquisition, sending, and receiving in any of the embodiments shown in FIG. 1 to FIG. 5 .

In one implementation, the image processing device may include more or fewer components than those in FIG. 11 , and this application is merely an illustrative description and is not intended to be limiting.

An embodiment of the present application further provides a computer-readable storage medium storing one or more computer-executable instructions. When the computer-executable instructions are executed by a processor, the processor executes the method described in the possible implementation manner of the image processing device in the aforementioned embodiment.

An embodiment of the present application also provides a computer program product (or computer program) storing one or more computers. When the computer program product is executed by the processor, the processor executes the method of the possible implementation mode of the above-mentioned image processing device.

The embodiment of the present application also provides a chip system, which includes at least one processor for supporting a terminal device to implement the functions involved in the possible implementation of the above-mentioned image processing device. Optionally, the chip system also includes an interface circuit, which provides program instructions and/or data for the at least one processor. In one possible design, the chip system may also include a memory, which is used to store the necessary program instructions and data for the image processing device. The chip system may be composed of chips, or may include chips and other discrete devices.

Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working processes of the systems, devices and units described above can refer to the corresponding processes in the aforementioned method embodiments and will not be repeated here.

In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods can be implemented in other ways. For example, the device embodiments described above are only schematic. For example, the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices or units, which can be electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional units.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application is essentially or the part that contributes to the prior art or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including a number of instructions to enable a computer device (which can be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in each embodiment of the present application. The aforementioned storage medium includes: U disk, mobile hard disk, read-only memory (ROM, read-only memory), random access memory (RAM, random access memory), disk or optical disk and other media that can store program code.

Claims

An image processing method, characterized in that the method comprises:

Acquire at least three image groups, wherein the at least three image groups are reflection images of the object surface with respect to the material coding pattern at at least three viewing angles; the at least three image groups correspond one-to-one to the at least three viewing angles;

Acquire an initial depth of the object surface corresponding to any one of the at least three image groups;

Parameter information of the object is generated based on the at least three image groups and the initial depth, where the parameter information includes material mapping parameters and/or geometric structure parameters.
The method according to claim 1, characterized in that the method is applied to a structured light system, the structured light system comprising a camera, a projection device, a rotating device, and an object connected to the rotating device, the rotating device being used to rotate multiple times so that the object is located at the at least three viewing angles;

The acquiring of at least three image groups comprises:

triggering the projection device to project a material coding pattern onto the object at each of the at least three viewing angles;

triggering the camera to collect an image group reflected by the object with respect to the material coding pattern at each viewing angle to obtain the at least three image groups;

The initial depth is obtained by projecting a structured light coding pattern onto the object by the projection device.
The method according to claim 1 or 2 is characterized in that the material coding pattern includes a full black pattern and a full white pattern, and each of the at least three image groups includes two reflection images.
The method according to any one of claims 1 to 3, characterized in that the generating the parameter information of the object based on the at least three images and the initial depth comprises:

Obtaining occlusion information of the spatial point cloud of the surface of the object in the target image group and corresponding spatial point clouds in two image groups, wherein any one of the image groups is the target image group, and the two image groups are two image groups other than the target image group among the at least three image groups;

Eliminating pixel values corresponding to the occlusion information in the two image groups to obtain a visualization matrix;

Based on the pose calibration information and the initialization depth, an observation matrix of each point on the surface of the object under the at least three viewing angles is obtained, wherein the observation matrix includes: the incident light direction, the reflected light direction, and the pixel observation values of the light source intensity under the at least three viewing angles; the pose calibration information includes: the intrinsic parameters of the projection device and the camera, and the extrinsic parameters between the projection device and the camera; the projection device is used to project the material coding pattern, and the camera is used to collect the at least three image groups;

The parameter information is determined based on the visualization matrix and the observation matrix.
The method according to claim 4, characterized in that the determining the parameter information based on the visualization matrix and the observation matrix comprises:

constructing an energy function based on the visualization matrix and the observation matrix, the energy function being used to represent the difference between the estimated value and the observed value of each point on the surface of the object, the estimated value being related to the visualization matrix, and the observed value being related to the observation matrix;

The value of the energy function is minimized to obtain the parameter information.
The method according to claim 5, characterized in that the parameter information includes: the material mapping parameters and/or the geometric structure parameters, the geometric structure parameters include the optimized depth or the initialized depth; the energy function is as shown in Formula 1:
Formula 1:

The material mapping parameters include is the diffuse reflection variable; is the specular reflection variable; is the roughness; z * is the geometric structure parameter; is the estimated value of any point on the surface of the object at different viewing angles, and the calculation method of the estimated value is shown in Formula 2, where I i is the observation value of any point at any viewing angle obtained by the camera; i is the number of different viewing angles; E is the regularization term;
Formula 2:

Wherein, E i is the light source intensity of any point at any viewing angle, d is the distance between any point and the projector, f() is the reflection characteristic function, and the reflection characteristic function is shown in Formula 3, n is the surface normal vector of any point, l i is the incident light direction of any point at any viewing angle, and vi is the reflected light direction of any point at any viewing angle;
Formula 3:

in, is the initial diffuse reflectance variable; is the initial specular reflection variable; rs is the initial roughness; D() represents the microfacet distribution function, and G() represents the geometric attenuation coefficient.
An image processing device, characterized in that the image processing device comprises:

An acquisition unit, configured to acquire at least three image groups, wherein the at least three image groups are reflection images of the object surface with respect to the material coding pattern at at least three viewing angles; the at least three image groups correspond one-to-one to the at least three viewing angles;

The acquisition unit is further used to acquire an initial depth of the object surface corresponding to any one of the at least three image groups;

A generating unit is used to generate parameter information of the object based on the at least three image groups and the initial depth, wherein the parameter information includes material mapping parameters and/or geometric structure parameters.
The image processing device according to claim 7, characterized in that the image processing device is applied to a structured light system, the structured light system comprises a camera, a projection device, a rotating device, and an object connected to the rotating device, the rotating device is used to rotate multiple times so that the object is located in the at least three viewing angles;

The acquisition unit is specifically configured to trigger the projection device to project a material coding pattern onto the object at each of the at least three viewing angles;

The acquisition unit is specifically configured to trigger the camera to collect an image group reflected by the object with respect to the material coding pattern at each viewing angle, so as to acquire the at least three image groups;

The initial depth is obtained by projecting a structured light coding pattern onto the object by the projection device.
The image processing device according to claim 7 or 8 is characterized in that the material coding pattern includes a full black pattern and a full white pattern, and each of the at least three image groups includes two reflection images.
The image processing device according to any one of claims 7 to 9, characterized in that the generating unit is specifically used to obtain occlusion information of the spatial point cloud of the surface of the object in the target image group and the corresponding spatial point cloud in two image groups, wherein any one of the image groups is the target image group, and the two image groups are two image groups other than the target image group among the at least three image groups;

The generating unit is specifically used to remove the pixel values corresponding to the occlusion information in the two image groups to obtain a visualization matrix;

The generation unit is specifically used to obtain the observation matrix of each point on the surface of the object under the at least three viewing angles based on the posture calibration information and the initialization depth, the observation matrix includes: the incident light direction, the reflected light direction, the pixel observation value of the light source intensity under the at least three viewing angles, the posture calibration information includes: the intrinsic parameters of the projection device and the camera, the extrinsic parameters between the projection device and the camera, the projection device is used to project the material coding pattern, and the camera is used to collect the at least three image groups;

The generating unit is specifically configured to determine the parameter information based on the visualization matrix and the observation matrix.
The image processing device according to claim 10, characterized in that the generating unit is specifically used to construct an energy function based on the visualization matrix and the observation matrix, the energy function is used to represent the difference between the estimated value and the observed value of each point on the surface of the object, the estimated value is related to the visualization matrix, and the observed value is related to the observation matrix;

The generating unit is specifically used to minimize the value of the energy function to obtain the parameter information.
The image processing device according to claim 11, characterized in that the parameter information includes: the material mapping parameters and/or the geometric structure parameters, the geometric structure parameters include the optimized depth or the initialized depth; the energy function is as shown in Formula 1:
Formula 1:

The material mapping parameters include is the diffuse reflection variable; is the specular reflection variable; is the roughness; z * is the geometric structure parameter; is the estimated value of any point on the surface of the object at different viewing angles, and the calculation method of the estimated value is shown in Formula 2, where I i is the observation value of any point at any viewing angle obtained by the camera; i is the number of different viewing angles; and E is a regularization term;
Formula 2:

Wherein, E i is the light source intensity of any point at any viewing angle, d is the distance between any point and the projector, f() is the reflection characteristic function, and the reflection characteristic function is shown in Formula 3, n is the surface normal vector of any point, l i is the incident light direction of any point at any viewing angle, and vi is the reflected light direction of any point at any viewing angle;
Formula 3:

in, is the initial diffuse reflectance variable; is the initial specular reflection variable; rs is the initial roughness; D() represents the microfacet distribution function, and G() represents the geometric attenuation coefficient.
A structured light system, characterized in that the structured light system comprises: a camera, a projection device, a rotating device and an object;

The camera is used to collect reflection images of the object at different viewing angles with respect to the material coding pattern, and the material coding pattern is used to obtain material mapping parameters of the object;

The projection device is used to project the material coding pattern onto the surface of the object;

The rotating device is used to rotate the object so that the object is located at the different viewing angles.
The structured light system according to claim 13, characterized in that the material coding pattern includes an all-black pattern and an all-white pattern, and the number of reflected images at each of the different viewing angles is two.
An image processing device, characterized in that it comprises: a processor, the processor is coupled to a memory, the memory is used to store programs or instructions, when the program or instructions are executed by the processor, the image processing device executes the method as described in any one of claims 1 to 6.
A computer-readable storage medium, characterized in that the medium stores instructions, and when the instructions are executed by a computer, the method according to any one of claims 1 to 6 is implemented.
A computer program product, characterized in that it comprises instructions, and when the instructions are executed on a computer, the computer is caused to execute the method according to any one of claims 1 to 6.