WO2024041181A1 - Image processing method and apparatus, and storage medium - Google Patents

Image processing method and apparatus, and storage medium

Info

Publication number
WO2024041181A1
Authority
WO
WIPO (PCT)
Prior art keywords
spatial distribution
distribution map
target
map
target area
Prior art date
Application number
PCT/CN2023/103359
Other languages
French (fr)
Chinese (zh)
Inventor
杜明
郑佳
周子寒
Original Assignee
杭州群核信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州群核信息技术有限公司
Publication of WO2024041181A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/50 Lighting effects
    • G06T15/60 Shadow generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/35 Categorising the entire scene, e.g. birthday party or wedding scene
    • G06V20/36 Indoor scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/04 Architectural design, interior design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/62 Semi-transparency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2215/00 Indexing scheme for image rendering
    • G06T2215/12 Shadow map, environment map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2215/00 Indexing scheme for image rendering
    • G06T2215/16 Using real world measurements to influence rendering

Definitions

  • the present disclosure relates to an image processing method, device and storage medium, and belongs to the technical field of image processing.
  • processing methods in the related technology include: obtaining a floor plan, and simulating, through three-dimensional modeling and rendering based on the floor plan, the display effects of floors of different materials and textures in the user's room.
  • a first aspect of the embodiments of the present disclosure provides an image processing method, including: obtaining a spatial distribution map of a target space collected by an image acquisition device; identifying a target area in the spatial distribution map, wherein the target area is associated with an area to be designed in the target space; obtaining a target pattern; and replacing the pattern of the target area with the target pattern.
  • identifying the target area in the spatial distribution map includes: inputting the spatial distribution map to a semantic segmentation network to identify, through the semantic segmentation network, a region category to which at least one pixel in the spatial distribution map belongs; and determining the pixels belonging to the target area according to the region category to which the at least one pixel belongs.
  • replacing the pattern of the target area with the target pattern includes: obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map; and projecting the target pattern onto the target area in the spatial distribution map according to the correspondence.
  • obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map includes: detecting a vanishing point in the spatial distribution map through a vanishing point detection algorithm; determining, based on the detected vanishing point, the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system; and determining the correspondence based on the focal length and the rotation matrix.
  • projecting the target pattern onto the target area in the spatial distribution map according to the correspondence includes: performing projection transformation on the target pattern according to the correspondence to obtain the texture foreground of the target area; synthesizing the brightness information of the spatial distribution map into the texture foreground; and synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map to obtain the target spatial distribution map.
  • synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map includes: determining the proportional relationship between the foreground and the background in the spatial distribution map; and synthesizing the texture foreground with the spatial distribution map according to the proportional relationship.
  • determining the proportional relationship between the foreground and the background in the spatial distribution map includes: obtaining a pixel mask map of the spatial distribution map; and inputting the spatial distribution map and the pixel mask map into a matting neural network to determine the proportional relationship through the matting neural network.
  • inputting the spatial distribution map and the pixel mask map into the matting neural network includes: performing image enhancement on the pixel mask map; and inputting the spatial distribution map and the enhanced pixel mask map into the matting neural network.
  • performing image enhancement on the pixel mask map includes: extracting the topological skeleton of the pixel mask map; performing dilation and erosion on the pixel mask map; and superimposing the topological skeleton and the dilated-and-eroded pixel mask map to obtain the enhanced pixel mask map.
  • a second aspect of the embodiment of the present disclosure provides an image processing device, including a memory and a processor.
  • the memory stores at least one program instruction.
  • the processor loads and executes the at least one program instruction to implement the image processing method provided by the first aspect of the embodiments of the present disclosure.
  • a third aspect of an embodiment of the present disclosure provides a computer-readable storage medium.
  • the computer-readable storage medium stores at least one program instruction; when the at least one program instruction is loaded and executed by a processor, the image processing method provided by the first aspect of the embodiments of the present disclosure is implemented.
  • Figure 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • Figure 2 is a possible schematic diagram of the obtained spatial distribution map provided by the embodiment of the present disclosure.
  • Figure 3 is a mask map generated based on the spatial distribution map shown in Figure 2 provided by an embodiment of the present disclosure.
  • FIG. 4 is a possible schematic diagram of a user-selected target pattern provided by an embodiment of the present disclosure.
  • FIG. 5 is a possible schematic diagram of a texture foreground obtained by projecting the target pattern shown in FIG. 4 provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of the principle of the proportional relationship between the foreground and the background provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of the proportional relationship between the foreground and the background of the spatial distribution diagram shown in FIG. 2 provided by an embodiment of the present disclosure.
  • FIG. 8 is a target spatial distribution diagram obtained by replacing the ground in the spatial distribution diagram shown in FIG. 2 with the target pattern shown in FIG. 4 provided by an embodiment of the present disclosure.
  • the term "connection" should be understood in a broad sense.
  • it can be a fixed connection, a detachable connection, or an integral connection; it can be a mechanical connection or an electrical connection; and it can be a direct connection, an indirect connection through an intermediate medium, or an internal connection between two components.
  • the display effect of floors/walls of different materials and textures in the user's room is usually simulated through three-dimensional modeling and rendering based on the floor plan.
  • it is difficult for this method to display the effect of existing soft furnishings in the user's home. Users can only imagine the effect of the actual floor/wall combined with soft furnishings through inaccurate display effects, which often results in the completed decoration being inconsistent with the original expectations; that is, the accuracy of the replacement effect generated by the above method is poor.
  • referring to FIG. 1, a schematic flowchart of an image processing method provided by an embodiment of the present disclosure is shown.
  • the image processing method includes S101, S102, S103, and S104.
  • the user can take photos of the target space through an image acquisition device to obtain a spatial distribution map.
  • for example, when the user wants to change the floor style, the user can take a picture including the floor of the room; and when the user wants to change the style of wall covering in the living room, the user can take a picture including the wall of the living room. The embodiments of the present disclosure do not limit this.
  • the image acquisition device can be a digital camera, a camera in a mobile phone, a camera in a computer, etc. That is to say, the image acquisition device in the embodiments of the present disclosure can be any form of device capable of capturing pictures; the embodiments of the present disclosure do not limit its specific form.
  • the image processing device can obtain the captured room picture.
  • the spatial distribution map may also be a locally pre-stored photo or a photo from the Internet, which is not limited in this embodiment of the disclosure.
  • the target area in the spatial distribution map is identified, where the target area is associated with the area to be designed in the target space.
  • the target area in the spatial distribution map can be identified.
  • the target area is an area where the design pattern needs to be changed, that is, the area to be designed associated with the target area can be the floor and/or wall surface in the target space that needs to be transformed, etc.
  • the design pattern may include the color, texture, and/or pattern of the ceramic tiles, or may include the color and/or pattern of the wall covering, etc., which is not limited in the embodiments of the present disclosure.
  • identifying the target area in the spatial distribution map may include: inputting the spatial distribution map to a semantic segmentation network to identify, through the semantic segmentation network, a region category to which at least one pixel in the spatial distribution map belongs; and determining the pixels belonging to the target area according to the region category to which the at least one pixel belongs.
  • the spatial distribution map can be input to the semantic segmentation network, and the regional category to which each pixel in the spatial distribution map belongs is identified through the semantic segmentation network.
  • the semantic segmentation network is a pre-trained and stored network model. After the spatial distribution map is input into the semantic segmentation network, the network can output a matrix of H*W*L, where H and W represent the height and width of the matrix, corresponding to the pixels of the spatial distribution map. At each pixel position there is a vector V of length L, which represents the probabilities that the pixel belongs to each of the L categories.
  • the L categories are L objects included in the spatial distribution map, and L is a positive integer greater than or equal to 1.
  • the L categories can be floor, wall, cabinet, and table. In practical applications, the L categories can also include target areas and non-target areas.
  • the L categories can include ground and non-ground.
  • each pixel belonging to the target area can be determined according to the area category to which each pixel belongs, and then the target area can be identified.
  • each pixel belonging to the target area can be determined according to the area category to which each pixel belongs, and then the area composed of each determined pixel is identified as the target area.
  • for each pixel, the region category corresponding to the maximum probability value in its vector V of length L can be determined as the region category to which the pixel belongs. For example, if the vector V for a certain pixel is (ground: 0.7; non-ground: 0.3), the pixel can be identified as a ground pixel.
  • a mask map can be generated based on the above matrix.
  • the mask map shown in Figure 3 can be generated.
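The per-pixel classification and mask generation described above can be sketched as follows. This is a minimal NumPy illustration; the actual network architecture, class list, and mask encoding are not specified by the disclosure.

```python
import numpy as np

def mask_from_probabilities(probs, target_class):
    """Collapse an H x W x L probability matrix (the network output
    described above) into a binary mask of the target area.

    Each pixel takes the category with the maximum probability in its
    length-L vector V; pixels whose winning category equals
    `target_class` form the mask."""
    labels = np.argmax(probs, axis=-1)          # H x W category indices
    return (labels == target_class).astype(np.uint8) * 255

# Toy example with L = 2 categories (0 = ground, 1 = non-ground).
probs = np.array([[[0.7, 0.3], [0.2, 0.8]],
                  [[0.9, 0.1], [0.4, 0.6]]])
mask = mask_from_probabilities(probs, target_class=0)
```

Here the pixel with vector (ground: 0.7; non-ground: 0.3) is marked as ground, matching the example above.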
  • the target pattern can be a graphic, pattern, and/or motif set by the user.
  • the user can select the target pattern from the pattern library, or upload the target pattern through a local file, which is not limited in the embodiment of the present disclosure.
  • FIG. 4 shows a possible schematic diagram of a target pattern selected by the user.
  • the content in the target area can be replaced with the target pattern.
  • replacing the pattern of the target area with the target pattern may include: obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map; and projecting the target pattern onto the target area in the spatial distribution map according to the correspondence.
  • embodiments of the present disclosure can obtain the corresponding relationship between the three-dimensional spatial coordinates of the area to be designed and the two-dimensional image coordinates of the target area.
  • obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map may include: detecting a vanishing point in the spatial distribution map through a vanishing point detection algorithm; determining, based on the detected vanishing point, the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system; and determining the correspondence based on the focal length and the rotation matrix.
  • that is, the vanishing point in the spatial distribution map can be detected through the vanishing point detection algorithm; the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system are then calculated based on the detected vanishing point; and the correspondence is determined based on the calculated focal length and rotation matrix.
  • the focal length of the image acquisition device may be calculated based on the detected first vanishing point and a second vanishing point perpendicular to it (for example, a horizontal vanishing point and a vanishing point perpendicular to it).
  • the columns r2 and r3 of the rotation matrix can be calculated using similar methods, after which the rotation matrix can be recovered.
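The calibration step above can be sketched as follows. The disclosure gives no explicit formulas, so the classical closed form for two orthogonal vanishing points is assumed here, with the principal point taken as a known point; the function name is hypothetical.

```python
import numpy as np

def calibrate_from_vanishing_points(v1, v2, c):
    """Recover focal length f and a rotation matrix R from two orthogonal
    vanishing points v1, v2 (pixel coordinates) and principal point c.

    Orthogonality of the two scene directions gives
        (v1 - c) . (v2 - c) + f^2 = 0,
    and each rotation column is the normalized back-projected direction
    K^-1 [vx, vy, 1]^T; the third column completes the basis."""
    v1, v2, c = (np.asarray(p, dtype=float) for p in (v1, v2, c))
    f = np.sqrt(-np.dot(v1 - c, v2 - c))
    K = np.array([[f, 0, c[0]], [0, f, c[1]], [0, 0, 1.0]])
    Kinv = np.linalg.inv(K)

    def col(v):
        d = Kinv @ np.array([v[0], v[1], 1.0])
        return d / np.linalg.norm(d)

    r1, r2 = col(v1), col(v2)   # r2 (and r3) "by similar methods"
    r3 = np.cross(r1, r2)       # third column completes the orthonormal basis
    return f, np.column_stack([r1, r2, r3])

# Synthetic example: with c = (0, 0), v1 = (100, 0), v2 = (-25, 0),
# the dot product is -2500, so f = 50 and R comes out orthonormal.
f, R = calibrate_from_vanishing_points((100, 0), (-25, 0), (0, 0))
```

This only determines the rotation up to the labeling of the world axes; a real pipeline would also need the vanishing-point detector itself, which the patent treats as a known algorithm.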
  • the target pattern can be projected to the target area in the spatial distribution map.
  • projecting the target pattern onto the target area in the spatial distribution map according to the correspondence may include: performing projection transformation on the target pattern according to the correspondence to obtain the texture foreground of the target area; synthesizing the brightness information of the spatial distribution map into the texture foreground; and synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map to obtain the target spatial distribution map.
  • that is, projection transformation is performed on the target pattern according to the correspondence, to obtain a texture foreground of the target area that conforms to the perspective relationship.
  • the texture foreground shown in Figure 5 can be obtained.
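The projection transformation can be sketched as an inverse-mapping warp under a homography. This is a minimal nearest-neighbour sketch with a hypothetical function name; a production implementation would typically use an optimized routine such as OpenCV's `warpPerspective`.

```python
import numpy as np

def project_pattern(pattern, H, out_shape):
    """Warp `pattern` into the image plane with homography H by mapping
    each output pixel back through H^-1 and sampling the nearest pattern
    pixel. Pixels that fall outside the pattern are left at 0."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1) @ np.linalg.inv(H).T
    u = pts[..., 0] / pts[..., 2]
    v = pts[..., 1] / pts[..., 2]
    ui = np.clip(np.round(u).astype(int), 0, pattern.shape[1] - 1)
    vi = np.clip(np.round(v).astype(int), 0, pattern.shape[0] - 1)
    inside = (u >= -0.5) & (u < pattern.shape[1] - 0.5) & \
             (v >= -0.5) & (v < pattern.shape[0] - 0.5)
    return np.where(inside, pattern[vi, ui], 0)

# With the identity homography the pattern is copied through unchanged.
pattern = np.arange(16).reshape(4, 4)
out = project_pattern(pattern, np.eye(3), (4, 4))
```

In the method above, H would be built from the focal length and rotation matrix recovered from the vanishing points, so the warped texture conforms to the perspective of the photograph.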
  • synthesizing the brightness information of the spatial distribution map into the texture foreground may include: synthesizing the brightness information of the spatial distribution map into the texture foreground obtained through projection transformation according to transparency (alpha channel) to simulate illumination and shadow.
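The disclosure does not specify the exact blending formula for this step, so the following is one plausible realization (hypothetical function name and `strength` parameter): modulate the warped texture by the photo's normalized luminance so that existing illumination and shadows show through.

```python
import numpy as np

def synthesize_brightness(texture, photo_gray, strength=1.0):
    """Scale each texture pixel by how much brighter or darker the
    original photo is at that position relative to its mean luminance
    (an assumed scheme; `strength` controls how strongly the photo's
    shading is carried over)."""
    shading = photo_gray / max(photo_gray.mean(), 1e-6)  # ~1.0 where evenly lit
    lit = texture * (1.0 + strength * (shading[..., None] - 1.0))
    return np.clip(lit, 0.0, 255.0)

# Uniform lighting leaves the texture unchanged; darker photo regions
# would darken the corresponding texture pixels.
tex = np.full((2, 2, 3), 200.0)
flat = synthesize_brightness(tex, np.full((2, 2), 100.0))
```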
  • synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map to obtain the target spatial distribution map may include: determining the proportional relationship between the foreground and the background in the spatial distribution map; and synthesizing the texture foreground with the spatial distribution map according to the proportional relationship.
  • determining the proportional relationship between the foreground and the background in the spatial distribution map may include: obtaining a pixel mask map of the spatial distribution map; and inputting the spatial distribution map and the pixel mask map into a matting neural network to determine the proportional relationship through the matting neural network.
  • the pixel mask map may be the mask map obtained through the semantic segmentation network in S102; that is, obtaining the pixel mask map of the spatial distribution map may include reading the mask map obtained in S102.
  • inputting the spatial distribution map and the pixel mask map into the matting neural network may include: performing image enhancement on the pixel mask map; and inputting the spatial distribution map and the enhanced pixel mask map into the matting neural network.
  • that is, before the spatial distribution map and the pixel mask map are input into the matting neural network, image enhancement can first be performed on the pixel mask map, and the enhanced pixel mask map and the spatial distribution map are then input into the matting neural network.
  • performing image enhancement on the pixel mask map may include: extracting the topological skeleton of the pixel mask map; performing dilation and erosion on the pixel mask map; and superimposing the topological skeleton and the dilated-and-eroded pixel mask map to obtain the enhanced pixel mask map.
  • the pixel mask map can also be image enhanced through other image enhancement methods, and the embodiments of the present disclosure do not limit its specific implementation.
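The enhancement steps above (skeleton extraction, dilation/erosion, superimposition) can be sketched with plain NumPy 3x3 morphology. A Lantuéjoul-style morphological skeleton stands in for the unspecified topological-skeleton algorithm, and erosion-then-dilation (opening) stands in for the unspecified "dilation and erosion"; all function names are illustrative.

```python
import numpy as np

def _shift(img, dy, dx):
    """Shift a 0/1 image by (dy, dx), padding with zeros."""
    out = np.zeros_like(img)
    h, w = img.shape
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def dilate(img):
    return np.max([_shift(img, dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)], axis=0)

def erode(img):
    return np.min([_shift(img, dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)], axis=0)

def skeleton(img):
    """Morphological (Lantuejoul) skeleton: union of the residues of
    successive erosions."""
    skel, eroded = np.zeros_like(img), img.copy()
    while eroded.any():
        skel = np.maximum(skel, eroded - dilate(erode(eroded)))
        eroded = erode(eroded)
    return skel

def enhance_mask(mask):
    """Superimpose the skeleton onto the eroded-then-dilated (opened) mask."""
    return np.maximum(skeleton(mask), dilate(erode(mask)))

mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:6, 2:6] = 1               # a solid 4x4 target region
enhanced = enhance_mask(mask)
```

The opening removes isolated noise pixels while the skeleton preserves thin connected structure that opening alone would erase, which is a plausible reason for superimposing the two.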
  • any picture C can be regarded as the linear addition of two images, foreground F and background B, through the alpha channel.
  • the proportional relationship between the foreground and the background refers to the value of α, which determines the proportions of foreground and background pixels from which each pixel of the picture is synthesized.
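This linear model, C = αF + (1 − α)B, can be written directly. A minimal sketch with an illustrative function name; F, B, and α would come from the warped texture, the original photo, and the matting network respectively.

```python
import numpy as np

def alpha_composite(foreground, background, alpha):
    """C = alpha * F + (1 - alpha) * B per pixel, where `alpha` holds the
    per-pixel foreground proportion in [0, 1]."""
    a = alpha[..., None]          # broadcast over color channels
    return a * foreground + (1.0 - a) * background

F = np.full((1, 2, 3), 200.0)     # relit texture foreground
B = np.full((1, 2, 3), 50.0)      # original photo as background
alpha = np.array([[1.0, 0.5]])    # fully foreground vs. half-blended
C = alpha_composite(F, B, alpha)
```

Fractional α values at the boundary of the target area are what let the replaced texture blend smoothly under occluding furniture edges.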
  • the proportional relationship between the foreground and the background as shown in Figure 7 can be output.
  • after the texture foreground (with the brightness information synthesized) and the spatial distribution map are synthesized according to the proportional relationship, the target spatial distribution map can be obtained.
  • after the ground of the spatial distribution map shown in Figure 2 is replaced with the target pattern shown in Figure 4 (that is, with the texture foreground shown in Figure 5 after the brightness information has been synthesized), the target spatial distribution map shown in Figure 8 can be obtained.
  • in this way, the problem in the related technology that the accuracy of the generated replacement effect is poor is solved: the replacement is performed on the spatial distribution map obtained by shooting, which improves the accuracy of the replacement result.
  • moreover, since the embodiments of the present disclosure perform the replacement on the photographed spatial distribution map, the user can intuitively view the overall effect of the replacement result, which improves the user experience.
  • the embodiments of the present disclosure use a semantic segmentation network to identify the target area, so that the replacement result can be obtained in a short time, reducing the time required for pattern replacement and improving its efficiency.
  • An embodiment of the present disclosure also provides an image processing device, including a memory and a processor. At least one program instruction is stored in the memory, and the processor loads and executes the at least one program instruction to implement the method as described above.
  • Embodiments of the present disclosure also provide a non-transitory computer-readable storage medium, in which at least one program instruction is stored, and the at least one program instruction is loaded and executed by the processor to implement the method as described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Graphics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computational Linguistics (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure relates to the technical field of image processing, and provides an image processing method and apparatus, and a storage medium. The image processing method comprises: obtaining a spatial distribution map of a target space acquired by an image acquisition device; identifying a target region in the spatial distribution map, wherein the target region is associated with a region to be designed in the target space; obtaining a target pattern; and replacing the pattern of the target region with the target pattern.

Description

Image processing method, device and storage medium

Technical field

The present disclosure relates to an image processing method, device and storage medium, and belongs to the technical field of image processing.

Background

In the scenarios of old house renovation, partial house design, and carpet and tile sales, customers often want to replace the texture materials of parts of a room and expect to quickly obtain renderings of the replacement effect. Processing methods in the related technology include: obtaining a floor plan, and simulating, through three-dimensional modeling and rendering based on the floor plan, the display effects of floors of different materials and textures in the user's room.
Summary of the invention

A first aspect of the embodiments of the present disclosure provides an image processing method, including: obtaining a spatial distribution map of a target space collected by an image acquisition device; identifying a target area in the spatial distribution map, wherein the target area is associated with an area to be designed in the target space; obtaining a target pattern; and replacing the pattern of the target area with the target pattern.

In one embodiment, identifying the target area in the spatial distribution map includes: inputting the spatial distribution map to a semantic segmentation network to identify, through the semantic segmentation network, a region category to which at least one pixel in the spatial distribution map belongs; and determining the pixels belonging to the target area according to the region category to which the at least one pixel belongs.

In one embodiment, replacing the pattern of the target area with the target pattern includes: obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map; and projecting the target pattern onto the target area in the spatial distribution map according to the correspondence.

In one embodiment, obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map includes: detecting a vanishing point in the spatial distribution map through a vanishing point detection algorithm; determining, based on the detected vanishing point, the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system; and determining the correspondence based on the focal length and the rotation matrix.

In one embodiment, projecting the target pattern onto the target area in the spatial distribution map according to the correspondence includes: performing projection transformation on the target pattern according to the correspondence to obtain the texture foreground of the target area; synthesizing the brightness information of the spatial distribution map into the texture foreground; and synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map to obtain the target spatial distribution map.

In one embodiment, synthesizing the texture foreground, after the brightness information has been synthesized, with the spatial distribution map includes: determining the proportional relationship between the foreground and the background in the spatial distribution map; and synthesizing the texture foreground with the spatial distribution map according to the proportional relationship.

In one embodiment, determining the proportional relationship between the foreground and the background in the spatial distribution map includes: obtaining a pixel mask map of the spatial distribution map; and inputting the spatial distribution map and the pixel mask map into a matting neural network to determine the proportional relationship through the matting neural network.

In one embodiment, inputting the spatial distribution map and the pixel mask map into the matting neural network includes: performing image enhancement on the pixel mask map; and inputting the spatial distribution map and the enhanced pixel mask map into the matting neural network.

In one embodiment, performing image enhancement on the pixel mask map includes: extracting the topological skeleton of the pixel mask map; performing dilation and erosion on the pixel mask map; and superimposing the topological skeleton and the dilated-and-eroded pixel mask map to obtain the enhanced pixel mask map.

A second aspect of the embodiments of the present disclosure provides an image processing device, including a memory and a processor. The memory stores at least one program instruction, and the processor loads and executes the at least one program instruction to implement the image processing method provided by the first aspect.

A third aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing at least one program instruction. When the at least one program instruction is loaded and executed by a processor, the image processing method provided by the first aspect is implemented.
附图说明Brief Description of the Drawings
图1为本公开实施例提供的图像处理方法的流程示意图。 Figure 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
图2为本公开实施例提供的获取到的空间分布图的一种可能的示意图。Figure 2 is a possible schematic diagram of the obtained spatial distribution map provided by the embodiment of the present disclosure.
图3为本公开实施例提供的基于图2所示的空间分布图生成的掩码图。Figure 3 is a mask map generated based on the spatial distribution map shown in Figure 2 provided by an embodiment of the present disclosure.
图4为本公开实施例提供的用户选择的目标图样的一种可能的示意图。FIG. 4 is a possible schematic diagram of a user-selected target pattern provided by an embodiment of the present disclosure.
图5为本公开实施例提供的对图4所示的目标图样进行投影得到的纹理前景的一种可能的示意图。FIG. 5 is a possible schematic diagram of a texture foreground obtained by projecting the target pattern shown in FIG. 4 provided by an embodiment of the present disclosure.
图6为本公开实施例提供的前景和背景的比例关系的原理示意图。FIG. 6 is a schematic diagram of the principle of the proportional relationship between the foreground and the background provided by an embodiment of the present disclosure.
图7为本公开实施例提供的图2所示的空间分布图的前景和背景的比例关系的示意图。FIG. 7 is a schematic diagram of the proportional relationship between the foreground and the background of the spatial distribution diagram shown in FIG. 2 provided by an embodiment of the present disclosure.
图8为本公开实施例提供的将图2所示的空间分布图中的地面替换为图4所示的目标图样得到的目标空间分布图。FIG. 8 is a target spatial distribution diagram obtained by replacing the ground in the spatial distribution diagram shown in FIG. 2 with the target pattern shown in FIG. 4 provided by an embodiment of the present disclosure.
具体实施方式Detailed Description
下面将结合附图对本公开的技术方案进行清楚、完整地描述,显然,所描述的实施例是本公开的一部分实施例,而不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本公开的保护范围。The technical solutions of the present disclosure will be clearly and completely described below in conjunction with the accompanying drawings. Obviously, the described embodiments are some of the embodiments of the present disclosure, rather than all of the embodiments. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without making creative efforts fall within the protection scope of the present disclosure.
在本公开的描述中,需要说明的是,术语“中心”、“上”、“下”、“左”、“右”、“竖直”、“水平”、“内”、“外”等指示的方位或位置关系为基于附图所示的方位或位置关系,仅是为了便于描述本公开和简化描述,而不是指示或暗示所指的装置或元件必须具有特定的方位、以特定的方位构造和操作,因此不能理解为对本公开的限制。此外,术语“第一”、“第二”、“第三”仅用于描述目的,而不能理解为指示或暗示相对重要性。In the description of the present disclosure, it should be noted that orientation or positional terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description of the present disclosure, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; therefore, they should not be construed as limiting the present disclosure. Furthermore, the terms "first", "second" and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.
在本公开的描述中,需要说明的是,除非另有明确的规定和限定,术语“安装”、“相连”、“连接”应做广义理解,例如,可以是固定连接,也可以是可拆卸连接,或一体地连接;可以是机械连接,也可以是电连接;可以是直接相连,也可以通过中间媒介间接相连,可以是两个元件内部的连通。对于本领域的普通技术人员而言,可以具体情况理解上述术语在本公开中的具体含义。In the description of the present disclosure, it should also be noted that, unless otherwise expressly specified and limited, the terms "mounted", "connected" and "coupled" should be understood in a broad sense: for example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct, indirect through an intermediate medium, or an internal communication between two elements. For those of ordinary skill in the art, the specific meanings of the above terms in the present disclosure can be understood according to the specific circumstances.
此外,下面所描述的本公开不同实施方式中所涉及的技术特征只要彼此之间未构成冲突就可以相互结合。In addition, the technical features involved in different embodiments of the present disclosure described below can be combined with each other as long as they do not conflict with each other.
在相关技术中,通常根据户型图通过三维建模再渲染的方式来模拟不同材质、纹理的地/墙面等在用户房间内的展示效果。但该方法很难展示用户家中已有的软装家具的效果,用户只能通过不精确的展示效果去想象实际的地/墙面结合软装家具之后的效果,这往往导致了实际装修完成之后发现与当初预期不符的情况,也即上述方法中生成的替换效果的准确率较差。In the related art, the display effect of floors/walls of different materials and textures in a user's room is usually simulated by three-dimensional modeling and rendering based on a floor plan. However, this method can hardly show the effect of the soft furnishings already in the user's home; the user can only imagine how the actual floor/wall will look together with the soft furnishings from an inaccurate preview, which often leads to the finished decoration failing to match the original expectation. In other words, the accuracy of the replacement effect generated by the above method is poor.
为解决上述技术问题,本公开实施例提供一种图像处理方法。参考图1,其示出了本公开实施例提供的图像处理方法的流程示意图,该图像处理方法包括S101、S102、S103和S104。In order to solve the above technical problems, embodiments of the present disclosure provide an image processing method. Referring to FIG. 1 , a schematic flowchart of an image processing method provided by an embodiment of the present disclosure is shown. The image processing method includes S101, S102, S103, and S104.
在S101中,获取通过图像采集设备采集到的目标空间的空间分布图。In S101, obtain the spatial distribution map of the target space collected by the image acquisition device.
在用户需要更改某一空间的装修设计进而预览更改后的设计效果时,用户可以通过图像采集设备拍摄该目标空间的照片,进而得到空间分布图。比如,在用户想要更改房间的地砖款式时,用户可以拍摄包括房间地面的图片;又比如,在用户想要更改客厅墙布款式时,用户可以拍摄包括客厅墙面的图片等等,本公开实施例对此不作限定。When a user needs to change the decoration design of a certain space and preview the changed design effect, the user can take a photo of the target space with an image acquisition device to obtain a spatial distribution map. For example, when the user wants to change the style of the floor tiles in a room, the user can take a picture that includes the floor of the room; as another example, when the user wants to change the style of the wall covering in the living room, the user can take a picture that includes the wall of the living room, and so on, which is not limited in the embodiments of the present disclosure.
图像采集设备可以为数码相机、手机中的相机、电脑中的摄像头等等,也即在本公开实施例中的图像采集设备可以为任意形式的可以采集图片的装置,本公开实施例对其具体形式不作限定。The image acquisition device may be a digital camera, a camera in a mobile phone, a camera in a computer, and so on. That is, the image acquisition device in the embodiments of the present disclosure may be any form of device capable of capturing pictures, and its specific form is not limited in the embodiments of the present disclosure.
例如,若用户想要更改房间的地砖纹理,则用户可以拍摄得到如图2所示的房间图片。相应地,图像处理装置可以获取该拍摄得到的房间图片。For example, if the user wants to change the floor tile texture of the room, the user can take a picture of the room as shown in Figure 2. Correspondingly, the image processing device can obtain the captured room picture.
在一种实施方式中,空间分布图也可以为本地预先存储的照片,也可以为来自互联网的照片,本公开实施例对此不作限定。In one implementation, the spatial distribution map may also be a locally pre-stored photo or a photo from the Internet, which is not limited in this embodiment of the disclosure.
在S102中,识别在空间分布图中的目标区域,目标区域与在目标空间中的待设计区域相关联。In S102, the target area in the spatial distribution map is identified, where the target area is associated with the area to be designed in the target space.
在获取得到空间分布图之后,可以识别在空间分布图中的目标区域。目标区域为需要更改设计图样的区域,也即与目标区域相关联的待设计区域可以为在目标空间中的需要改造的地面和/或墙面等等。设计图样可以包括瓷砖的颜色、纹理,和/或花色,或者可以包括墙布的颜色和/或花色等等,本公开实施例对此不作限定。After the spatial distribution map is obtained, the target area in the spatial distribution map can be identified. The target area is an area whose design pattern needs to be changed; that is, the area to be designed associated with the target area may be the floor and/or wall in the target space that needs to be renovated, and so on. The design pattern may include the color, texture and/or pattern of floor tiles, or may include the color and/or pattern of a wall covering, etc., which is not limited in the embodiments of the present disclosure.
在一种实施方式中,识别在空间分布图中的目标区域可以包括:将空间分布图输入至语义分割网络,以通过语义分割网络识别在空间分布图中的至少一个像素所属的区域类别;以及,根据该至少一个像素所属的区域类别,确定属于目标区域的像素。In one embodiment, identifying the target area in the spatial distribution map may include: inputting the spatial distribution map into a semantic segmentation network to identify, through the semantic segmentation network, the region category to which at least one pixel in the spatial distribution map belongs; and determining the pixels belonging to the target area according to the region category to which the at least one pixel belongs.
也就是说,可以将空间分布图输入至语义分割网络,通过语义分割网络识别空间分布图中的各个像素所属的区域类别。That is to say, the spatial distribution map can be input to the semantic segmentation network, and the regional category to which each pixel in the spatial distribution map belongs is identified through the semantic segmentation network.
语义分割网络为预先训练并存储的网络模型。在将空间分布图输入至语义分割网络之后,语义分割网络即可输出得到H*W*L的矩阵,其中,W和H代表矩阵的长和宽,与在空间分布图中的各个像素点对应。每个像素点位置上有一个长度为L的向量V,该向量V表示该像素分别属于L个类别的概率大小。L个类别为在空间分布图中包括的L个对象,L为大于等于1的正整数。比如,对于图2所示的空间分布图,L个类别可以为地面、墙面、柜子和桌子。在实际应用中,L个类别还可以包括目标区域和非目标区域。比如,对于图2所示的空间分布图,L个类别可以包括地面和非地面。The semantic segmentation network is a pre-trained and stored network model. After the spatial distribution map is input into the semantic segmentation network, the network outputs an H*W*L matrix, where W and H represent the width and height of the matrix, corresponding to the pixels of the spatial distribution map. Each pixel position holds a vector V of length L, which represents the probabilities that the pixel belongs to each of L categories. The L categories are L objects included in the spatial distribution map, and L is a positive integer greater than or equal to 1. For example, for the spatial distribution map shown in Figure 2, the L categories may be floor, wall, cabinet and table. In practical applications, the L categories may also include a target area and a non-target area; for example, for the spatial distribution map shown in Figure 2, the L categories may include floor and non-floor.
之后,可以根据各个像素所属的区域类别,确定属于目标区域的各个像素,进而识别得到目标区域。在语义分割网络输出上述矩阵之后,即可根据各个像素点所属的区域类别确定属于目标区域的各个像素点,进而将确定的各个像素点所组成的区域识别为目标区域。Afterwards, each pixel belonging to the target area can be determined according to the area category to which each pixel belongs, and then the target area can be identified. After the semantic segmentation network outputs the above matrix, each pixel belonging to the target area can be determined according to the area category to which each pixel belongs, and then the area composed of each determined pixel is identified as the target area.
在一种实施方式中,针对任一像素,可以确定在与之对应的长度为L的向量V中的最大概率值所对应的区域类别为该像素点所属的区域类别。比如,对于某一像素点,矩阵中的向量V为(地面:0.7;非地面:0.3),则可以将该像素点识别为地面中的像素点。In one implementation, for any pixel, the region category corresponding to the maximum probability value in its length-L vector V can be determined as the region category to which the pixel belongs. For example, for a certain pixel, if the vector V in the matrix is (floor: 0.7; non-floor: 0.3), the pixel can be identified as a floor pixel.
在一种实施方式中,可以根据上述矩阵生成掩码图,掩码图中包括经过上述识别后的各个区域。比如,对于图2所示的空间分布图,可以生成图3所示的掩码图。In one implementation, a mask map can be generated from the above matrix, and the mask map includes the regions identified above. For example, for the spatial distribution map shown in Figure 2, the mask map shown in Figure 3 can be generated.
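As an illustrative sketch only (not part of the disclosed embodiments), the per-pixel argmax over the H*W*L probability matrix and the derivation of a binary mask for the target category could look as follows; the class index assigned to "floor" is an assumption:

```python
import numpy as np

def target_mask(probs, target_class):
    """probs: H x W x L per-pixel class probabilities (e.g. the output
    of a semantic segmentation network). Returns a uint8 binary mask
    marking pixels whose most likely class is target_class."""
    labels = np.argmax(probs, axis=-1)       # H x W map of class indices
    return (labels == target_class).astype(np.uint8)

# Toy example with L = 2 categories: 0 = floor, 1 = non-floor (assumed indices)
probs = np.array([[[0.7, 0.3], [0.2, 0.8]],
                  [[0.9, 0.1], [0.6, 0.4]]])
floor_mask = target_mask(probs, target_class=0)   # 1 where "floor" wins
```

In practice the same mask, rendered per category, would correspond to the mask map of Figure 3.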
在S103中,获取目标图样。In S103, the target pattern is obtained.
目标图样可以为用户设置的图形、图案和/或图样等。在一种实施方式中,用户可以从图样库中选择目标图样,或者通过本地文件上传该目标图样,本公开实施例对此不作限定。例如,请参考图4,其示出了用户选择的目标图样的一种可能的示意图。The target pattern can be a graphic, pattern and/or pattern set by the user. In one implementation, the user can select the target pattern from the pattern library, or upload the target pattern through a local file, which is not limited in the embodiment of the present disclosure. For example, please refer to FIG. 4 , which shows a possible schematic diagram of a target pattern selected by the user.
在S104中,将目标区域的图样替换为目标图样。In S104, the pattern of the target area is replaced with the target pattern.
在确定目标区域以及目标区域的目标图样后,即可将目标区域中的内容替换为目标图样。After determining the target area and the target pattern of the target area, the content in the target area can be replaced with the target pattern.
在一种实施方式中,将目标区域的图样替换为目标图样,可以包括:获取待设计区域在目标空间中的三维空间坐标与目标区域在空间分布图中的二维图像坐标的对应关系;以及,根据对应关系,将目标图样投影至在空间分布图中的目标区域。In one embodiment, replacing the pattern of the target area with the target pattern may include: obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map; and , according to the corresponding relationship, project the target pattern to the target area in the spatial distribution map.
针对获取待设计区域的三维空间坐标与目标区域在空间分布图中的二维图像坐标的对应关系,由于图像采集设备获取空间分布图的相机坐标系和在三维空间中的世界坐标系不同,因此,本公开实施例为了保证替换后的目标空间分布图的精度,可以获取待设计区域的三维空间坐标与目标区域的二维图像坐标的对应关系。Regarding obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed and the two-dimensional image coordinates of the target area in the spatial distribution map: since the camera coordinate system in which the image acquisition device captures the spatial distribution map differs from the world coordinate system of the three-dimensional space, in order to guarantee the accuracy of the replaced target spatial distribution map, the embodiments of the present disclosure can obtain the correspondence between the three-dimensional spatial coordinates of the area to be designed and the two-dimensional image coordinates of the target area.
在一种实施方式中,获取待设计区域在目标空间中的三维空间坐标与目标区域在空间分布图中的二维图像坐标的对应关系,可以包括:通过消失点检测算法,检测在空间分布图中的消失点;根据检测得到的消失点,确定图像采集设备的焦距以及相机坐标系和世界坐标系的旋转矩阵;以及,根据焦距和旋转矩阵,确定对应关系。In one embodiment, obtaining the correspondence between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map may include: detecting the vanishing points in the spatial distribution map with a vanishing point detection algorithm; determining, from the detected vanishing points, the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system; and determining the correspondence from the focal length and the rotation matrix.
也就是说,可以通过消失点检测算法检测在空间分布图中的消失点,再根据检测得到的消失点计算图像采集设备的焦距以及相机坐标系和世界坐标系的旋转矩阵;再根据计算得到的焦距和旋转矩阵确定对应关系。That is, the vanishing points in the spatial distribution map can be detected with a vanishing point detection algorithm; the focal length of the image acquisition device and the rotation matrix between the camera coordinate system and the world coordinate system are then computed from the detected vanishing points; and the correspondence is finally determined from the computed focal length and rotation matrix.
在一种实施方式中,可以根据检测得到的第一消失点和与之相互垂直的第二消失点(例如,水平消失点和与之相互垂直的消失点)来计算图像采集设备的焦距。具体地,假设相机坐标系的原点与世界坐标系的原点重合,相机内参矩阵为K(其中包含焦距f),相机坐标系与世界坐标系的旋转矩阵为R=[r1 r2 r3],位移为T=0。以第一消失点v1为例,其满足(在齐次坐标下,相差一个尺度因子):v1=K·r1。In one implementation, the focal length of the image acquisition device can be calculated from the detected first vanishing point and a second vanishing point perpendicular to it (for example, a horizontal vanishing point and a vanishing point perpendicular to it). Specifically, assume that the origin of the camera coordinate system coincides with the origin of the world coordinate system, the camera intrinsic matrix is K (which contains the focal length f), the rotation matrix between the camera coordinate system and the world coordinate system is R=[r1 r2 r3], and the translation is T=0. Taking the first vanishing point v1 as an example, it satisfies (in homogeneous coordinates, up to a scale factor): v1 = K·r1.
利用旋转矩阵的正交性,选取两个相互垂直的消失点vi和vj,由ri^T·rj=0可推导出vi^T·K^(-T)·K^(-1)·vj=0,即可求解得到图像采集设备的焦距f。Using the orthogonality of the rotation matrix, two vanishing points vi and vj corresponding to mutually perpendicular directions are selected. From ri^T·rj = 0 it can be derived that vi^T·K^(-T)·K^(-1)·vj = 0, from which the focal length f of the image acquisition device can be solved.
对于旋转矩阵,则可以通过至少两个消失点进行恢复,本公开实施例对此不作限定。具体地,由上述公式即可得知:r1=K^(-1)·v1(归一化后)。For the rotation matrix, it can be recovered from at least two vanishing points, which is not limited in the embodiments of the present disclosure. Specifically, from the above formula it follows that r1 = K^(-1)·v1 (after normalization).
此后,采用类似计算方式即可计算得到r2和r3,进而恢复得到旋转矩阵。After that, r 2 and r 3 can be calculated using similar calculation methods, and then the rotation matrix can be restored.
在一种实施方式中,待设计区域(例如,地面)的三维空间坐标P与目标区域在空间分布图中的2D像素坐标(也即,二维图像坐标)p之间的对应关系为:p=K[R T]P=K[r1 r2 r3]P。In one implementation, the correspondence between the three-dimensional spatial coordinates P of the area to be designed (for example, the floor) and the 2D pixel coordinates (that is, the two-dimensional image coordinates) p of the target area in the spatial distribution map is: p = K[R T]P = K[r1 r2 r3]P.
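The focal-length and rotation recovery described above can be sketched as follows, assuming square pixels, a known principal point, and an intrinsic matrix of the form K = [[f, 0, u0], [0, f, v0], [0, 0, 1]]; this is an illustrative reconstruction under those assumptions, not the patent's implementation:

```python
import numpy as np

def focal_from_vps(v1, v2, principal=(0.0, 0.0)):
    """Focal length from two vanishing points of mutually orthogonal
    directions (pixel coordinates). With K = [[f,0,u0],[0,f,v0],[0,0,1]],
    the constraint r_i^T r_j = 0 reduces to (v_i - p).(v_j - p) + f^2 = 0."""
    p = np.asarray(principal, dtype=float)
    d = np.dot(np.asarray(v1, dtype=float) - p, np.asarray(v2, dtype=float) - p)
    return np.sqrt(-d)

def rotation_from_vps(v1, v2, f, principal=(0.0, 0.0)):
    """Recover R = [r1 r2 r3]: each r_i is the normalized K^{-1} v_i
    (homogeneous), and r3 = r1 x r2 by orthonormality."""
    u0, v0 = principal
    K = np.array([[f, 0.0, u0], [0.0, f, v0], [0.0, 0.0, 1.0]])
    Kinv = np.linalg.inv(K)

    def column(v):
        r = Kinv @ np.array([v[0], v[1], 1.0])
        return r / np.linalg.norm(r)

    r1, r2 = column(v1), column(v2)
    return np.column_stack([r1, r2, np.cross(r1, r2)])
```

On synthetic vanishing points generated from a known K and R, these two functions recover f and R exactly (up to the sign ambiguity of each direction).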
在得到上述对应关系之后,即可将目标图样投影至在空间分布图中的目标区域。After obtaining the above corresponding relationship, the target pattern can be projected to the target area in the spatial distribution map.
在一种实施方式中,根据对应关系,将目标图样投影至在空间分布图中的目标区域,可以包括:根据对应关系,对目标图样进行投影变换,以得到目标区域的纹理前景;将空间分布图的亮度信息合成至纹理前景;以及,将合成亮度信息后的纹理前景与空间分布图进行合成,以得到目标空间分布图。In one embodiment, projecting the target pattern onto the target area in the spatial distribution map according to the correspondence may include: performing a projective transformation on the target pattern according to the correspondence to obtain the texture foreground of the target area; compositing the brightness information of the spatial distribution map into the texture foreground; and compositing the texture foreground carrying the brightness information with the spatial distribution map to obtain the target spatial distribution map.
将目标图样根据对应关系进行投影变换,得到目标区域的纹理前景,也即将目标图样按照上述对应关系进行投影变换,得到符合透视关系的目标区域的纹理前景。比如,请参考图5,将对图4所示的目标图样进行投影变换之后,即可得到图5所示的纹理前景。The target pattern is projected and transformed according to the corresponding relationship to obtain the texture foreground of the target area. That is, the target pattern is projected and transformed according to the above corresponding relationship to obtain the texture foreground of the target area that conforms to the perspective relationship. For example, please refer to Figure 5. After projecting the target pattern shown in Figure 4, the texture foreground shown in Figure 5 can be obtained.
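A minimal sketch of the projective transformation that turns the flat target pattern into a perspective-correct texture foreground, assuming the 3x3 homography H (built from K and [r1 r2 r3] as above) is already known; the nearest-neighbour sampling and texture-tiling choices are assumptions for illustration:

```python
import numpy as np

def warp_texture(texture, H, out_shape):
    """Inverse-map every output pixel through H^{-1} into the flat
    texture (nearest neighbour), tiling the texture so large floor
    areas stay covered. H maps texture coords (x, y, 1) to image
    coords; grayscale textures only, for brevity."""
    Hinv = np.linalg.inv(H)
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # 3 x N
    src = Hinv @ pts
    sx = np.round(src[0] / src[2]).astype(int)
    sy = np.round(src[1] / src[2]).astype(int)
    h_t, w_t = texture.shape[:2]
    return texture[np.mod(sy, h_t), np.mod(sx, w_t)].reshape(h_out, w_out)
```

With an identity homography the texture passes through unchanged; a translation homography shifts (and tiles) it, which is the degenerate, fronto-parallel case of the perspective warp.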
在一种实施方式中,将空间分布图的亮度信息合成至纹理前景,可以包括:将空间分布图的亮度信息按照透明度(alpha通道)合成至通过投影变换得到的纹理前景中,以模拟光照与阴影。 In one implementation, synthesizing the brightness information of the spatial distribution map into the texture foreground may include: synthesizing the brightness information of the spatial distribution map into the texture foreground obtained through projection transformation according to transparency (alpha channel) to simulate illumination and shadow.
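One hedged way to carry the photo's lighting and shadows into the warped texture is to modulate the texture by the photo's normalized luminance, with a `strength` parameter playing the role of the alpha channel mentioned above; the parameter name and the exact blending formula are assumptions, not the patent's specified method:

```python
import numpy as np

def apply_luminance(texture_fg, photo_gray, strength=1.0):
    """Modulate the warped texture foreground (H x W x 3, uint8) by the
    photo's normalized luminance (H x W, uint8) so the lighting and
    shadows of the original shot carry over. strength acts like the
    alpha of the luminance layer: 0 keeps the texture untouched,
    1 applies full shading."""
    lum = photo_gray.astype(float) / 255.0
    scale = (1.0 - strength) + strength * lum
    shaded = texture_fg.astype(float) * scale[..., None]
    return np.clip(shaded, 0, 255).astype(np.uint8)
```

A fully lit pixel (luminance 255) leaves the texture color intact, while a shadowed pixel darkens it proportionally.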
在一种实施方式中,为了进一步提高目标空间分布图的替换效果,将合成亮度信息后的纹理前景与空间分布图进行合成,以得到目标空间分布图,可以包括:确定在空间分布图中的前景和背景的比例关系;以及,根据比例关系,将合成亮度信息后的纹理前景与空间分布图进行合成。In one embodiment, in order to further improve the replacement effect of the target spatial distribution map, compositing the texture foreground carrying the brightness information with the spatial distribution map to obtain the target spatial distribution map may include: determining the proportional relationship between the foreground and the background in the spatial distribution map; and compositing the texture foreground carrying the brightness information with the spatial distribution map according to the proportional relationship.
在一种实施方式中,确定在空间分布图中的前景和背景的比例关系,可以包括:获取空间分布图的像素掩码图;以及,将空间分布图以及像素掩码图输入至抠图神经网络,以通过抠图神经网络确定比例关系。In one embodiment, determining the proportional relationship between the foreground and the background in the spatial distribution map may include: obtaining a pixel mask map of the spatial distribution map; and inputting the spatial distribution map and the pixel mask map into a matting neural network, so as to determine the proportional relationship through the matting neural network.
在一种实施方式中,像素掩码图可以为在S102中通过语义分割网络得到的掩码图,也即,获取空间分布图的像素掩码图可以包括:读取在S102中获取得到的掩码图。In one implementation, the pixel mask map may be the mask map obtained through the semantic segmentation network in S102; that is, obtaining the pixel mask map of the spatial distribution map may include: reading the mask map obtained in S102.
在一种实施方式中,将空间分布图以及像素掩码图输入至抠图神经网络,可以包括:对像素掩码图进行图像增强;以及,将空间分布图以及增强后的像素掩码图输入至抠图神经网络。In one implementation, inputting the spatial distribution map and the pixel mask map into the matting neural network may include: performing image enhancement on the pixel mask map; and inputting the spatial distribution map and the enhanced pixel mask map into the matting neural network.
也就是说,在实际应用中,在将空间分布图以及像素掩码图输入至抠图神经网络之前,可以先对像素掩码图进行图像增强,再将图像增强后的像素掩码图和空间分布图输入至抠图神经网络。That is, in practical applications, before the spatial distribution map and the pixel mask map are input into the matting neural network, image enhancement can first be performed on the pixel mask map, and the enhanced pixel mask map and the spatial distribution map are then input into the matting neural network.
具体地,对像素掩码图进行图像增强,可以包括:提取像素掩码图的拓扑骨架;对像素掩码图进行膨胀腐蚀;以及,叠加拓扑骨架和膨胀腐蚀后的像素掩码图,以得到增强后的像素掩码图。Specifically, performing image enhancement on the pixel mask map may include: extracting the topological skeleton of the pixel mask map; performing dilation and erosion on the pixel mask map; and superimposing the topological skeleton and the dilated-and-eroded pixel mask map to obtain the enhanced pixel mask map.
在实际应用中,还可以通过其它图像增强方式对像素掩码图进行图像增强,本公开实施例对其具体实现不作限定。In practical applications, the pixel mask map can also be image enhanced through other image enhancement methods, and the embodiments of the present disclosure do not limit its specific implementation.
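A possible sketch of the skeleton-plus-morphology enhancement, using plain 3x3 binary dilation and erosion; the trimap-style {0, 128, 255} encoding (sure background / unknown band / sure foreground) is an assumption about how the enhanced mask might feed the matting network, and the skeleton is taken as precomputed elsewhere (e.g. by morphological thinning):

```python
import numpy as np

def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 cross structuring element."""
    m = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(m, 1)
        m = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
             | p[1:-1, :-2] | p[1:-1, 2:])
    return m

def erode(mask, iterations=1):
    """Binary erosion with a 3x3 cross structuring element."""
    m = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(m, 1, constant_values=True)
        m = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
             & p[1:-1, :-2] & p[1:-1, 2:])
    return m

def enhance_mask(mask, skeleton, iterations=2):
    """Trimap-style enhancement: erosion gives sure foreground (255),
    dilation minus sure foreground gives an unknown band (128), and the
    topological skeleton is overlaid so thin structures survive the
    erosion."""
    sure_fg = erode(mask, iterations) | skeleton.astype(bool)
    unknown = dilate(mask, iterations) & ~sure_fg
    out = np.zeros(mask.shape, dtype=np.uint8)
    out[unknown] = 128
    out[sure_fg] = 255
    return out
```

Superimposing the skeleton is what keeps one-pixel-wide regions (e.g. the floor visible between table legs) from vanishing entirely under the erosion.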
任意一张图片C可以看作是由前景F和背景B两张图像经过alpha通道线性相加得到。比如,请参考图6,在本公开实施例中的前景和背景的比例关系指的是α的值,该值决定了图片的每个像素是由多少比例的前景像素和背景像素合成得到的。Any picture C can be regarded as the linear addition of two images, foreground F and background B, through the alpha channel. For example, please refer to Figure 6. In the embodiment of the present disclosure, the proportional relationship between the foreground and the background refers to the value of α, which determines the proportion of foreground pixels and background pixels that each pixel of the picture is synthesized from.
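The compositing equation C = αF + (1 − α)B described above can be written directly as:

```python
import numpy as np

def composite(foreground, background, alpha):
    """Per-pixel composite C = alpha * F + (1 - alpha) * B, where alpha
    is the matte in [0, 1] (e.g. as predicted by the matting network)."""
    a = np.asarray(alpha, dtype=float)
    if foreground.ndim == 3:           # broadcast the matte over channels
        a = a[..., None]
    c = a * foreground.astype(float) + (1.0 - a) * background.astype(float)
    return np.clip(c, 0, 255).astype(np.uint8)

fg = np.full((1, 1), 200, dtype=np.uint8)   # texture foreground pixel
bg = np.full((1, 1), 100, dtype=np.uint8)   # original photo pixel
c = composite(fg, bg, np.array([[0.5]]))    # half foreground, half background
```

Here α = 1 keeps the replaced texture, α = 0 keeps the original photo, and intermediate values blend the two, which is what makes edges and semi-transparent boundaries look natural.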
比如,在将如图3所示的掩码图的增强结果以及如图2所示的空间分布图输入至抠图神经网络之后,即可输出如图7所示的前景和背景的比例关系。 For example, after inputting the enhanced result of the mask map as shown in Figure 3 and the spatial distribution map as shown in Figure 2 to the matting neural network, the proportional relationship between the foreground and the background as shown in Figure 7 can be output.
将合成亮度信息后的纹理前景与空间分布图按照比例关系进行合成后,即可得到目标空间分布图。在上述举例中,在将如图2所示的空间分布图的地面替换为如图4所示的目标图样(也即,合成亮度信息后的、图5所示的纹理前景)之后,即可得到图8所示的目标空间分布图。After the texture foreground carrying the brightness information and the spatial distribution map are composited according to the proportional relationship, the target spatial distribution map is obtained. In the above example, after the floor of the spatial distribution map shown in Figure 2 is replaced with the target pattern shown in Figure 4 (that is, the texture foreground shown in Figure 5 after the brightness information has been composited), the target spatial distribution map shown in Figure 8 is obtained.
综上所述,通过获取通过图像采集设备采集到的目标空间的空间分布图,识别在空间分布图中的目标区域,获取目标图样,以及将目标区域的图样替换为目标图样,解决了相关技术中由于缺少实际的场景信息进而导致替换效果较差的问题,实现了可以结合拍摄得到的空间分布图进行替换进而提高替换结果的准确率的效果。另外,由于本公开实施例通过拍摄得到的空间分布图进行替换,使得用户可以直观地查看替换结果的整体效果,提高了用户的体验。To sum up, by obtaining the spatial distribution map of the target space captured by an image acquisition device, identifying the target area in the spatial distribution map, obtaining the target pattern, and replacing the pattern of the target area with the target pattern, the problem in the related art that the replacement effect is poor due to the lack of actual scene information is solved, achieving the effect that the replacement can be performed on the captured spatial distribution map, thereby improving the accuracy of the replacement result. In addition, since the embodiments of the present disclosure perform the replacement on the captured spatial distribution map, the user can intuitively view the overall effect of the replacement result, which improves the user experience.
此外,本公开实施例通过语义分割网络来识别目标区域,从而可以在短时间内得到替换结果,缩短了图样替换的耗时,提高了替换的效率。In addition, the embodiment of the present disclosure uses a semantic segmentation network to identify the target area, so that the replacement result can be obtained in a short time, shortening the time consumption of pattern replacement, and improving the efficiency of replacement.
本公开实施例还提供了一种图像处理装置,包括存储器和处理器,存储器中存储有至少一条程序指令,处理器通过加载并执行该至少一条程序指令以实现如上所述的方法。An embodiment of the present disclosure also provides an image processing device, including a memory and a processor. At least one program instruction is stored in the memory, and the processor loads and executes the at least one program instruction to implement the method as described above.
本公开实施例还提供了一种非瞬时计算机可读存储介质,计算机可读存储介质中存储有至少一条程序指令,该至少一条程序指令被处理器加载并执行以实现如上所述的方法。Embodiments of the present disclosure also provide a non-transitory computer-readable storage medium, in which at least one program instruction is stored, and the at least one program instruction is loaded and executed by the processor to implement the method as described above.
以上所述实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
以上所述实施例仅表达了本公开的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对本公开的保护范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本发明构思的前提下,还可以做出若干变形和改进,这些都属于本公开的保护范围。因此,本公开的保护范围应以所附权利要求为准。 The above-described embodiments only express several implementation modes of the present disclosure, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the present disclosure. It should be noted that, for those of ordinary skill in the art, several modifications and improvements can be made without departing from the concept of the present invention, and these all fall within the protection scope of the present disclosure. Therefore, the scope of protection of the present disclosure should be determined by the appended claims.

Claims (11)

  1. 一种图像处理方法,包括:An image processing method including:
    获取通过图像采集设备采集到的目标空间的空间分布图;Obtain the spatial distribution map of the target space collected by the image acquisition device;
    识别在所述空间分布图中的目标区域,其中,所述目标区域与在所述目标空间中的待设计区域相关联;identifying a target area in the spatial distribution map, wherein the target area is associated with an area to be designed in the target space;
    获取目标图样;以及Obtain the target pattern; and
    将所述目标区域的图样替换为所述目标图样。Replace the pattern of the target area with the target pattern.
  2. 根据权利要求1所述的方法,其中,识别在所述空间分布图中的所述目标区域,包括:The method of claim 1, wherein identifying the target area in the spatial distribution map includes:
    将所述空间分布图输入至语义分割网络,以通过所述语义分割网络识别在所述空间分布图中的至少一个像素所属的区域类别;以及inputting the spatial distribution map to a semantic segmentation network to identify, through the semantic segmentation network, a region category to which at least one pixel in the spatial distribution map belongs; and
    根据所述至少一个像素所属的区域类别,确定属于所述目标区域的像素。The pixels belonging to the target area are determined according to the area category to which the at least one pixel belongs.
  3. 根据权利要求1所述的方法,其中,将所述目标区域的图样替换为所述目标图样,包括:The method according to claim 1, wherein replacing the pattern of the target area with the target pattern includes:
    获取所述待设计区域在所述目标空间中的三维空间坐标与所述目标区域在所述空间分布图中的二维图像坐标的对应关系;以及Obtain the corresponding relationship between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map; and
    根据所述对应关系,将所述目标图样投影至在所述空间分布图中的所述目标区域。According to the corresponding relationship, the target pattern is projected to the target area in the spatial distribution map.
  4. 根据权利要求3所述的方法,其中,获取所述待设计区域在所述目标空间中的三维空间坐标与所述目标区域在所述空间分布图中的二维图像坐标的对应关系,包括:The method according to claim 3, wherein obtaining the corresponding relationship between the three-dimensional spatial coordinates of the area to be designed in the target space and the two-dimensional image coordinates of the target area in the spatial distribution map includes:
    通过消失点检测算法,检测在所述空间分布图中的消失点;Detect the vanishing point in the spatial distribution map through the vanishing point detection algorithm;
    根据检测得到的所述消失点,确定所述图像采集设备的焦距以及相机坐标系和世界坐标系的旋转矩阵;以及Determine the focal length of the image acquisition device and the rotation matrix of the camera coordinate system and the world coordinate system according to the detected vanishing point; and
    根据所述焦距和所述旋转矩阵,确定所述对应关系。 The corresponding relationship is determined based on the focal length and the rotation matrix.
  5. 根据权利要求3所述的方法,其中,根据所述对应关系,将所述目标图样投影至在所述空间分布图中的所述目标区域,包括:The method of claim 3, wherein projecting the target pattern to the target area in the spatial distribution map according to the corresponding relationship includes:
    根据所述对应关系,对所述目标图样进行投影变换,以得到所述目标区域的纹理前景;According to the corresponding relationship, perform projection transformation on the target pattern to obtain the texture foreground of the target area;
    将所述空间分布图的亮度信息合成至所述纹理前景;以及Synthesize the brightness information of the spatial distribution map to the texture foreground; and
    将合成亮度信息后的纹理前景与所述空间分布图进行合成,以得到目标空间分布图。The texture foreground after synthesizing the brightness information is synthesized with the spatial distribution map to obtain the target spatial distribution map.
  6. 根据权利要求5所述的方法,其中,将所述合成亮度信息后的纹理前景与所述空间分布图进行合成,包括:The method according to claim 5, wherein synthesizing the texture foreground after synthesizing the brightness information and the spatial distribution map includes:
    确定在所述空间分布图中的前景和背景的比例关系;以及determining the proportional relationship between foreground and background in the spatial distribution map; and
    根据所述比例关系,将所述合成亮度信息后的纹理前景与所述空间分布图进行合成。According to the proportional relationship, the texture foreground after synthesizing the brightness information is synthesized with the spatial distribution map.
  7. 根据权利要求6所述的方法,其中,确定在所述空间分布图中的所述前景和所述背景的所述比例关系,包括:The method of claim 6, wherein determining the proportional relationship between the foreground and the background in the spatial distribution map includes:
    获取所述空间分布图的像素掩码图;以及Obtain a pixel mask map of the spatial distribution map; and
    将所述空间分布图以及所述像素掩码图输入至抠图神经网络,以通过所述抠图神经网络确定所述比例关系。The spatial distribution map and the pixel mask map are input to a matting neural network to determine the proportional relationship through the matting neural network.
  8. 根据权利要求7所述的方法,其中,将所述空间分布图以及所述像素掩码图输入至所述抠图神经网络,包括:The method according to claim 7, wherein inputting the spatial distribution map and the pixel mask map to the matting neural network includes:
    对所述像素掩码图进行图像增强;以及Perform image enhancement on the pixel mask map; and
    将所述空间分布图以及增强后的像素掩码图输入至所述抠图神经网络。The spatial distribution map and the enhanced pixel mask map are input to the matting neural network.
  9. The method according to claim 8, wherein performing image enhancement on the pixel mask map comprises:
    extracting a topological skeleton of the pixel mask map;
    performing dilation and erosion on the pixel mask map; and
    superimposing the topological skeleton and the dilated-and-eroded pixel mask map to obtain the enhanced pixel mask map.
  10. An image processing apparatus, comprising a memory and a processor, wherein the memory stores at least one program instruction, and the processor loads and executes the at least one program instruction to implement the method according to any one of claims 1 to 9.
  11. A computer-readable storage medium storing at least one program instruction, wherein the method according to any one of claims 1 to 9 is implemented when the at least one program instruction is loaded and executed by a processor.
PCT/CN2023/103359 2022-08-26 2023-06-28 Image processing method and apparatus, and storage medium WO2024041181A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211035709.6 2022-08-26
CN202211035709.6A CN115359169A (en) 2022-08-26 2022-08-26 Image processing method, apparatus and storage medium

Publications (1)

Publication Number Publication Date
WO2024041181A1 true WO2024041181A1 (en) 2024-02-29

Family

ID=84005320

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/103359 WO2024041181A1 (en) 2022-08-26 2023-06-28 Image processing method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN115359169A (en)
WO (1) WO2024041181A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359169A (en) * 2022-08-26 2022-11-18 杭州群核信息技术有限公司 Image processing method, apparatus and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190026900A1 (en) * 2017-06-19 2019-01-24 Digitalbridge System and method for modeling a three dimensional space based on a two dimensional image
CN112116620A (en) * 2020-09-16 2020-12-22 北京交通大学 Indoor image semantic segmentation and painting display method
CN112712487A (en) * 2020-12-23 2021-04-27 北京软通智慧城市科技有限公司 Scene video fusion method and system, electronic equipment and storage medium
CN115359169A (en) * 2022-08-26 2022-11-18 杭州群核信息技术有限公司 Image processing method, apparatus and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI, YI (李熠): "Research on Indoor 3D Reconstruction Algorithm Based on Panoramic Image" (non-official translation of 基于全景图像的室内三维重建算法研究), China Master's Theses Full-Text Database (Electronic Journal), 15 February 2021 *
HUANG, XIXI (黄茜茜): "Virtual Home Decorating System for Household Design and Decoration" (non-official translation of 虚拟家装系统的户型设计与装修), China Master's Theses Full-Text Database (Electronic Journal), 15 January 2014 *

Also Published As

Publication number Publication date
CN115359169A (en) 2022-11-18

Similar Documents

Publication Publication Date Title
US10803659B2 (en) Automatic three-dimensional solid modeling method and program based on two-dimensional drawing
JP4642757B2 (en) Image processing apparatus and image processing method
US9443353B2 (en) Methods and systems for capturing and moving 3D models and true-scale metadata of real world objects
CN109242961B (en) Face modeling method and device, electronic equipment and computer readable medium
Dou et al. Scanning and tracking dynamic objects with commodity depth cameras
US10580205B2 (en) 3D model generating system, 3D model generating method, and program
JP4770960B2 (en) Image search system and image search method
JP6196416B1 (en) 3D model generation system, 3D model generation method, and program
JP5299173B2 (en) Image processing apparatus, image processing method, and program
WO2024041181A1 (en) Image processing method and apparatus, and storage medium
Vidanapathirana et al. Plan2scene: Converting floorplans to 3d scenes
JP6425511B2 (en) Method of determining feature change and feature change determination apparatus and feature change determination program
JP6556680B2 (en) VIDEO GENERATION DEVICE, VIDEO GENERATION METHOD, AND PROGRAM
Park Interactive 3D reconstruction from multiple images: A primitive-based approach
US20240062345A1 (en) Method, apparatus, and computer-readable medium for foreground object deletion and inpainting
JP2021152935A (en) Information visualization system, information visualization method, and program
JP2018185658A (en) Information processing apparatus, information processing method, and program
TWI468849B (en) Building texture extracting apparatus and method thereof
Bui et al. Integrating videos with LIDAR scans for virtual reality
Kim et al. Planar Abstraction and Inverse Rendering of 3D Indoor Environments
JP2016071496A (en) Information terminal device, method, and program
JP7344620B1 (en) Building structure recognition system and building structure recognition method
JP7403108B2 (en) Building structure recognition system and building structure recognition method
Oniga A new approach for the semi-automatic texture generation of the buildings facades, from terrestrial laser scanner data
Mihut et al. Lighting and Shadow Techniques for Realistic 3D Synthetic Object Compositing in Images

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23856272

Country of ref document: EP

Kind code of ref document: A1