CN117115287A - Image generation method, device, electronic equipment and readable storage medium - Google Patents

Image generation method, device, electronic equipment and readable storage medium

Info

Publication number: CN117115287A (application number CN202311079293.2A)
Authority: CN (China)
Prior art keywords: image, feature map, electronic device, foreground, generating
Legal status: Pending (the legal status is an assumption and not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventor: 张昊若 (Zhang Haoruo)
Current and original assignee: Vivo Mobile Communication Co., Ltd.
Application filed by Vivo Mobile Communication Co., Ltd.; priority to CN202311079293.2A

Classifications

    (All codes fall under G: Physics; G06: Computing, calculating or counting; with G06T: Image data processing or generation in general, G06N: Computing arrangements based on specific computational models, and G06T 2207/00: Indexing scheme for image analysis or image enhancement.)

    • G06T 11/00: 2D [Two Dimensional] image generation
    • G06N 3/0464: Neural networks; architecture; convolutional networks [CNN, ConvNet]
    • G06N 3/08: Neural networks; learning methods
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 2207/20081: Special algorithmic details; training, learning
    • G06T 2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/20221: Image combination; image fusion, image merging

Abstract

The present application discloses an image generation method and apparatus, an electronic device, and a readable storage medium, and belongs to the field of image processing. The image generation method provided by embodiments of the present application includes: sending a first feature map corresponding to a first image to a second electronic device, where the first image includes a foreground image and a first background image, and the first feature map is used by the second electronic device to generate a second feature map; receiving the second feature map from the second electronic device and generating a second background image from the second feature map, where the second background image and the first background image have different image content; and generating a second image based on the foreground image and the second background image.

Description

Image generation method, device, electronic equipment and readable storage medium
Technical Field
The present application belongs to the field of image processing, and in particular relates to an image generation method and apparatus, an electronic device, and a readable storage medium.
Background
Currently, to avoid leaking user privacy, an Artificial Intelligence Generated Content (AIGC) large model may be deployed on an electronic device. When a user wants to generate a richer version of a captured user image, the user does not need to send the captured image to other devices; instead, the electronic device can be triggered to input the captured user image directly into the on-device AIGC large model, which processes it to generate the richer user image the user wants.
However, because the electronic device runs the AIGC large model locally on the captured user image, and the model may need a long time to process that image, generating the desired user image can be slow. The electronic device is therefore inefficient at generating the images the user wants.
Disclosure of Invention
Embodiments of the present application aim to provide an image generation method and apparatus, an electronic device, and a readable storage medium that solve the problem of electronic devices being inefficient at generating the images a user wants.
In a first aspect, an embodiment of the present application provides an image generation method, applied to a first electronic device, including: sending a first feature map corresponding to a first image to a second electronic device, where the first image includes a foreground image and a first background image, and the first feature map is used by the second electronic device to generate a second feature map; receiving the second feature map from the second electronic device and generating a second background image from the second feature map, where the second background image and the first background image have different image content; and generating a second image based on the foreground image and the second background image.
In a second aspect, an embodiment of the present application provides an image generation method, applied to a second electronic device, including: receiving, from a first electronic device, a first feature map corresponding to a first image, where the first image includes a foreground image and a first background image; inputting the first feature map into a first model to obtain a second feature map output by the first model, where the second feature map is generated by the first model processing the first feature map; and sending the second feature map to the first electronic device, where the second feature map is used by the first electronic device to generate a second background image and to generate a second image based on the foreground image and the second background image, the second background image and the first background image having different image content.
In a third aspect, an embodiment of the present application provides an image generation apparatus, which is a first image generation apparatus including a sending module, a receiving module, and a processing module. The sending module is configured to send a first feature map corresponding to a first image to a second image generation apparatus, where the first image includes a foreground image and a first background image, and the first feature map is used by the second image generation apparatus to generate a second feature map. The receiving module is configured to receive the second feature map from the second image generation apparatus. The processing module is configured to generate a second background image from the second feature map received by the receiving module, where the second background image and the first background image have different image content, and to generate a second image based on the foreground image and the second background image.
In a fourth aspect, an embodiment of the present application provides an image generation apparatus, which is a second image generation apparatus including a receiving module, a processing module, and a sending module. The receiving module is configured to receive, from a first image generation apparatus, a first feature map corresponding to a first image, where the first image includes a foreground image and a first background image. The processing module is configured to input the first feature map received by the receiving module into a first model to obtain a second feature map output by the first model, where the second feature map is generated by the first model processing the first feature map. The sending module is configured to send the second feature map generated by the processing module to the first image generation apparatus, where the second feature map is used by the first image generation apparatus to generate a second background image and to generate a second image based on the foreground image and the second background image, the second background image and the first background image having different image content.
In a fifth aspect, an embodiment of the present application provides an electronic device including a processor and a memory storing a program or instructions executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect or the second aspect.
In a sixth aspect, embodiments of the present application provide a readable storage medium storing a program or instructions that, when executed by a processor, implement the steps of the method according to the first aspect or the second aspect.
In a seventh aspect, embodiments of the present application provide a chip including a processor and a communication interface coupled to the processor, where the processor is configured to execute a program or instructions to implement the steps of the method according to the first aspect or the second aspect.
In an eighth aspect, embodiments of the present application provide a computer program product stored in a storage medium, where the program product is executable by at least one processor to implement the steps of the method according to the first aspect or the second aspect.
In embodiments of the present application, the first electronic device may send a first feature map corresponding to a first image to the second electronic device, where the first image includes a foreground image and a first background image and the first feature map is used by the second electronic device to generate a second feature map. The second electronic device may receive the first feature map, input it into a first model to obtain the second feature map generated by the first model processing the first feature map, and send the second feature map back to the first electronic device. The first electronic device may then generate a second background image from the second feature map, where the second background image and the first background image have different image content, and generate a second image based on the foreground image and the second background image. Because the first electronic device sends the first feature map to the second electronic device, which has greater computing power, the first model on the second electronic device needs only a short computation time to process the first feature map and generate the second feature map. The first electronic device can therefore quickly generate, from the second feature map, a second background image whose content differs from that of the first background image, without computing the second background image itself; this reduces the time spent generating the second background image and lets the first electronic device quickly generate, from the foreground image and the second background image, the second image the user wants. Moreover, because what is transmitted between the first and second electronic devices is the first and second feature maps rather than the first image and the second background image, neither the first image nor the second background image can leak while the second image is being generated. In this way, the efficiency with which the first electronic device generates the image the user wants can be improved while avoiding leaks of the user's privacy.
Drawings
FIG. 1 is a first schematic flowchart of an image generation method according to an embodiment of the present application;
FIG. 2 is a second schematic flowchart of an image generation method according to an embodiment of the present application;
FIG. 3 is a third schematic flowchart of an image generation method according to an embodiment of the present application;
FIG. 4 is a fourth schematic flowchart of an image generation method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a first image generation apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a second image generation apparatus according to an embodiment of the present application;
FIG. 7 is a first schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
FIG. 8 is a second schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a hardware structure of a server according to an embodiment of the present application.
Detailed Description
The technical solutions of the embodiments of the present application are described below with reference to the accompanying drawings. The described embodiments are clearly only some, not all, of the embodiments of the present application; all other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the scope of protection of the present application.
The terms "first", "second", and the like in the description and claims distinguish between similar objects and do not describe a particular order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that the embodiments of the application can be practiced in orders other than those illustrated or described here. Objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first image may be one image or multiple images. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The term "at least one" and the like in the description and claims cover any one, any two, or any combination of two or more of the listed objects. For example, "at least one of a, b, and c" may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be singular or plural. Similarly, "at least two" means two or more, and its meaning is analogous to that of "at least one".
The image generating method, the device, the electronic equipment and the readable storage medium provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
In the image generation method provided by embodiments of the present application, the execution subjects may be the first and second image generation apparatuses, the first and second electronic devices, or functional modules or entities within the first and second electronic devices. The embodiments below describe the method using the example of the first and second electronic devices executing it.
Fig. 1 shows a flowchart of an image generating method according to an embodiment of the present application. As shown in fig. 1, the image generating method provided by the embodiment of the present application may include the following steps 101 to 106.
Step 101, the first electronic device sends a first feature map corresponding to the first image to the second electronic device.
In some embodiments of the present application, the first electronic device may be a device used by a user, for example a mobile phone, a computer, or a tablet computer. The second electronic device may be, for example, an electronic device used by another user, or a server.
In some embodiments of the present application, the computing power of the second electronic device is greater than the computing power of the first electronic device.
In some embodiments of the present application, the first image may be an image captured by the first electronic device through a camera, or an image received from another device. For example, the first image may be a 4K image obtained by the first electronic device through shooting with a camera.
In an embodiment of the present application, the first image includes: a foreground image and a first background image; the first feature map is used for generating a second feature map by the second electronic device.
It will be appreciated that the first image is made up of a foreground image and a first background image.
In some embodiments of the present application, the foreground image may include at least one of: a person image and a foreground object image. The "person image" should be understood as the image region where a person in the foreground image is located, and the "foreground object image" as the image region where an object in the foreground image is located; the object may include at least one of: animals, scenery, and the like.
In some embodiments of the present application, the first feature map may be a feature map obtained by extracting features of the first image by the first electronic device through the second model. Wherein the second model may be a convolutional network model.
A specific scheme of the first electronic device obtaining the first feature map will be illustrated below.
In some embodiments of the present application, as shown in fig. 2 in conjunction with fig. 1, before the step 101, the image generating method provided in the embodiment of the present application may further include the following step 201.
Step 201, the first electronic device inputs the first image into the encoder, and obtains a first feature map output by the encoder.
In an embodiment of the present application, the first feature map is obtained by encoding the first image by an encoder.
In some embodiments of the present application, the encoder may specifically be a variational auto-encoder (Variational Auto-Encoder, VAE) in a convolutional network model.
Optionally, the VAE may be a preset convolutional network model trained by the first electronic device. It will be appreciated that the VAE in the first electronic device differs from the VAEs in other devices, i.e. other devices cannot decode the feature map through their own VAEs to obtain a visible image.
In some embodiments of the application, the first feature map is smaller in size than the first image. It will be appreciated that the amount of data required to transmit the first profile is less than the amount of data required to transmit the first image.
In embodiments of the present application, the first electronic device can encode the first image into a latent-space feature, namely the first feature map, through the VAE; because the first feature map cannot be decoded into a visible image on other devices, leaks of the user's privacy can be avoided.
It can be seen that, because the first electronic device encodes the first image into a first feature map of a certain size through the encoder, after the first feature map is transmitted to the second electronic device it cannot be decoded into a visible image on the second electronic device or any other device. Transmission resources are saved while the user's privacy is protected, so transmission efficiency and the security of the first electronic device improve together.
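To make the encoding step concrete, here is a minimal PyTorch sketch. The `LatentEncoder` architecture, the 512x512 input size, and every name in it are illustrative assumptions rather than the patent's actual VAE; a real VAE encoder would also predict a mean and variance and sample from them.

```python
import torch
import torch.nn as nn

class LatentEncoder(nn.Module):
    """Toy stand-in for the on-device VAE encoder (architecture is assumed)."""
    def __init__(self, latent_channels: int = 4):
        super().__init__()
        # Three strided convolutions shrink a 3x512x512 image to a 4x64x64 latent map.
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, latent_channels, 3, stride=2, padding=1),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.net(image)

encoder = LatentEncoder()
first_image = torch.rand(1, 3, 512, 512)  # stand-in for the captured first image
first_feature_map = encoder(first_image)  # 1x4x64x64, much smaller than the image
# Only this latent tensor is transmitted; without the matching decoder weights,
# it cannot be turned back into a visible image.
```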
In some embodiments of the present application, the second feature map may be a feature map corresponding to another background image, where the other background image may be the second background image of the embodiments below, and the other background image and the first background image have different image content. The image content may include at least one of: image color, image style, and image scene, and the image style may include at least one of: the color of the image, the hue of the image, the layout of the content, and the like.
It will be appreciated that the other background image is similar in structure to the first background image but richer in content.
In some embodiments of the present application, when a user wants to generate a second multimedia file with richer content based on a first multimedia file, the user may trigger the first electronic device to select the first multimedia file, so that the first electronic device acquires the first feature map corresponding to the first image and sends it to the second electronic device through a first application program.
Optionally, the first multimedia file may be either the first image or a first video; when the first multimedia file is a first video, the first image may be any one of the frames of that video. The first multimedia file may be captured by a camera of the first electronic device or received from another device.
Optionally, the second multimedia file may be either another image or a second video; the other image may be the second image of the embodiments below, and when the second multimedia file is a second video, the other image may be the frame of the second video whose playback timestamp matches that of the first image.
Optionally, the first application may be, for example, a chat application, an image processing application, or a shooting application.
In some embodiments of the present application, the first electronic device may further send first information to the second electronic device, where the first information is used together with the first feature map to generate the second feature map. The first information may include at least one of: a depth map corresponding to the first image and first text data; illustratively, the first information includes both. It is understood that the second electronic device may generate the second feature map based on the first information and the first feature map.
Optionally, the first information is used by the second electronic device to determine the image content of the other background image.
Optionally, when the first image is captured by a camera of the first electronic device, the first electronic device may also acquire the depth map through its time-of-flight (TOF) sensor at capture time. When the first image is received from another device, the first electronic device may also receive the depth map from that device.
Alternatively, the first text data may be text data input by the user in the first electronic device.
Step 102, the second electronic device receives a first feature map corresponding to the first image from the first electronic device.
In some embodiments of the present application, the computing power of the second electronic device is greater than the computing power of the first electronic device.
In an embodiment of the present application, the first image includes: a foreground image and a first background image.
Step 103, the second electronic device inputs the first feature map to the first model to obtain a second feature map output by the first model.
In an embodiment of the present application, the second feature map is generated by processing the first feature map by the first model.
It will be appreciated that since the computational power of the second electronic device is greater than the computational power of the first electronic device, after the second electronic device inputs the first feature map into the first model, the first model may quickly process the first feature map to quickly generate and output the second feature map.
In some embodiments of the present application, the first model may specifically be an Artificial Intelligence Generated Content (AIGC) large model.
It should be noted that, for the description of processing the first feature map to generate the second feature map for the AIGC large model, reference may be made to the specific description in the related art, and the embodiments of the present application are not repeated here.
In one example, the second electronic device may directly input the first feature map to the first model to obtain the second feature map output by the first model.
In another example, the second electronic device may first obtain the feature region in the first feature map that corresponds to the foreground image in the first image, derive from it the feature region corresponding to the first background image, and then input the background feature region into the first model to obtain the second feature map output by the first model.
Optionally, the second electronic device may detect the first feature map using a salient object detection (Salient Object Detection, SOD) algorithm to determine the feature region corresponding to the foreground image. Alternatively, the second electronic device may receive, from the first electronic device, information indicating the position of the person image in the first image, and determine the corresponding feature region of the foreground image in the first feature map from that information.
It should be noted that, for a description of the SOD algorithm, reference may be made to the related art; the embodiments of the present application do not repeat it here.
In some embodiments of the present application, the second electronic device may further receive the first information from the first electronic device, so that it may input the first feature map and the first information into the first model to obtain the second feature map output by the first model. The first information may include at least one of: a depth map corresponding to the first image and first text data, and is used by the first model to determine the image content of the other background image; illustratively, the first information includes both.
Optionally, in the case that the first information does not include the depth map corresponding to the first image, the second electronic device may detect the first feature map by using the monocular depth estimation module to obtain the depth map corresponding to the first image, and then input the first feature map, the depth map corresponding to the first image, and the first text data to the first model to obtain the second feature map.
It should be noted that, for the description of the depth map and the first text data, reference may be made to the specific description in the foregoing embodiments, and the embodiments of the present application are not repeated herein.
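For intuition, the server-side step might look like the sketch below, which reuses `first_feature_map` from the encoder sketch above. `FirstModelStub` is a hypothetical placeholder: the patent's first model is an AIGC large model whose architecture is not specified here, and a real one would also condition on the depth map and text prompt.

```python
import torch
import torch.nn as nn

class FirstModelStub(nn.Module):
    """Placeholder for the AIGC large model; not the real architecture."""
    def __init__(self, channels: int = 4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.SiLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        # A real AIGC model would condition on the first information
        # (depth map, text) while regenerating the background features.
        return self.body(feature_map)

first_model = FirstModelStub()
with torch.no_grad():  # inference only on the second electronic device
    second_feature_map = first_model(first_feature_map)
# second_feature_map is what gets sent back to the first electronic device.
```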
Step 104, the second electronic device sends the second feature map to the first electronic device.
In the embodiment of the application, the second feature map is used for generating a second background image by the first electronic device, and generating the second image based on the foreground image and the second background image, wherein the image content of the second background image is different from that of the first background image.
Step 105, the first electronic device receives the second feature map from the second electronic device, and generates a second background image according to the second feature map.
In the embodiment of the present application, the image content of the second background image is different from the image content of the first background image.
In some embodiments of the present application, the second background image may be an image generated by the first electronic device processing the second feature map through the second model.
A specific scheme of the first electronic device generating the second background image will be exemplified below.
In some embodiments of the present application, as shown in fig. 3 in conjunction with fig. 1, the above step 105 may be implemented specifically by the following step 105 a.
Step 105a, the first electronic device receives the second feature map from the second electronic device, and inputs the second feature map to the decoder of the first electronic device, so as to obtain a second background image output by the decoder.
In the embodiment of the present application, the second background image is obtained by decoding the second feature map by a decoder.
In some embodiments of the present application, the decoder may be the decoding part of the VAE described above.
In some embodiments of the application, the second feature map has a smaller size than the second background image. It will be appreciated that the amount of data required to transmit the second signature is less than the amount of data required to transmit the second background image.
It can be seen that the first electronic device can decode the second feature map into the second background image through its decoder, while the second electronic device and other devices cannot decode the second feature map into a visible image; leaks of the user's privacy are thus avoided and the security of the first electronic device improves.
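A matching decoder sketch, under the same illustrative assumptions as the encoder above (the transposed-convolution architecture is made up for symmetry):

```python
import torch
import torch.nn as nn

class LatentDecoder(nn.Module):
    """Toy stand-in for the on-device VAE decoder, mirroring LatentEncoder."""
    def __init__(self, latent_channels: int = 4):
        super().__init__()
        # Three transposed convolutions grow a 4x64x64 latent map back to 3x512x512.
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.SiLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        return self.net(feature_map)

decoder = LatentDecoder()
second_background = decoder(second_feature_map)  # a visible 1x3x512x512 image again
```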
Step 106, the first electronic device generates a second image based on the foreground image and the second background image.
In some embodiments of the application, the first electronic device may fuse the foreground image and the second background image to generate the second image.
Optionally, the first electronic device may superimpose the foreground image on the second background image to fuse the foreground image and the second background image.
In embodiments of the present application, the people and objects in the foreground image of the first image are usually the people and objects the user wants, i.e. the user may not want them changed. The first electronic device therefore generates the second image based on the foreground image and the second background image, so that the people and objects in the second image are identical to those in the foreground image of the first image.
Embodiments of the present application provide an image generation method. The first electronic device may send a first feature map corresponding to a first image to the second electronic device, where the first image includes a foreground image and a first background image and the first feature map is used by the second electronic device to generate a second feature map. The second electronic device may receive the first feature map and input it into the first model to obtain the second feature map generated by processing the first feature map, then send the second feature map to the first electronic device, which may generate a second background image whose content differs from that of the first background image and generate the second image based on the foreground image and the second background image. Because the first electronic device sends the first feature map to the second electronic device, which has greater computing power, the first model needs only a short computation time to process the first feature map and generate the second feature map; the first electronic device can then quickly generate the second background image from the second feature map without computing it itself, reducing the time spent and allowing the second image the user wants to be generated quickly. Moreover, because the first and second feature maps, not the first image and the second background image, are what is transmitted between the two devices, neither image can leak while the second image is being generated. In this way, the efficiency with which the first electronic device generates the image the user wants can be improved while avoiding leaks of the user's privacy.
A specific scheme of the second electronic device generating the second feature map will be illustrated below.
In some embodiments of the present application, the foreground image includes a person image. Optionally, as shown in fig. 4 in conjunction with fig. 1, before step 103, the image generation method provided by the embodiments of the present application may further include the following steps 301 to 304, and step 103 may be implemented by the following steps 103a and 103b.
Step 301, the first electronic device determines a first mask image according to the first image.
It should be noted that, with respect to the execution sequence of step 101 and step 301, embodiments of the present application are not limited herein. In one example, step 101 may be performed first, followed by step 301; in another example, step 301 may be performed first, followed by step 101; in yet another example, step 301 may be performed at the same time as step 101 is performed. In fig. 4, step 101 is performed before step 301 is performed.
In embodiments of the present application, the first mask image indicates the position of the person image in the first image.
In some embodiments of the present application, each pixel in the first mask image corresponds to one pixel in the first image, and in the first mask image, a pixel value of a pixel corresponding to a pixel in the person image may be "1", and a pixel value of a pixel corresponding to a pixel in the other image may be "0".
In some embodiments of the present application, the first electronic device may process the first image using a portrait segmentation algorithm module to determine the first mask image.
It should be noted that, for a description of the portrait segmentation algorithm, reference may be made to the related art; the embodiments of the present application do not repeat it here.
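The patent does not name a segmentation model, so purely as an assumption, one plausible stand-in for the portrait segmentation module is torchvision's off-the-shelf DeepLabV3, whose VOC class index 15 is "person":

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# Hypothetical portrait-segmentation module; a production pipeline would also
# normalize the input with ImageNet statistics before running the network.
seg_model = deeplabv3_resnet50(weights="DEFAULT").eval()

with torch.no_grad():
    logits = seg_model(first_image)["out"]         # 1x21xHxW class scores
first_mask = (logits.argmax(dim=1) == 15).float()  # 1xHxW: 1 = person pixel, 0 = other
```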
Step 302, the first electronic device sends a first mask image to the second electronic device.
In the embodiment of the application, the first mask image is used for determining the first feature area corresponding to the foreground image in the first feature map by the second electronic device.
In the embodiment of the present application, the second feature map is generated by the second electronic device according to a second feature area corresponding to the first background image in the first feature map, where the second feature area is determined by the second electronic device according to the first feature area.
In embodiments of the present application, because the computing power of the first electronic device is limited, it might need a long time to locate the foreground image in the first image. Instead, the first electronic device can quickly determine the first mask image and send it to the second electronic device, so that the second electronic device, with its greater computing power, determines the first feature region corresponding to the foreground image in the first feature map.
Step 303, the second electronic device receives the first mask image from the first electronic device.
In embodiments of the present application, the first mask image indicates the position of the person image in the first image.
Step 304, the second electronic device determines a first feature area corresponding to the foreground image in the first feature map based on the first mask image.
In some embodiments of the present application, the foreground image may include a person image and a foreground object image. The second electronic device may determine, based on the first mask image, the feature region corresponding to the person image in the first feature map, then detect the first feature map to determine the feature region corresponding to the foreground object image, and finally determine the first feature region corresponding to the whole foreground image from those two regions.
In some embodiments of the present application, the foreground image further includes a foreground object image. Optionally, step 304 may be implemented by the following steps 304a and 304b.
Step 304a: the second electronic device determines, from the first feature map and according to the first mask image, a third feature region corresponding to the person image.
In some embodiments of the present application, the second electronic device may locate, from the positions of the pixels whose value is "1" in the first mask image, the positions of the corresponding points in the first feature map, and determine the region formed by those points as the third feature region.
Step 304b: the second electronic device determines the third feature region, together with a fourth feature region corresponding to the foreground object image in the first feature map, as the first feature region.
In some embodiments of the present application, the second electronic device may detect the first feature map with an SOD algorithm through a foreground segmentation module to determine the fourth feature region, and then determine the region formed by the third and fourth feature regions as the first feature region.
It can be seen that the second electronic device can accurately determine, from the first mask image, the third feature region corresponding to the person image, and therefore accurately determine the first feature region corresponding to the whole foreground image from the third feature region and the fourth feature region corresponding to the foreground object image.
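Continuing the earlier sketches, the mapping from mask to feature regions could look as follows. The nearest-neighbour downsampling and the `detect_salient_region` stand-in for the SOD algorithm are both assumptions.

```python
import torch
import torch.nn.functional as F

# Downsample the full-resolution person mask to the latent grid, so each mask
# cell lines up with one feature-map location: the third feature region.
latent_h, latent_w = first_feature_map.shape[-2:]
third_region = F.interpolate(first_mask.unsqueeze(1),
                             size=(latent_h, latent_w), mode="nearest")  # 1x1x64x64

def detect_salient_region(feature_map: torch.Tensor) -> torch.Tensor:
    """Hypothetical SOD stand-in: flags high-energy cells as foreground object."""
    energy = feature_map.abs().mean(dim=1, keepdim=True)
    return (energy > energy.mean()).float()

fourth_region = detect_salient_region(first_feature_map)  # foreground-object cells
first_region = torch.clamp(third_region + fourth_region, max=1.0)  # union of both
```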
Step 103a, the second electronic device acquires a second feature area except the first feature area from the first feature map.
In the embodiment of the present application, the second feature area is a feature area corresponding to the first background image in the first feature map.
In some embodiments of the present application, the second electronic device may set the feature value of each feature point in the first feature area to "0" first, and then randomly set the feature value of each feature point in the first feature area, so as to obtain the second feature area.
Step 103b, the second electronic device inputs the second feature region to the first model to obtain a second feature map output by the first model.
In embodiments of the present application, to keep the people and objects of the first image's foreground out of the second background image, and thus keep ghosting out of the second image, the first electronic device can indicate to the second electronic device, through the first mask image, where the person in the foreground image is located in the first image. The second electronic device can then determine the first feature region corresponding to the foreground image and delete it from the first feature map, so that the second feature map generated by the first model contains no feature region corresponding to the foreground image, and the second background image in turn contains none of the foreground's people and objects.
It can be seen that, because the first electronic device sends the first mask image to the second electronic device, the second electronic device can determine the first feature region corresponding to the foreground image and extract the second feature region, i.e. the rest of the first feature map. After the second feature region is input into the first model, the resulting second feature map contains no feature region corresponding to the foreground image, so the second background image generated by the first electronic device contains none of the foreground's people and objects, and ghosting in the generated second image is avoided. Also, because the first feature region is determined jointly by the first and second electronic devices rather than by the first electronic device alone, the time spent determining it, and hence the time spent generating the second feature map, is reduced. In this way, the second image takes less time to generate while its quality further improves.
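A sketch of that region replacement, continuing the variables above. The zero-then-randomize step follows the description in step 103a; feeding the result straight into `first_model` is an illustrative simplification.

```python
import torch

# Zero out the foreground cells of the latent map, then fill them with random
# values, so only the background cells carry meaningful features.
background_only = (first_feature_map * (1.0 - first_region)
                   + torch.randn_like(first_feature_map) * first_region)
with torch.no_grad():
    second_feature_map = first_model(background_only)  # step 103b
```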
In embodiments of the present application, after the second electronic device receives the first mask image, it may update the first mask image according to the feature region corresponding to the foreground object image in the first feature map, so that the updated mask indicates the position of the whole foreground image in the first image. The first electronic device can then accurately fuse the foreground image and the second background image based on the updated mask to obtain the second image the user wants, as illustrated below.
In some embodiments of the present application, the foreground image further includes a foreground image. Optionally, after the step 303, the image generating method provided by the embodiment of the present application may further include steps 401 to 404 described below, and the step 106 may be specifically implemented by the step 106a described below.
Step 401, the second electronic device determines a third mask image according to a fourth feature area corresponding to the foreground image in the first feature map.
In an embodiment of the present application, the third mask image indicates a position of the foreground object image in the first image.
It should be noted that, for the description of determining the fourth feature area for the second electronic device, reference may be made to the specific description in the foregoing embodiment, and the embodiment of the present application is not repeated herein.
In some embodiments of the present application, each pixel in the third mask image corresponds to one pixel in the first image; in the third mask image, pixels corresponding to the foreground object image may have the value "1" and all other pixels the value "0".
In some embodiments of the present application, the second electronic device may determine, from the position of each feature point in the fourth feature region, the positions of the corresponding pixels in the first image, take the region formed by those pixels as the position of the foreground object image in the first image, and generate the third mask image from that position.
Step 402, the second electronic device generates a second mask image according to the first mask image and the third mask image.
In an embodiment of the present application, the second mask image indicates a position of the foreground image in the first image.
In some embodiments of the present application, the second electronic device may superimpose the first mask image and the third mask image to generate the second mask image.
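Continuing the sketches, combining the masks could look like this (the nearest-neighbour upsampling of the foreground-object region is an assumption):

```python
import torch
import torch.nn.functional as F

# Upsample the foreground-object region back to image resolution to obtain the
# third mask, then take the pixelwise union with the person mask (step 402).
img_h, img_w = first_image.shape[-2:]
third_mask = F.interpolate(fourth_region, size=(img_h, img_w), mode="nearest")
second_mask = torch.clamp(first_mask.unsqueeze(1) + third_mask, max=1.0)  # 1x1xHxW
```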
Step 403, the second electronic device sends the second mask image to the first electronic device.
In the embodiment of the application, the second mask image is used for fusing the foreground image and the second background image by the first electronic device to obtain the second image.
Step 404, the first electronic device receives a second mask image from a second electronic device.
In an embodiment of the present application, the second mask image indicates a position of the foreground image in the first image, and the second mask image is determined by the second electronic device based on the first mask image.
And 106a, the first electronic device fuses the foreground image and the second background image based on the second mask image to obtain a second image.
In some embodiments of the present application, the first electronic device may first multiply the second mask image with the first image to obtain a third image, where the third image includes only the foreground image and has the same size as the first image, so that the first electronic device may directly superimpose the third image on the second background image to generate the second image.
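In code, continuing the sketches, that fusion is a single masked composite:

```python
# Cut the foreground out of the first image with the second mask (the third
# image), then composite it over the regenerated background (step 106a).
third_image = first_image * second_mask
second_image = third_image + second_background * (1.0 - second_mask)
```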
It can be seen that the second electronic device can also determine the third mask image indicating the position of the foreground object image in the first image and generate, from the third and first mask images, the second mask image indicating the position of the whole foreground image. After the first electronic device fuses the foreground image and the second background image based on the second mask image, the position of the foreground image in the second image matches its position in the first image, so composition changes caused by mismatched foreground positions are avoided, the user does not need to adjust the second image again, and the efficiency with which the first electronic device generates the image the user wants further improves.
In the image generation method provided by embodiments of the present application, the execution subject may be an image generation apparatus. The embodiments below describe the apparatus using the example of the image generation apparatus executing the image generation method.
Fig. 5 shows a schematic diagram of one possible configuration of an image generating apparatus, which is a first image generating apparatus, according to an embodiment of the present application. As shown in fig. 5, the first image generating apparatus 50 provided by the embodiment of the present application may include: a transmitting module 51, a receiving module 52 and a processing module 53.
The sending module 51 is configured to send a first feature map corresponding to the first image to the second image generating device; the first image includes: a foreground image and a first background image; the first feature map is used by the second image generating means to generate a second feature map. A receiving module 52 for receiving the second feature map from the second image generating device. A processing module 53, configured to generate a second background image according to the second feature map received by the receiving module 52; the second background image and the first background image have different image contents; and generating a second image based on the foreground image and the second background image.
Embodiments of the present application provide an image generation apparatus. The first image generation apparatus can send the first feature map to a second image generation apparatus with greater computing power, so that after the second apparatus inputs the first feature map into the first model, the model needs only a short computation time to process it and generate the second feature map. The first apparatus can then quickly generate, from the second feature map, a second background image whose content differs from that of the first background image, without computing it itself; this reduces the time spent generating the second background image and lets the first apparatus quickly generate, from the foreground image and the second background image, the second image the user wants. Moreover, because the first and second feature maps, not the first image and the second background image, are what is transmitted between the two apparatuses, neither image can leak while the second image is being generated. In this way, the efficiency with which the first image generation apparatus generates the image the user wants can be improved while avoiding leaks of the user's privacy.
In one possible implementation, the foreground image includes a person image. The processing module 53 is further configured to determine, based on the first image, a first mask image before the receiving module 52 receives the second feature map from the second image generating device; the first mask image indicates a position of the person image in the first image. The sending module 51 is further configured to send the first mask image determined by the processing module 53 to the second image generating device; the first mask image is used for determining a first characteristic region corresponding to the foreground image in the first characteristic map by the second image generating device. The second feature map is generated by the second image generating device according to a second feature area corresponding to the first background image in the first feature map, and the second feature area is determined by the second image generating device according to the first feature area.
In a possible implementation manner, the receiving module 52 is further configured to receive the second mask image from the second image generating device after the sending module 51 sends the first mask image to the second image generating device; the second mask image indicates a position of the foreground image in the first image, the second mask image being determined by the second image generating means based on the first mask image. The processing module 53 is specifically configured to fuse the foreground image and the second background image based on the second mask image received by the receiving module, so as to obtain a second image.
In a possible implementation manner, the processing module 53 is further configured to input the first image into the encoder of the first image generating device 50 before the sending module 51 sends the first feature map corresponding to the first image to the second image generating device, so as to obtain a first feature map output by the encoder; the first feature map is obtained by encoding the first image by an encoder.
In a possible implementation manner, the processing module 53 is specifically configured to input the second feature map into a decoder of the first image generating device 50, so as to obtain a second background image output by the decoder; the second background image is obtained by decoding the second feature map by a decoder.
The image generation apparatus in the embodiments of the present application may be an electronic device or a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. For example, it may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), an automated teller machine, a self-service machine, or the like.
The image generating apparatus in the embodiment of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, and the embodiment of the present application is not limited specifically.
The image generating device provided by the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 1 to fig. 4, and in order to avoid repetition, a detailed description is omitted here.
Fig. 6 is a schematic diagram of a possible structure of an image generating device (here, the second image generating device) according to an embodiment of the present application. As shown in fig. 6, the second image generating device 60 provided in the embodiment of the present application may include: a receiving module 61, a processing module 62, and a sending module 63.
The receiving module 61 is configured to receive a first feature map corresponding to a first image from the first image generating device; the first image includes: a foreground image and a first background image. The processing module 62 is configured to input the first feature map received by the receiving module 61 into the first model, to obtain a second feature map output by the first model; the second feature map is generated by processing the first feature map with the first model. The sending module 63 is configured to send the second feature map generated by the processing module 62 to the first image generating device; the second feature map is used for the first image generating device to generate a second background image, and to generate a second image based on the foreground image and the second background image, where the second background image and the first background image are different in image content.
The embodiment of the present application provides an image generating device. The first image generating device can send the first feature map to the second image generating device, which has greater computing power, so that after the second image generating device inputs the first feature map to the first model, the first model can process the first feature map in a short time to generate the second feature map. The first image generating device can therefore quickly generate, according to the second feature map, a second background image whose image content differs from that of the first background image, without computing the second background image itself; the time consumed in generating the second background image is thus reduced, and the first image generating device can quickly generate the second image required by the user based on the foreground image and the second background image. Moreover, since what is transmitted between the first image generating device and the second image generating device is the first feature map and the second feature map, rather than the first image and the second background image, leakage of the first image or the second background image in the process of generating the second image required by the user can be avoided. In this way, the efficiency with which the first image generating device generates the image required by the user can be improved while avoiding disclosure of the user's privacy.
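For illustration, the second image generating device's side of this exchange can be sketched as below, with a shape-preserving network standing in for the first model (whose architecture the embodiment leaves open):

```python
import torch
import torch.nn as nn

# Stand-in for the "first model"; any network that maps the first
# feature map to a second feature map of compatible shape would do.
first_model = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1),
)

def handle_feature_map(first_feature_map: torch.Tensor) -> torch.Tensor:
    """Second-device handler: a feature map comes in, a feature map goes
    out; the raw first image never crosses the link."""
    with torch.no_grad():
        return first_model(first_feature_map)
```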
In one possible implementation, the foreground image includes a person image. The receiving module 61 is further configured to receive a first mask image from the first image generating device before the processing module 62 inputs the first feature map to the first model to obtain the second feature map output by the first model; the first mask image indicates a position of the person image in the first image. The processing module 62 is further configured to determine, based on the first mask image received by the receiving module 61, a first feature region corresponding to the foreground image in the first feature map. The processing module 62 is specifically configured to acquire, from the first feature map, a second feature region other than the first feature region, where the second feature region is a feature region corresponding to the first background image in the first feature map, and to input the second feature region to the first model to obtain the second feature map output by the first model.
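One plausible reading of this region selection is to resample the pixel-space mask to the feature grid and mask the feature map; the nearest-neighbour resampling and the multiplicative masking are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def split_feature_regions(first_feature_map: torch.Tensor,
                          first_mask: torch.Tensor):
    """first_feature_map: 1xCxhxw; first_mask: 1x1xHxW with 1 = person.

    Returns the first feature region (foreground) and the second
    feature region (background) as masked copies of the feature map.
    """
    h, w = first_feature_map.shape[-2:]
    mask_small = F.interpolate(first_mask.float(), size=(h, w), mode="nearest")
    first_region = first_feature_map * mask_small            # foreground region
    second_region = first_feature_map * (1.0 - mask_small)   # background region
    return first_region, second_region
```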
In one possible implementation, the foreground image further includes a foreground object image. The processing module 62 is specifically configured to determine, from the first feature map according to the first mask image, a third feature region corresponding to the person image; and to determine the third feature region, together with a fourth feature region corresponding to the foreground object image in the first feature map, as the first feature region.
In one possible implementation, the foreground image further includes a foreground object image. The processing module 62 is further configured to, after the receiving module 61 receives the first mask image from the first image generating device, determine a third mask image according to a fourth feature region corresponding to the foreground object image in the first feature map, where the third mask image indicates a position of the foreground object image in the first image; and to generate a second mask image according to the first mask image and the third mask image, where the second mask image indicates the position of the foreground image in the first image. The sending module 63 is further configured to send the second mask image generated by the processing module 62 to the first image generating device; the second mask image is used for the first image generating device to fuse the foreground image and the second background image to obtain the second image.
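Reading "generating a second mask image according to the first mask image and the third mask image" as a per-pixel union of the person mask and the foreground-object mask, one illustrative sketch is:

```python
import numpy as np

def build_second_mask(first_mask: np.ndarray, third_mask: np.ndarray) -> np.ndarray:
    """Union of the person mask and the foreground-object mask; 1 marks
    any foreground pixel of the first image."""
    return np.logical_or(first_mask > 0, third_mask > 0).astype(np.uint8)
```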
The image generating device in the embodiment of the present application may be an electronic device or a component in the electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or a device other than a terminal. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA); it may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), an automated teller machine, a self-service machine, or the like.
The image generating device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiment of the present application.
The image generating device provided in the embodiment of the present application can implement each process of the method embodiments of fig. 1 to fig. 4; to avoid repetition, details are not described here again.
As shown in fig. 7, an embodiment of the present application further provides an electronic device 70, including a processor 71 and a memory 72, where the memory 72 stores a program or instructions executable on the processor 71. When executed by the processor 71, the program or instructions implement each process of the image generation method embodiment above and can achieve the same technical effect; to avoid repetition, details are not described here again.
The electronic device in the embodiment of the present application includes mobile electronic devices and non-mobile electronic devices.
Fig. 8 is a schematic diagram of the hardware structure of an electronic device (here, the first electronic device) implementing an embodiment of the present application.
The electronic device 100 includes, but is not limited to: radio frequency unit 101, network module 102, audio output unit 103, input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, and processor 110.
Those skilled in the art will appreciate that the electronic device 100 may further include a power source (e.g., a battery) for supplying power to the various components; the power source may be logically coupled to the processor 110 via a power management system, which then performs functions such as managing charging, discharging, and power consumption. The electronic device structure shown in fig. 8 does not constitute a limitation on the electronic device; the electronic device may include more or fewer components than shown, combine certain components, or arrange the components differently, and details are not described here again.
The radio frequency unit 101 is configured to: send a first feature map corresponding to the first image to the second electronic device, where the first image includes a foreground image and a first background image, and the first feature map is used for the second electronic device to generate a second feature map; and receive the second feature map from the second electronic device, so that a second background image is generated according to the second feature map, where the second background image and the first background image have different image contents.
The processor 110 is configured to generate a second image based on the foreground image and the second background image.
The embodiment of the present application provides an electronic device. The first electronic device can send the first feature map to the second electronic device, which has greater computing power, so that after the second electronic device inputs the first feature map to the first model, the first model can process the first feature map in a short time to generate the second feature map. The first electronic device can therefore quickly generate, according to the second feature map, a second background image whose image content differs from that of the first background image, without computing the second background image itself; the time consumed in generating the second background image is thus reduced, and the first electronic device can quickly generate the second image required by the user based on the foreground image and the second background image. Moreover, since what is transmitted between the first electronic device and the second electronic device is the first feature map and the second feature map, rather than the first image and the second background image, leakage of the first image or the second background image in the process of generating the second image required by the user can be avoided. In this way, the efficiency with which the first electronic device generates the image required by the user can be improved while avoiding disclosure of the user's privacy.
In some embodiments of the present application, the foreground image includes a person image.
The processor 110 is further configured to determine a first mask image from the first image; the first mask image indicates a position of the person image in the first image.
The radio frequency unit 101 is further configured to send the first mask image to the second electronic device; the first mask image is used for the second electronic device to determine a first feature region corresponding to the foreground image in the first feature map.
The second feature map is generated by the second electronic device according to a second feature region corresponding to the first background image in the first feature map, and the second feature region is determined by the second electronic device according to the first feature region.
In some embodiments of the application, the radio frequency unit 101 is further configured to receive a second mask image from a second electronic device; the second mask image indicates a position of the foreground image in the first image, the second mask image being determined by the second electronic device based on the first mask image.
The processor 110 is specifically configured to fuse the foreground image and the second background image based on the second mask image to obtain a second image.
In some embodiments of the present application, the processor 110 is further configured to input the first image into an encoder of the first electronic device, to obtain a first feature map output by the encoder; the first feature map is obtained by encoding the first image by an encoder.
In some embodiments of the present application, the processor 110 is specifically configured to input the second feature map into a decoder of the first electronic device, so as to obtain a second background image output by the decoder; the second background image is obtained by decoding the second feature map by a decoder.
It should be understood that, in the embodiment of the present application, the input unit 104 may include a graphics processing unit (GPU) 1041 and a microphone 1042; the graphics processing unit 1041 processes image data of still pictures or videos obtained by an image capture device (such as a camera) in video capture mode or image capture mode. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display, an organic light-emitting diode, or the like. The user input unit 107 includes at least one of a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may include two parts: a touch detection device and a touch controller. The other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys and switch keys), a trackball, a mouse, and a joystick, which are not described in detail here.
The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a first storage area storing programs or instructions and a second storage area storing data, where the first storage area may store an operating system, and application programs or instructions (such as a sound playing function and an image playing function) required for at least one function. Further, the memory 109 may include volatile memory or nonvolatile memory, or both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchronous link DRAM (SLDRAM), or a direct Rambus RAM (DRRAM). The memory 109 in the embodiment of the present application includes, but is not limited to, these and any other suitable types of memory.
Processor 110 may include one or more processing units; optionally, the processor 110 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
Fig. 9 is a schematic diagram of the hardware structure of a server implementing an embodiment of the present application.
The server can implement the details of the image generation method corresponding to the second electronic device in the embodiments shown in fig. 1 to fig. 4 and achieve the same effect. As shown in fig. 9, the server 80 includes: a processor 81, a memory 82, and a computer program stored on the memory 82 and executable on the processor 81; the components of the server 80 are coupled together by a bus system 83, which enables communication among them. When executed by the processor 81, the computer program implements the steps of the image generation method in the above embodiment and achieves the same technical effect; to avoid repetition, details are not described here again.
The embodiment of the present application further provides a readable storage medium storing a program or instructions. When executed by a processor, the program or instructions implement each process of the above image generation method embodiment and can achieve the same technical effect; to avoid repetition, details are not described here again.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present application further provides a chip, including a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run programs or instructions to implement each process of the above image generation method embodiment and achieve the same technical effect; to avoid repetition, details are not described here again.
It should be understood that the chip referred to in the embodiment of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip.
An embodiment of the present application provides a computer program product stored in a storage medium. The program product is executed by at least one processor to implement each process of the above image generation method embodiment and achieve the same technical effect; to avoid repetition, details are not described here again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing the functions in the order shown or discussed; depending on the functions involved, the functions may also be performed in a substantially simultaneous manner or in a reverse order. For example, the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods in the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a computer software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk), including several instructions for causing a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above specific embodiments, which are merely illustrative rather than restrictive. Under the teaching of the present application, those of ordinary skill in the art may derive many other forms without departing from the spirit of the present application and the scope of the claims, all of which fall within the protection of the present application.

Claims (13)

1. An image generation method applied to a first electronic device, the method comprising:
transmitting a first feature map corresponding to a first image to a second electronic device; the first image includes: a foreground image and a first background image; the first feature map is used for the second electronic device to generate a second feature map;
receiving the second feature map from the second electronic device, and generating a second background image according to the second feature map; the second background image and the first background image have different image contents;
generating a second image based on the foreground image and the second background image.
2. The method of claim 1, wherein the foreground image comprises a person image;
before the receiving the second feature map from the second electronic device, the method further includes:
determining a first mask image from the first image; the first mask image indicates a position of the person image in the first image;
transmitting the first mask image to the second electronic device; the first mask image is used for the second electronic device to determine a first feature region corresponding to the foreground image in the first feature map;
the second feature map is generated by the second electronic device according to a second feature region corresponding to the first background image in the first feature map, and the second feature region is a feature region in the first feature map other than the first feature region.
3. The method of claim 2, wherein after the transmitting the first mask image to the second electronic device, the method further comprises:
receiving a second mask image from the second electronic device; the second mask image indicating a position of the foreground image in the first image, the second mask image being determined by the second electronic device based on the first mask image;
the generating a second image based on the foreground image and the second background image includes:
and fusing the foreground image and the second background image based on the second mask image to obtain the second image.
4. The method of claim 1, wherein before the sending the first feature map corresponding to the first image to the second electronic device, the method further comprises:
inputting the first image into an encoder to obtain the first feature map output by the encoder; the first feature map is obtained by encoding the first image with the encoder.
5. The method of claim 1, wherein the generating a second background image from the second feature map comprises:
inputting the second feature map into a decoder of the first electronic device to obtain the second background image output by the decoder; the second background image is obtained by decoding the second feature map with the decoder.
6. An image generation method applied to a second electronic device, the method comprising:
receiving a first feature map corresponding to a first image from a first electronic device; the first image includes: a foreground image and a first background image;
inputting the first feature map to a first model to obtain a second feature map output by the first model; the second feature map is generated by processing the first feature map with the first model;
transmitting the second feature map to the first electronic device; the second feature map is used for generating a second background image by the first electronic device, and generating a second image based on the foreground image and the second background image; the second background image and the first background image have different image contents.
7. The method of claim 6, wherein the foreground image comprises a person image;
before the first feature map is input to a first model to obtain a second feature map output by the first model, the method further includes:
receiving a first mask image from the first electronic device; the first mask image indicates a position of the person image in the first image;
determining a first feature area corresponding to the foreground image in the first feature map based on the first mask image;
the step of inputting the first feature map to a first model to obtain a second feature map output by the first model comprises the following steps:
Acquiring a second characteristic region except the first characteristic region from the first characteristic map, wherein the second characteristic region is a characteristic region corresponding to the first background image in the first characteristic map;
and inputting the second characteristic region into the first model to obtain the second characteristic diagram output by the first model.
8. The method of claim 7, wherein the foreground image further comprises a foreground object image;
the determining, based on the first mask image, a first feature region corresponding to the foreground image in the first feature map includes:
determining, from the first feature map according to the first mask image, a third feature region corresponding to the person image in the first feature map;
and determining the third feature region and a fourth feature region corresponding to the foreground object image in the first feature map as the first feature region.
9. The method of claim 7, wherein the foreground image further comprises a foreground object image;
after the receiving a first mask image from the first electronic device, the method further comprises:
determining a third mask image according to a fourth feature region corresponding to the foreground object image in the first feature map; the third mask image indicates a position of the foreground object image in the first image;
generating a second mask image from the first mask image and the third mask image; the second mask image indicates a position of the foreground image in the first image;
transmitting the second mask image to the first electronic device; the second mask image is used for fusing the foreground image and the second background image by the first electronic device to obtain the second image.
10. An image generating device, the image generating device being a first image generating device, characterized in that the first image generating device comprises: a sending module, a receiving module, and a processing module;
the sending module is used for sending a first feature map corresponding to a first image to a second image generating device; the first image includes: a foreground image and a first background image; the first feature map is used for the second image generating device to generate a second feature map;
the receiving module is used for receiving the second feature map from the second image generating device;
the processing module is used for generating a second background image according to the second feature map received by the receiving module; the second background image and the first background image have different image contents; and generating a second image based on the foreground image and the second background image.
11. An image generating device, the image generating device being a second image generating device, characterized in that the second image generating device comprises: a receiving module, a processing module, and a sending module;
the receiving module is used for receiving a first feature map corresponding to a first image from a first image generating device; the first image includes: a foreground image and a first background image;
the processing module is used for inputting the first feature map received by the receiving module into a first model to obtain a second feature map output by the first model; the second feature map is generated by processing the first feature map by the first model;
the sending module is used for sending the second feature map generated by the processing module to the first image generating device; the second feature map is used for generating a second background image by the first image generating device and generating a second image based on the foreground image and the second background image; the second background image and the first background image have different image contents.
12. An electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the image generation method of any one of claims 1 to 9.
13. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the image generation method according to any of claims 1 to 9.
CN202311079293.2A 2023-08-25 2023-08-25 Image generation method, device, electronic equipment and readable storage medium Pending CN117115287A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311079293.2A CN117115287A (en) 2023-08-25 2023-08-25 Image generation method, device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311079293.2A CN117115287A (en) 2023-08-25 2023-08-25 Image generation method, device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN117115287A true CN117115287A (en) 2023-11-24

Family

ID=88801613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311079293.2A Pending CN117115287A (en) 2023-08-25 2023-08-25 Image generation method, device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117115287A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117440172A (en) * 2023-12-20 2024-01-23 江苏金融租赁股份有限公司 Picture compression method and device
CN117440172B (en) * 2023-12-20 2024-03-19 江苏金融租赁股份有限公司 Picture compression method and device

Similar Documents

Publication Publication Date Title
CN109558008B (en) Control method, control device, computer equipment and storage medium
WO2014187108A1 (en) Server, client and video processing method
CN112532882B (en) Image display method and device
CN113015007B (en) Video frame inserting method and device and electronic equipment
CN117115287A (en) Image generation method, device, electronic equipment and readable storage medium
CN112291475B (en) Photographing method and device and electronic equipment
CN113721876A (en) Screen projection processing method and related equipment
CN110431838B (en) Method and system for providing dynamic content of face recognition camera
CN113596574A (en) Video processing method, video processing apparatus, electronic device, and readable storage medium
CN112738398B (en) Image anti-shake method and device and electronic equipment
CN113271494B (en) Video frame processing method and device and electronic equipment
CN113810624A (en) Video generation method and device and electronic equipment
CN112418942A (en) Advertisement display method and device and electronic equipment
CN117115056A (en) Image generation method, model training method, device, electronic equipment and medium
CN115134473B (en) Image encryption method and device
CN111984173B (en) Expression package generation method and device
CN112184535B (en) Image anti-counterfeiting method, device and equipment
CN117176935A (en) Image processing method, device, electronic equipment and readable storage medium
CN107168662B (en) Information processing method and electronic equipment
CN116308987A (en) Image processing chip, image processing method and electronic equipment
CN117749753A (en) Information sending method and device and electronic equipment
CN117274097A (en) Image processing method, device, electronic equipment and medium
CN115514859A (en) Image processing circuit, image processing method and electronic device
CN116419044A (en) Media data processing unit, method and electronic equipment
CN115967854A (en) Photographing method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination