CN114429420A - Image generation method and device, readable medium and electronic equipment

Info

Publication number
CN114429420A
Authority
CN
China
Prior art keywords
image
conversion
images
content
style
Prior art date
Legal status
Granted
Application number
CN202210112948.0A
Other languages
Chinese (zh)
Other versions
CN114429420B (en)
Inventor
牟永强
庞昊洲
闫耘
Current Assignee
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN202210112948.0A
Publication of CN114429420A
Application granted
Publication of CN114429420B
Status: Active

Classifications

    • G06T3/04
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/40Image enhancement or restoration by the use of histogram techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image generation method and device, a readable medium, and an electronic device in the field of image processing technology. The method comprises: generating a converted image set from an acquired content image and a plurality of preset style images, the converted image set containing one converted image per style image, each converted image conforming to the style of its corresponding style image while having the same content as the content image; determining the similarity between each converted image and the content image from the histogram of the content image and the histogram of each converted image in the converted image set; and taking the converted images whose similarity satisfies a preset condition as target images, thereby obtaining a target image set containing at least one target image. By generating a large number of converted images from the content image and the style images and then screening out the target images that satisfy the preset condition, the disclosure improves both the efficiency and the accuracy of image generation.

Description

Image generation method and device, readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and an apparatus for generating an image, a readable medium, and an electronic device.
Background
With the continuous development of electronic information technology, a wide variety of applications have appeared on the market to meet users' diverse needs. Among them, game applications in particular require artists to produce large numbers of material images with drawing tools in order to present users with attractive, high-quality visuals. Producing these material images is time-consuming and places high demands on the artists' technical and aesthetic skills. Production therefore typically consumes considerable manpower and resources, material images are obtained slowly, and their quality is unstable.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, the present disclosure provides a method for generating an image, the method comprising:
generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images;
determining the similarity of each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set;
and taking the converted images with the similarity meeting preset conditions in the converted image set as target images to obtain a target image set comprising at least one target image.
In a second aspect, the present disclosure provides an apparatus for generating an image, the apparatus comprising:
the conversion module is used for generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images;
a determining module, configured to determine a similarity between each of the converted images and the content image according to the histogram of the content image and the histogram of each of the converted images in the converted image set;
and the screening module is used for taking the converted images with the similarity meeting the preset conditions in the converted image set as target images so as to obtain a target image set comprising at least one target image.
In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which, when executed by a processing apparatus, performs the steps of the method of the first aspect of the present disclosure.
In a fourth aspect, the present disclosure provides an electronic device comprising:
a storage device having a computer program stored thereon;
processing means for executing the computer program in the storage means to implement the steps of the method of the first aspect of the present disclosure.
According to the technical scheme, the method comprises the steps of firstly generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images are in accordance with the styles of the corresponding style images and have the same content as the content images. And then determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set, and taking the converted image with the similarity meeting the preset condition as a target image to obtain the target image set. According to the method and the device, a large number of conversion images are generated according to the content images and the style images, the target images meeting the preset conditions are screened out, and the image generation efficiency and accuracy can be improved.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale. In the drawings:
FIG. 1 is a flow chart illustrating a method of generating an image according to an exemplary embodiment;
FIG. 2 is a flow chart illustrating another method of generating an image in accordance with an exemplary embodiment;
FIG. 3 is a flow chart illustrating another method of generating an image in accordance with an exemplary embodiment;
FIG. 4 is a flow chart illustrating another method of generating an image in accordance with an exemplary embodiment;
FIG. 5 is a flow chart illustrating another method of generating an image in accordance with an exemplary embodiment;
FIG. 6 is a flow chart illustrating another method of generating an image in accordance with an exemplary embodiment;
FIG. 7 is a schematic diagram illustrating the structure of a transformation model in accordance with an exemplary embodiment;
FIG. 8 is a flow diagram illustrating training a recognition model in accordance with an exemplary embodiment;
FIG. 9 is a block diagram illustrating an apparatus for generating an image in accordance with an exemplary embodiment;
FIG. 10 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment;
FIG. 11 is a block diagram illustrating another apparatus for generating an image in accordance with an exemplary embodiment;
FIG. 12 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment;
FIG. 13 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment;
FIG. 14 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
FIG. 1 is a flow chart illustrating a method of generating an image according to an exemplary embodiment. As shown in FIG. 1, the method includes the following steps:
step 101, generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion image conforms to the style of the corresponding style image and has the same content as the content image.
For example, a content image and a style image set may be prepared in advance. The content image may be an aesthetically pleasing image produced by an artist, and the number of content images is small (for example, 5). The style image set contains a large number of style images (for example, 1000), which may be any images: images produced by artists, real photographs captured by an image acquisition device, or images collected from a network. A converted image set can then be generated from the content image and the style image set. The converted image set contains a plurality of converted images, each corresponding to one style image; that is, each converted image conforms to the style of its corresponding style image while its content is the same as that of the content image. A converted image may be obtained by combining the content of the content image with the style of the corresponding style image, or by performing style transfer on the content image according to the style of the corresponding style image. Specifically, a style conversion model may be trained in advance; the content image and a style image then form an image pair that is fed into the style conversion model, which outputs the converted image corresponding to that style image. Repeating this process until every style image has a corresponding converted image yields the converted image set.
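As a minimal illustration of this loop, the sketch below pairs the single content image with every style image. The `convert` callable stands in for the pre-trained style conversion model; its name and signature are assumptions for illustration, not part of the disclosure (the model itself is described later as an encoding layer, a conversion layer, and a decoding layer).

```python
def build_converted_set(content_image, style_images, convert):
    """Return one converted image per style image: each output keeps
    the content of `content_image` in the style of one style image."""
    converted_set = []
    for style_image in style_images:
        # Each (content, style) image pair yields one converted image.
        converted_set.append(convert(content_image, style_image))
    return converted_set
```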
Step 102, determining the similarity between each conversion image and the content image according to the histogram of the content image and the histogram of each conversion image in the conversion image set.
For example, an aesthetically pleasing image generally exhibits uniform color tone, clear gradation, and distinct light and shade. To judge whether the converted images in the converted image set are aesthetically pleasing, a content image created by an artist can therefore serve as the standard against which each converted image is compared in turn. Specifically, the histogram of the content image and the histogram of each converted image may be acquired separately. The histogram may be a gray-level histogram or a color histogram in any color space, such as the RGB (Red Green Blue), HSV (Hue Saturation Value), or CMYK (Cyan Magenta Yellow Key) color space; the present disclosure places no particular limit on this choice. A color histogram is a color feature of an image that describes the proportions of different colors in the image; it may be obtained, for example, with the calcHist function provided by OpenCV. A gray-level histogram, by contrast, is a grayscale feature of an image that describes the distribution of its gray levels.
The similarity between each converted image and the content image can then be determined from the histogram of the content image and the histogram of each converted image. The similarity characterizes how closely the trend of a converted image's histogram follows that of the content image's histogram; concretely, it can be understood as a correlation coefficient, or covariance distance, between the histogram of the converted image and the histogram of the content image.
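The per-channel histograms can be obtained with OpenCV exactly as the text suggests. The sketch below uses `cv2.calcHist` with 256 bins per channel; the bin count and the B, G, R channel order are OpenCV conventions rather than requirements of the disclosure.

```python
import cv2

def channel_histograms(image_bgr):
    """256-bin histogram for each channel of a BGR image, using the
    calcHist function mentioned above."""
    return [
        cv2.calcHist([image_bgr], [ch], None, [256], [0, 256]).ravel()
        for ch in range(3)  # OpenCV stores color images as B, G, R
    ]
```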
Step 103, taking the converted images in the converted image set whose similarity satisfies a preset condition as target images, to obtain a target image set comprising at least one target image.
For example, the converted images satisfying the preset condition can be screened out of the converted image set according to each converted image's similarity to the content image. The preset condition may be that the similarity is greater than or equal to a preset similarity threshold, where the threshold can be set according to actual requirements. Alternatively, the converted images in the converted image set may be arranged in descending order of similarity and a specified number (for example, 50) of the top-ranked images retained. The preset condition may also be that the similarity falls within a preset similarity range, likewise set according to actual requirements. Each converted image meeting the preset condition is then taken as a target image and placed into the target image set, yielding a target image set that contains at least one target image.
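A sketch of the first two preset conditions (similarity threshold, or top-N by descending similarity) might look as follows; the function and parameter names are illustrative.

```python
def select_targets(converted_images, similarities, threshold=None, top_n=None):
    """Filter converted images into the target image set by either a
    similarity threshold or a top-N ranking."""
    pairs = list(zip(converted_images, similarities))
    if threshold is not None:
        return [img for img, sim in pairs if sim >= threshold]
    # Otherwise: descending order of similarity, keep the top-ranked images.
    pairs.sort(key=lambda p: p[1], reverse=True)
    return [img for img, _ in pairs[:top_n]]
```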
In summary, the present disclosure first generates a converted image set from the acquired content image and a plurality of preset style images, where the converted image set contains one converted image per style image, each converted image conforming to the style of its corresponding style image while having the same content as the content image. The similarity between each converted image and the content image is then determined from the histogram of the content image and the histogram of each converted image in the converted image set, and the converted images whose similarity satisfies the preset condition are taken as target images, yielding the target image set. By generating a large number of converted images from the content image and the style images and then screening out the target images that satisfy the preset condition, the disclosure improves both the efficiency and the accuracy of image generation.
Fig. 2 is a flow chart illustrating another method of generating an image according to an exemplary embodiment. As shown in Fig. 2, after step 103 the method may further include:
and 104, determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target image with the conversion effect type of the specified type as a result image.
For example, each target image in the target image set may be fed to a pre-trained recognition model that classifies it, determining the conversion effect type to which it belongs. The recognition model can be understood as a classifier that predicts an image's conversion effect type. For example, the model may extract the target image's features and then compute a matching degree between those features and each of a plurality of pre-specified conversion effect types: the higher the matching degree with a type, the higher the probability that the target image belongs to it, and vice versa. The recognition model may assign the target image to the type with the highest matching degree, or to the types whose matching degrees satisfy a preset condition (for example, a preset number of the highest matching degrees in descending order). The recognition model may be, for example, a CNN (Convolutional Neural Network), an LSTM (Long Short-Term Memory) network, or an MLP (Multilayer Perceptron); the disclosure is not limited in this respect. Finally, the target images whose conversion effect type is the specified type are taken as result images; there may be one or several.
For example, the pre-specified conversion effect types may comprise several quality grades, in which case the recognition model can be understood as a multi-class network and the specified type may be any one of those grades, for instance the high quality type or, depending on the application, the low quality type. The pre-specified conversion effect types may also comprise just two types, a high quality type and a low quality type, in which case the recognition model can be understood as a binary classification network. In this way, on top of the target image set, the result images whose conversion effect type is the specified type can be screened out.
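The matching-degree rule described above can be illustrated with a toy scorer. A trained recognition model would produce the matching degrees itself; the cosine similarity and per-type prototype vectors below are purely illustrative stand-ins.

```python
import numpy as np

def classify_conversion_effect(feature, prototypes, type_names):
    """Toy version of the 'highest matching degree wins' rule: score the
    image feature against one reference vector per conversion effect
    type and return the best-matching type name."""
    sims = [
        float(feature @ p) / (np.linalg.norm(feature) * np.linalg.norm(p))
        for p in prototypes
    ]
    return type_names[int(np.argmax(sims))]
```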
Fig. 3 is a flow chart illustrating another image generation method according to an exemplary embodiment, and as shown in fig. 3, the implementation of step 102 may include:
step 1021, respectively determining a content image and histograms of each of the converted images on a plurality of color channels of a preset color space.
Step 1022, for each converted image, determining a covariance distance between the content image and the converted image on each color channel according to the histogram of the content image on each color channel and the histogram of the converted image on each color channel.
Step 1023, according to the covariance distance of the content image and the conversion image on each color channel, determining the similarity of the content image and the conversion image.
For example, to determine the similarity between each converted image and the content image, a color space is chosen in advance, which fixes the set of color channels. Histograms of the content image and of each converted image over those channels are then computed. Taking the RGB color space as an example, the channels are the red, green, and blue channels; the histograms of the content image on the red, green, and blue channels may be denoted $H_c^R$, $H_c^G$, and $H_c^B$, and the histograms of the converted image on the same channels $H_{cs}^R$, $H_{cs}^G$, and $H_{cs}^B$.
Then, for each converted image, the covariance distance between the content image and the converted image on each color channel can be determined from their histograms on that channel. Taking the red channel as an example, the covariance distance between the content image and the converted image on the red channel can be computed as

$$d^{R}=\frac{\operatorname{cov}\left(H_{c}^{R},H_{cs}^{R}\right)}{\sigma_{c}^{R}\,\sigma_{cs}^{R}}=\frac{\mathbb{E}\left[\left(H_{c}^{R}-\mu_{c}^{R}\right)\left(H_{cs}^{R}-\mu_{cs}^{R}\right)\right]}{\sigma_{c}^{R}\,\sigma_{cs}^{R}}$$

where $d^{R}$ is the covariance distance between the content image and the converted image on the red channel, $\mu_{c}^{R}$ and $\sigma_{c}^{R}$ are the mean and standard deviation of the content image's histogram on the red channel, and $\mu_{cs}^{R}$ and $\sigma_{cs}^{R}$ are the mean and standard deviation of the converted image's histogram on the red channel.
Finally, the similarity between the content image and the converted image can be determined from their covariance distances on all the color channels; specifically, the maximum covariance distance may be taken as the similarity. Taking the RGB color space as an example, step 1022 yields, for each converted image, the covariance distances $d^{R}$, $d^{G}$, and $d^{B}$ of the content image and the converted image on the red, green, and blue channels, and the similarity $Coe$ is then

$$Coe=\max\left(d^{R},\,d^{G},\,d^{B}\right)$$
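Translating the formulas above directly into code, a sketch of steps 1022-1023 (covariance distance per channel, then the maximum as the similarity $Coe$) could read:

```python
import numpy as np

def covariance_distance(hist_a, hist_b):
    """Covariance of two histograms normalised by the product of their
    standard deviations, as in the per-channel formula above."""
    a = np.asarray(hist_a, dtype=float)
    b = np.asarray(hist_b, dtype=float)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return cov / (a.std() * b.std())

def similarity_coe(content_hists, converted_hists):
    """Coe = maximum covariance distance over the color channels."""
    return max(
        covariance_distance(hc, ht)
        for hc, ht in zip(content_hists, converted_hists)
    )
```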
FIG. 4 is a flow chart illustrating another method of generating an image according to an exemplary embodiment. As shown in FIG. 4, step 103 may include:
and step 1031, regarding each conversion image, if the similarity between the conversion image and the content image is greater than a preset similarity threshold, taking the conversion image as a target image, and adding the target image to the target image set. Alternatively, the first and second electrodes may be,
and 1032, arranging each conversion image in the conversion image set in a descending order according to the corresponding similarity, taking the specified number of conversion images with the top arrangement order as target images, and adding the target images to the target image set.
For example, the method for screening out the target image from the converted image set may include two ways: one is to set a similarity threshold in advance, and then to take the converted image with the similarity greater than the preset similarity threshold as the target image and put the target image into the target image set. And the other method is that all the converted images are arranged in a descending order according to the corresponding similarity, then a specified number of converted images with the top arrangement order are taken as target images, and the target images are put into a target image set. The specified number may be, for example, 100.
FIG. 5 is a flow chart illustrating another method of generating an image according to an exemplary embodiment. As shown in FIG. 5, step 104 may be implemented by:
step 1041, inputting each target image into the recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: a high quality type and a low quality type.
In step 1042, if the conversion effect type of the target image is a high quality type, the target image is used as a result image.
For example, the pre-specified conversion effect types may include exactly two types, a high quality type and a low quality type, in which case the recognition model may be a binary classification network. Each target image is fed into the recognition model, whose output is one of the two types; for example, the high quality type may be represented by 1 and the low quality type by 0. If the conversion effect type of the target image is the high quality type, the target image is determined to be a result image, indicating that it meets the aesthetic requirement.
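Applying the binary recognition model as a filter over the target image set is then straightforward; `recognition_model` is assumed here to return 1 (high quality) or 0 (low quality) per image, matching the convention above.

```python
def filter_result_images(target_images, recognition_model):
    """Keep the target images the binary recognition model labels 1
    (high quality); 0 (low quality) images are dropped."""
    return [img for img in target_images if recognition_model(img) == 1]
```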
Fig. 6 is a flow chart illustrating another method of generating an image according to an exemplary embodiment. As shown in Fig. 6, step 101 may include:
step 1011, for each style image, inputting the content image and the style image into a coding layer in a conversion model trained in advance to obtain the content image feature corresponding to the content image and the style image feature corresponding to the style image.
Step 1012, inputting the content image characteristics and the style image characteristics into a conversion layer in the conversion model to obtain conversion image characteristics.
Step 1013, inputting the feature of the converted image into a decoding layer in the conversion model to obtain a converted image corresponding to the style image.
For example, the converted image set may be obtained with a pre-trained conversion model. The structure of the conversion model may be as shown in Fig. 7, comprising an encoding layer, a conversion layer, and a decoding layer. The encoding layer may be built from a pre-trained network such as VGG (Visual Geometry Group network) or ResNet, and encodes the input content image and style image to obtain the content image features corresponding to the content image and the style image features corresponding to the style image.
The conversion layer may be a WCT (Whitening and Coloring Transform) module that whitens and then color-transforms the content image features using the style image features output by the encoding layer, producing the converted image features. Finally, the decoding layer generates the converted image from the converted image features; the decoding layer can be the encoding layer reversed, i.e., the two layers mirror each other.
Specifically, the content image may be denoted $I_c$ and the style image $I_s$. After the encoding layer, the content image features are $f_c$ and the style image features $f_s$. The conversion layer whitens $f_c$ to obtain $\hat{f}_c$, then applies the coloring transform derived from the style image to obtain the converted image features $\hat{f}_{cs}$, and finally the decoding layer generates the converted image $I_{cs}$ from $\hat{f}_{cs}$. Further, when training the conversion model, the total loss may comprise two parts: a content loss (loss at the pixel level) and a semantic loss (loss in the style dimension). The total loss can be determined, for example, by

$$L = L_c + \lambda L_s$$
$$L_c = \left\lVert I_c - I_{cs} \right\rVert_2$$
$$L_s = \left\lVert \phi(I_c) - \phi(I_{cs}) \right\rVert_2$$

where $L$ is the total loss, $L_c$ the content loss, $L_s$ the semantic loss, and $\lambda$ a preset weight; $\phi(I_c)$ and $\phi(I_{cs})$ denote the semantic features of the content image and of the converted image, respectively.
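The whitening-and-coloring step inside the conversion layer can be sketched with the standard WCT formulation, an eigendecomposition of the feature covariance; the patent does not spell out the exact computation, so the numpy sketch below follows the common WCT recipe on encoder features reshaped to (C, H*W).

```python
import numpy as np

def wct(fc, fs, eps=1e-5):
    """Whitening-and-coloring transform sketch. `fc` and `fs` are
    content and style features of shape (C, H*W); the return value
    plays the role of the converted image features fed to the decoder."""
    # Whitening: strip the content features of their own covariance.
    mu_c = fc.mean(axis=1, keepdims=True)
    fc_centered = fc - mu_c
    wc, vc = np.linalg.eigh(fc_centered @ fc_centered.T / (fc.shape[1] - 1))
    wc = np.maximum(wc, 0.0) + eps  # guard against tiny negative eigenvalues
    whitened = vc @ np.diag(wc ** -0.5) @ vc.T @ fc_centered
    # Coloring: impose the style features' covariance and mean.
    mu_s = fs.mean(axis=1, keepdims=True)
    fs_centered = fs - mu_s
    ws, vs = np.linalg.eigh(fs_centered @ fs_centered.T / (fs.shape[1] - 1))
    ws = np.maximum(ws, 0.0) + eps
    return vs @ np.diag(ws ** 0.5) @ vs.T @ whitened + mu_s
```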
FIG. 8 is a flowchart illustrating training of the recognition model according to an exemplary embodiment. As shown in FIG. 8, the recognition model is trained through the following steps:
and step A, generating a sample conversion image set according to the sample image and the plurality of style images, wherein the sample conversion image set comprises a sample conversion image corresponding to each style image, and the sample conversion image conforms to the style of the corresponding style image and has the same content as the sample image.
And step B, taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
For example, training the recognition model first requires a sample input set. The sample input set contains a plurality of sample inputs, each of which may be one sample converted image from a sample converted image set; the sample output set contains one sample output per sample input, namely the actual effect type of the corresponding sample converted image, which may be, for example, the high quality type or the low quality type. The sample converted image set is obtained from a sample image and a plurality of style images, where the sample image may be an aesthetically pleasing image produced by an artist. For example, the sample image and a style image form an image pair fed to the style conversion model, whose output is the sample converted image corresponding to that style image. Repeating this process for every style image yields the sample converted image set: one sample converted image per style image, each conforming to the style of its corresponding style image while having the same content as the sample image.
When training the recognition model, the sample input set (i.e., the sample converted image set) serves as the model's input and the sample output set as its expected output, so that after training the model's output matches the sample output set whenever the sample input set is fed in. For example, a loss may be computed from the model's output and the sample output set, and a back-propagation algorithm used to adjust the model's neuron parameters, such as weights and biases, with the goal of reducing the loss. These steps are repeated until the loss satisfies a preset condition, for example falling below a preset loss threshold, at which point the recognition model is trained.
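A compact version of that loop, written against PyTorch as an assumed framework (the disclosure does not name one), could look like this; the binary labels follow the 1 = high quality, 0 = low quality convention used earlier, and all hyper-parameter values are illustrative.

```python
import torch
import torch.nn as nn

def train_recognition_model(model, sample_images, actual_types,
                            lr=1e-3, loss_threshold=0.05, max_epochs=100):
    """Train until the loss falls below a preset threshold, as the
    text describes."""
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(max_epochs):
        optimizer.zero_grad()
        logits = model(sample_images).squeeze(1)  # one quality logit per sample
        loss = criterion(logits, actual_types.float())
        loss.backward()   # back-propagation computes the gradients
        optimizer.step()  # adjust neuron weights and biases to reduce the loss
        if loss.item() < loss_threshold:
            break
    return model
```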
In summary, the present disclosure first generates a converted image set from the acquired content image and a plurality of preset style images, where the converted image set contains one converted image per style image, each converted image conforming to the style of its corresponding style image while having the same content as the content image. The similarity between each converted image and the content image is then determined from the histogram of the content image and the histogram of each converted image in the converted image set, and the converted images whose similarity satisfies the preset condition are taken as target images, yielding the target image set. By generating a large number of converted images from the content image and the style images and then screening out the target images that satisfy the preset condition, the disclosure improves both the efficiency and the accuracy of image generation.
Fig. 9 is a block diagram illustrating an image generating apparatus according to an exemplary embodiment, and as shown in fig. 9, the apparatus 200 includes:
the conversion module 201 is configured to generate a conversion image set according to the acquired content image and a plurality of preset genre images, where the conversion image set includes a conversion image corresponding to each genre image, and the conversion image conforms to the genre of the corresponding genre image and has the same content as the content image.
A determining module 202, configured to determine a similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set.
And the screening module 203 is configured to use the converted images in the converted image set, of which the similarity satisfies a preset condition, as target images to obtain a target image set including at least one target image.
Fig. 10 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment, and as shown in fig. 10, the apparatus 200 may further include:
and the identification module 204 is configured to determine a conversion effect type of each target image through a pre-trained identification model based on the target image set after the conversion images in the conversion image set with the similarity meeting the preset condition are used as the target images, and use the target images with the conversion effect types being the designated types as result images.
Fig. 11 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment, and as shown in fig. 11, the determining module 202 may include:
the histogram determining sub-module 2021 is configured to determine histograms of the content image and each of the converted images on a plurality of color channels of the color space in a preset color space, respectively.
The covariance distance determination sub-module 2022 is configured to determine, for each of the converted images, a covariance distance between the content image and the converted image in each of the color channels according to the histogram of the content image in each of the color channels and the histogram of the converted image in each of the color channels.
The similarity determining sub-module 2023 is configured to determine the similarity between the content image and the converted image according to the covariance distance between the content image and the converted image in each color channel.
In an application scenario, the similarity determination sub-module 2023 may be configured to:
and taking the maximum covariance distance between the content image and the conversion image as the similarity between the content image and the conversion image.
In one application scenario, the filtering module 203 may be configured to:
and regarding each conversion image, if the similarity between the conversion image and the content image is greater than a preset similarity threshold value, taking the conversion image as a target image, and adding the target image to a target image set. Alternatively, the first and second electrodes may be,
and arranging each conversion image in the conversion image set in a descending order according to the corresponding similarity, taking a specified number of conversion images with the top arrangement order as target images, and adding the target images to the target image set.
Fig. 12 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment, and as shown in fig. 12, the recognition module 204 may include:
the identifying submodule 2041 is configured to, for each target image, input the target image into the recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: a high quality type and a low quality type.
The processing sub-module 2042 is configured to take the target image as a result image if the conversion effect type of the target image is a high quality type.
Fig. 13 is a block diagram illustrating another image generation apparatus according to an exemplary embodiment, and as shown in fig. 13, the conversion module 201 may include:
the encoding sub-module 2011 is configured to, for each style image, input the content image and the style image into an encoding layer in a conversion model trained in advance, so as to obtain a content image feature corresponding to the content image and a style image feature corresponding to the style image.
The conversion sub-module 2012 is configured to input the content image features and the genre image features into a conversion layer in the conversion model to obtain converted image features.
And the decoding submodule 2013 is used for inputting the characteristics of the converted image into a decoding layer in the conversion model so as to obtain a converted image corresponding to the style image.
In one application scenario, the recognition model is trained as follows:
and step A, generating a sample conversion image set according to the sample image and the plurality of style images, wherein the sample conversion image set comprises a sample conversion image corresponding to each style image, and the sample conversion image conforms to the style of the corresponding style image and has the same content as the sample image.
And step B, taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In summary, the present disclosure first generates a converted image set from the acquired content image and a plurality of preset style images, where the converted image set contains one converted image per style image, each converted image conforming to the style of its corresponding style image while having the same content as the content image. The similarity between each converted image and the content image is then determined from the histogram of the content image and the histogram of each converted image in the converted image set, and the converted images whose similarity satisfies the preset condition are taken as target images, yielding the target image set. By generating a large number of converted images from the content image and the style images and then screening out the target images that satisfy the preset condition, the disclosure improves both the efficiency and the accuracy of image generation.
Referring now to fig. 14, a schematic structural diagram of an electronic device (e.g., may be an execution subject of the disclosed embodiments, and may be a terminal device or a server) 300 suitable for implementing the disclosed embodiments is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 14 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 14, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the electronic apparatus 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. While fig. 14 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 309, or installed from the storage means 308, or installed from the ROM 302. The computer program, when executed by the processing device 301, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the terminal devices and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images; determining the similarity of each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set; and taking the converted images with the similarity meeting preset conditions in the converted image set as target images to obtain a target image set comprising at least one target image.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a module does not in some cases constitute a limitation on the module itself, for example, a conversion module may also be described as a "module that generates a set of converted images".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Example 1 provides a method of generating an image, including: generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images; determining the similarity of each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set; and taking the converted images with the similarity meeting preset conditions in the converted image set as target images to obtain a target image set comprising at least one target image.
Example 2 provides the method of example 1, in accordance with one or more embodiments of the present disclosure, after the converting images of which the similarity satisfies a preset condition in the converting image set are taken as target images, the method further including: and determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target image with the conversion effect type of a specified type as a result image.
Example 3 provides the method of example 1, the determining a similarity of each of the converted images to the content image from the histogram of the content image and the histogram of each of the converted images in the set of converted images, including: respectively determining histograms of the content image and each conversion image on a plurality of color channels of a preset color space; for each conversion image, determining a covariance distance between the content image and the conversion image on each color channel according to the histogram of the content image on each color channel and the histogram of the conversion image on each color channel; and determining the similarity between the content image and the conversion image according to the covariance distance between the content image and the conversion image on each color channel.
Example 4 provides the method of example 3, the determining a similarity of the content image and the conversion image according to a covariance distance of the content image and the conversion image on each color channel, including: and taking the maximum covariance distance between the content image and the conversion image as the similarity of the content image and the conversion image.
Example 5 provides the method of example 1, wherein the taking, as target images, the converted images in the converted image set whose similarity satisfies the preset condition to obtain a target image set including at least one target image, includes: for each converted image, if the similarity between the converted image and the content image is greater than a preset similarity threshold, taking the converted image as the target image and adding the target image to the target image set; or, sorting the converted images in the converted image set in descending order of the corresponding similarity, taking a specified number of top-ranked converted images as the target images, and adding the target images to the target image set.
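A minimal sketch of the two alternative selection rules of example 5, assuming the similarities have already been computed; threshold and top_k are illustrative parameter names, not terms from the disclosure.

    def select_targets(converted_set, similarities, threshold=None, top_k=None):
        # Exactly one of `threshold` / `top_k` should be provided.
        if threshold is not None:
            # First branch: keep converted images above the similarity threshold.
            return [img for img, s in zip(converted_set, similarities) if s > threshold]
        # Second branch: sort by similarity in descending order and keep the top k.
        order = sorted(range(len(converted_set)),
                       key=lambda i: similarities[i], reverse=True)
        return [converted_set[i] for i in order[:top_k]]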
Example 6 provides the method of example 2, wherein determining, based on the target image set, a conversion effect type of each target image through a pre-trained recognition model, and taking the target image with the conversion effect type of a specified type as a result image, includes: for each target image, inputting the target image into the recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: a high quality type and a low quality type; and if the conversion effect type of the target image is a high-quality type, taking the target image as the result image.
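Example 6 can be read as a straightforward inference loop over the target image set. The sketch below assumes a binary classifier over image tensors and an arbitrary class index for the high-quality type; neither detail is fixed by the disclosure.

    import torch

    def filter_by_effect_type(recognition_model, target_images, high_quality_index=1):
        # Keep the target images the recognition model labels as high quality.
        # `high_quality_index` (the class index of the high-quality type) is an
        # assumption; each image is a 3xHxW float tensor.
        recognition_model.eval()
        result_images = []
        with torch.no_grad():
            for img in target_images:
                logits = recognition_model(img.unsqueeze(0))  # add batch dimension
                if logits.argmax(dim=1).item() == high_quality_index:
                    result_images.append(img)
        return result_images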
Example 7 provides the method of example 1, wherein the generating a converted image set according to the acquired content image and the preset plurality of style images includes: for each style image, inputting the content image and the style image into an encoding layer of a pre-trained conversion model, so as to obtain content image features corresponding to the content image and style image features corresponding to the style image; inputting the content image features and the style image features into a conversion layer of the conversion model to obtain converted image features; and inputting the converted image features into a decoding layer of the conversion model to obtain the converted image corresponding to the style image.
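The architecture of the conversion model is not disclosed beyond its three layers. The toy model below is one plausible instantiation for illustration only: the layer sizes are arbitrary, and AdaIN-style moment matching is chosen as the conversion layer as an assumption, not as the patent's specification.

    import torch
    import torch.nn as nn

    class ConversionModel(nn.Module):
        # Toy stand-in for the encoding/conversion/decoding structure of example 7.

        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
            )

        def convert(self, content_feat, style_feat, eps=1e-5):
            # Conversion layer: re-normalize the content features to the
            # channel-wise mean/std of the style features (an assumption).
            c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
            c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
            s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
            s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
            return s_std * (content_feat - c_mean) / c_std + s_mean

        def forward(self, content_image, style_image):
            content_feat = self.encoder(content_image)    # encoding layer
            style_feat = self.encoder(style_image)
            converted_feat = self.convert(content_feat, style_feat)  # conversion layer
            return self.decoder(converted_feat)           # decoding layer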
Example 8 provides the method of example 2, the recognition model being trained by: generating a sample conversion image set according to a sample image and a plurality of style images, wherein the sample conversion image set comprises a sample conversion image corresponding to each style image, and the sample conversion image conforms to the style of the corresponding style image and has the same content as the sample image; and taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
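Example 8 describes ordinary supervised training on (sample converted image, actual effect type) pairs. The sketch below assumes cross-entropy loss over two effect-type classes and full-batch optimization; these details are not specified in the disclosure.

    import torch
    import torch.nn as nn

    def train_recognition_model(model, sample_converted_images, effect_labels,
                                epochs=10, lr=1e-3):
        # `sample_converted_images` is a list of 3xHxW tensors; `effect_labels`
        # holds the actual effect type of each (0 = low quality, 1 = high quality).
        # Cross-entropy loss and full-batch Adam are assumptions for illustration.
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        images = torch.stack(sample_converted_images)   # Nx3xHxW batch
        labels = torch.tensor(effect_labels)            # N integer class labels
        model.train()
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        return model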
According to one or more embodiments of the present disclosure, example 9 provides an apparatus for generating an image, including: the conversion module is used for generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images; a determining module, configured to determine a similarity between each of the converted images and the content image according to the histogram of the content image and the histogram of each of the converted images in the converted image set; and the screening module is used for taking the converted images with the similarity meeting the preset conditions in the converted image set as target images so as to obtain a target image set comprising at least one target image.
Example 10 provides a computer-readable medium having stored thereon a computer program that, when executed by a processing device, implements the steps of the methods of examples 1-8, in accordance with one or more embodiments of the present disclosure.
Example 11 provides, in accordance with one or more embodiments of the present disclosure, an electronic device, comprising: a storage device having a computer program stored thereon; processing means for executing the computer program in the storage means to implement the steps of the methods of examples 1 to 8.
The foregoing description is merely an illustration of the preferred embodiments of the present disclosure and of the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combination of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by interchanging the above features with (but not limited to) features with similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Claims (11)

1. A method of generating an image, the method comprising:
generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images;
determining the similarity of each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set;
and taking the converted images with the similarity meeting preset conditions in the converted image set as target images to obtain a target image set comprising at least one target image.
2. The method according to claim 1, wherein after the converted images in the converted image set whose similarity satisfies the preset condition are taken as target images, the method further comprises:
and determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target image with the conversion effect type of a specified type as a result image.
3. The method of claim 1, wherein determining the similarity of each of the transformed images to the content image based on the histogram of the content image and the histogram of each of the transformed images in the set of transformed images comprises:
respectively determining histograms of the content image and each conversion image on a plurality of color channels of a preset color space;
for each conversion image, determining a covariance distance between the content image and the conversion image on each color channel according to the histogram of the content image on each color channel and the histogram of the conversion image on each color channel;
and determining the similarity between the content image and the conversion image according to the covariance distance between the content image and the conversion image on each color channel.
4. The method of claim 3, wherein determining the similarity between the content image and the converted image according to the covariance distance between the content image and the converted image in each color channel comprises:
and taking the maximum covariance distance between the content image and the conversion image as the similarity of the content image and the conversion image.
5. The method according to claim 1, wherein the taking, as target images, the converted images in the converted image set whose similarity satisfies the preset condition to obtain a target image set comprising at least one target image comprises:
for each converted image, if the similarity between the converted image and the content image is greater than a preset similarity threshold, taking the converted image as the target image, and adding the target image to the target image set; or,
sorting the converted images in the converted image set in descending order of the corresponding similarity, taking a specified number of top-ranked converted images as the target images, and adding the target images to the target image set.
6. The method according to claim 2, wherein the determining a conversion effect type of each target image based on the target image set through a pre-trained recognition model and using the target image with the conversion effect type of a specified type as a result image comprises:
for each target image, inputting the target image into the recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: a high quality type and a low quality type;
and if the conversion effect type of the target image is a high-quality type, taking the target image as the result image.
7. The method of claim 1, wherein the generating a converted image set according to the acquired content image and the preset plurality of style images comprises:
inputting, for each style image, the content image and the style image into an encoding layer of a pre-trained conversion model, so as to obtain content image features corresponding to the content image and style image features corresponding to the style image;
inputting the content image features and the style image features into a conversion layer of the conversion model to obtain converted image features; and
inputting the converted image features into a decoding layer of the conversion model to obtain the converted image corresponding to the style image.
8. The method of claim 2, wherein the recognition model is trained by:
generating a sample conversion image set according to a sample image and a plurality of style images, wherein the sample conversion image set comprises a sample conversion image corresponding to each style image, and the sample conversion image conforms to the style of the corresponding style image and has the same content as the sample image;
and taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
9. An apparatus for generating an image, the apparatus comprising:
the conversion module is used for generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises a conversion image corresponding to each style image, and the conversion images are in accordance with the style of the corresponding style images and have the same content as the content images;
a determining module, configured to determine a similarity between each of the converted images and the content image according to the histogram of the content image and the histogram of each of the converted images in the converted image set;
and the screening module is used for taking the converted images with the similarity meeting the preset conditions in the converted image set as target images so as to obtain a target image set comprising at least one target image.
10. A computer-readable medium, on which a computer program is stored, characterized in that the program, when being executed by processing means, carries out the steps of the method of any one of claims 1 to 8.
11. An electronic device, comprising:
a storage device having a computer program stored thereon;
processing means for executing the computer program in the storage means to carry out the steps of the method according to any one of claims 1 to 8.
CN202210112948.0A 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment Active CN114429420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210112948.0A CN114429420B (en) 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210112948.0A CN114429420B (en) 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114429420A 2022-05-03
CN114429420B CN114429420B (en) 2023-11-28

Family

ID=81312835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210112948.0A Active CN114429420B (en) 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114429420B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146825A (en) * 2018-10-12 2019-01-04 深圳美图创新科技有限公司 Photography style conversion method, device and readable storage medium storing program for executing
CN111161132A (en) * 2019-11-15 2020-05-15 上海联影智能医疗科技有限公司 System and method for image style conversion
JP2020112907A (en) * 2019-01-09 2020-07-27 凸版印刷株式会社 Image style conversion device, image style conversion method and program
CN112102269A (en) * 2020-09-01 2020-12-18 浙江大学 Method and device for calculating similarity of style migration quality, computer equipment and storage medium
CN112232425A (en) * 2020-10-21 2021-01-15 腾讯科技(深圳)有限公司 Image processing method, image processing device, storage medium and electronic equipment
KR20210063171A (en) * 2019-11-22 2021-06-01 주식회사 엔씨소프트 Device and method for image translation
CN113506231A (en) * 2021-08-03 2021-10-15 泰康保险集团股份有限公司 Processing method, device, medium and electronic equipment for pixels in image


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Minghao: "Image Style Transfer Based on VGG-16", Dianzi Zhizuo (Electronic Production), no. 12, pages 54-56 *

Also Published As

Publication number Publication date
CN114429420B (en) 2023-11-28

Similar Documents

Publication Publication Date Title
CN111368685A (en) Key point identification method and device, readable medium and electronic equipment
CN109189544B (en) Method and device for generating dial plate
CN109961032B (en) Method and apparatus for generating classification model
CN111369427A (en) Image processing method, image processing device, readable medium and electronic equipment
CN110826567B (en) Optical character recognition method, device, equipment and storage medium
WO2023030348A1 (en) Image generation method and apparatus, and device and storage medium
CN112839223B (en) Image compression method, image compression device, storage medium and electronic equipment
CN109934142A (en) Method and apparatus for generating the feature vector of video
CN113742025A (en) Page generation method, device, equipment and storage medium
CN112950640A (en) Video portrait segmentation method and device, electronic equipment and storage medium
CN113723344A (en) Video identification method and device, readable medium and electronic equipment
CN113723341A (en) Video identification method and device, readable medium and electronic equipment
CN115965840A (en) Image style migration and model training method, device, equipment and medium
CN113923378A (en) Video processing method, device, equipment and storage medium
CN112990176A (en) Writing quality evaluation method and device and electronic equipment
CN110619602B (en) Image generation method and device, electronic equipment and storage medium
CN112348910A (en) Method, device, equipment and computer readable medium for acquiring image
CN112614110A (en) Method and device for evaluating image quality and terminal equipment
CN111626922A (en) Picture generation method and device, electronic equipment and computer readable storage medium
CN111369431A (en) Image processing method and device, readable medium and electronic equipment
CN110765304A (en) Image processing method, image processing device, electronic equipment and computer readable medium
CN114429420B (en) Image generation method and device, readable medium and electronic equipment
Huang et al. Edge device-based real-time implementation of CycleGAN for the colorization of infrared video
CN111383289A (en) Image processing method, image processing device, terminal equipment and computer readable storage medium
CN113010728A (en) Song recommendation method, system, intelligent device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Tiktok vision (Beijing) Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Douyin Vision Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: Tiktok vision (Beijing) Co.,Ltd.

GR01 Patent grant