CN114429420B - Image generation method and device, readable medium and electronic equipment


Info

Publication number
CN114429420B
CN114429420B
Authority
CN
China
Prior art keywords
image
conversion
converted
images
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210112948.0A
Other languages
Chinese (zh)
Other versions
CN114429420A (en)
Inventor
牟永强
庞昊洲
闫耘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Original Assignee
Douyin Vision Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Douyin Vision Co Ltd
Priority to CN202210112948.0A
Publication of CN114429420A
Application granted
Publication of CN114429420B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image generation method and device, a readable medium, and an electronic device, in the technical field of image processing. The method includes the following steps: a conversion image set is generated from an acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and each conversion image conforms to the style of its corresponding style image while having the same content as the content image; the similarity between each conversion image and the content image is determined from the histogram of the content image and the histogram of each conversion image in the conversion image set; and the conversion images whose similarity satisfies a preset condition are taken as target images, yielding a target image set including at least one target image. By generating a large number of conversion images from the content image and the style images and screening out the target images that satisfy the preset condition, the method and device can improve the efficiency and accuracy of image generation.

Description

Image generation method and device, readable medium and electronic equipment
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a method and apparatus for generating an image, a readable medium, and an electronic device.
Background
With the continuous development of electronic information technology, a wide variety of applications have appeared on the market to meet users' diverse needs. For a game application to present users with attractive, high-quality visuals, art producers must create large numbers of material images with drawing tools. Producing these material images is extremely time-consuming and demands a high level of skill and aesthetic judgment from the art producer. As a result, producing material images tends to consume substantial manpower and material resources, the efficiency of acquiring material images is low, and their quality is unstable.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In a first aspect, the present disclosure provides a method for generating an image, the method comprising:
generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images accord with the styles of the corresponding style images and are identical to the content of the content image;
Determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set;
and taking the converted images with the similarity meeting the preset condition in the converted image set as target images to obtain a target image set comprising at least one target image.
In a second aspect, the present disclosure provides an image generation apparatus, the apparatus comprising:
the conversion module is used for generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images accord with the styles of the corresponding style images and are the same as the content of the content image;
a determining module, configured to determine a similarity between each of the converted images and the content image according to the histogram of the content image and the histogram of each of the converted images in the converted image set;
and the screening module is used for taking the converted images with the similarity meeting the preset condition in the converted image set as target images so as to obtain a target image set comprising at least one target image.
In a third aspect, the present disclosure provides a computer readable medium having stored thereon a computer program which when executed by a processing device performs the steps of the method of the first aspect of the present disclosure.
In a fourth aspect, the present disclosure provides an electronic device comprising:
a storage device having a computer program stored thereon;
processing means for executing said computer program in said storage means to carry out the steps of the method of the first aspect of the disclosure.
Through the above technical solution, the present disclosure first generates a conversion image set from an acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and each conversion image conforms to the style of its corresponding style image while having the same content as the content image. The similarity between each conversion image and the content image is then determined from the histogram of the content image and the histogram of each conversion image in the conversion image set, and the conversion images whose similarity satisfies a preset condition are taken as target images to obtain a target image set. Because a large number of conversion images are generated from the content image and the style images, and target images satisfying the preset condition are screened out of them, the efficiency and accuracy of image generation can be improved.
Additional features and advantages of the present disclosure will be set forth in the detailed description which follows.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale. In the drawings:
FIG. 1 is a flowchart illustrating a method of generating an image according to an exemplary embodiment;
FIG. 2 is a flowchart illustrating another method of generating an image, according to an example embodiment;
FIG. 3 is a flowchart illustrating another method of generating an image, according to an exemplary embodiment;
FIG. 4 is a flowchart illustrating another method of generating an image, according to an example embodiment;
FIG. 5 is a flowchart illustrating another method of generating an image, according to an exemplary embodiment;
FIG. 6 is a flowchart illustrating another method of generating an image, according to an example embodiment;
FIG. 7 is a schematic diagram of a transformation model, according to an example embodiment;
FIG. 8 is a flowchart illustrating a training recognition model, according to an exemplary embodiment;
FIG. 9 is a block diagram of an image generation apparatus according to an exemplary embodiment;
FIG. 10 is a block diagram of another image generation apparatus according to an exemplary embodiment;
FIG. 11 is a block diagram of another image generation apparatus according to an exemplary embodiment;
FIG. 12 is a block diagram of another image generation apparatus according to an exemplary embodiment;
FIG. 13 is a block diagram of another image generation apparatus according to an exemplary embodiment;
fig. 14 is a block diagram of an electronic device, according to an example embodiment.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., including, but not limited to. The term "based on" is based at least in part on. The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments. Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a", "an", and "a plurality" in this disclosure are illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Fig. 1 is a flowchart illustrating a method of generating an image according to an exemplary embodiment, as shown in fig. 1, the method including the steps of:
step 101, generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images accord with the styles of the corresponding style images and are identical to the content of the content image.
For example, a content image and a style image set may be prepared in advance, where the content image may be an aesthetically pleasing image produced by an art producer, and the number of content images is small (for example, 5). The style image set includes a plurality of style images, which may be any images: images produced by an art producer, real photographs captured by an image acquisition device, or images acquired from a network; the number of style images is large (for example, 1000). A conversion image set may then be generated from the content image and the style image set, where the conversion image set includes a plurality of conversion images, each corresponding to one style image; that is, each conversion image conforms to the style of its corresponding style image, and its content is the same as that of the content image. It can be understood that a conversion image combines the content of the content image with the style of the corresponding style image, i.e., it is the result of performing style migration on the content image according to the style of the corresponding style image. Specifically, a style conversion model may be trained in advance; the content image and a style image are then formed into an image pair, which is fed to the style conversion model to obtain the conversion image corresponding to that style image. Repeating this process for every style image yields the conversion image set, as sketched below.
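A minimal sketch of this pairing loop follows. The `StyleConversionModel` interface (a `convert` method taking a content/style pair) and the PNG-only style directory are hypothetical stand-ins; the disclosure does not fix an API.

```python
from pathlib import Path
import cv2  # OpenCV, used here only for image I/O

def build_conversion_set(content_path, style_dir, model):
    """Pair the content image with every style image and collect the outputs."""
    content = cv2.imread(str(content_path))
    converted_set = []
    for style_path in sorted(Path(style_dir).glob("*.png")):  # assumed file layout
        style = cv2.imread(str(style_path))
        # Each (content, style) pair yields one conversion image that keeps the
        # content of `content` and adopts the style of `style`.
        converted = model.convert(content, style)
        converted_set.append(converted)
    return converted_set
```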
Step 102, determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set.
For example, an image that is aesthetically pleasing generally has characteristics such as uniform color tone, obvious layering, and a clear light-dark relationship. Therefore, to determine whether the conversion images in the conversion image set are aesthetically acceptable, each conversion image may be compared in turn with the content image, taking the content image produced by the art producer as the standard. Specifically, the histogram of the content image and the histogram of each conversion image may first be acquired separately. The histogram may be a gray-level histogram, or a color histogram in any color space, for example the RGB (Red Green Blue), HSV (Hue Saturation Value), or CMYK (Cyan Magenta Yellow Key) color space; this disclosure places no particular limit here. Taking the color histogram as an example: the color histogram is a color feature of an image and describes the proportion of different colors in the image. For example, the color histogram of the content image and of each conversion image may be acquired with the calcHist function provided by OpenCV. Taking the gray histogram as another example: the gray histogram is a grayscale feature of an image and describes the distribution of its gray levels.
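A short OpenCV sketch of the per-channel histogram computation via calcHist mentioned above; the bin count of 256 and the BGR channel order are conventional assumptions, not requirements of the disclosure.

```python
import cv2

def channel_histograms(image_bgr, bins=256):
    """Compute a normalized histogram for each color channel of a BGR image."""
    hists = []
    for channel in range(3):  # OpenCV loads images in B, G, R order
        hist = cv2.calcHist([image_bgr], [channel], None, [bins], [0, 256])
        hist = hist.ravel() / hist.sum()  # normalize so histograms are comparable
        hists.append(hist)
    return hists  # [H_B, H_G, H_R]
```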
Thereafter, the similarity of each conversion image to the content image may be determined from the histogram of the content image and the histogram of each conversion image. The similarity characterizes how closely the trend of a conversion image's histogram follows that of the content image's histogram. Specifically, the similarity may be understood as the correlation coefficient between the two histograms, or as their covariance distance.
And 103, taking the converted images with the similarity meeting the preset condition in the converted image set as target images to obtain a target image set comprising at least one target image.
For example, the conversion images satisfying the preset condition may be screened out of the conversion image set according to the similarity between each conversion image and the content image. Specifically, the preset condition may be that the similarity is greater than or equal to a preset similarity threshold, which may be set according to actual requirements. The preset condition may also be that, after arranging the conversion images in descending order of similarity, an image falls within the first specified number (for example, 50) in that order. The preset condition may also be that the similarity falls within a preset similarity range, which may likewise be set according to actual requirements. The conversion images meeting the preset condition are then taken as target images and placed into the target image set, yielding a target image set that includes at least one target image.
In summary, the present disclosure first generates a conversion image set according to an acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and the conversion image conforms to a style of the corresponding style image and is the same as a content of the content image. And then determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set, and taking the converted image with the similarity meeting the preset condition as a target image to obtain the target image set. According to the method and the device, a large number of conversion images are generated according to the content images and the style images, and target images meeting preset conditions are screened out of the conversion images, so that the efficiency and the accuracy of image generation can be improved.
Fig. 2 is a flowchart illustrating another method of generating an image according to an exemplary embodiment, as shown in fig. 2, after step 103, the method may further include:
step 104, determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target image with the conversion effect type of the specified type as a result image.
For example, each target image in the target image set may be taken as input and classified with a pre-trained recognition model to determine the conversion effect type to which it belongs. The recognition model can be understood as a classifier that predicts the conversion effect type of an image. For example, the recognition model may first extract the image features of the target image and then determine the matching degree between those features and each of a plurality of pre-specified conversion effect types: the higher the matching degree between the image features and a conversion effect type, the higher the probability that the target image belongs to that type; the lower the matching degree, the lower the probability. The recognition model may determine that the target image belongs to the conversion effect type with the highest matching degree, or to any conversion effect type whose matching degree satisfies a preset condition (for example, within a preset number of the highest matching degrees in descending order). The recognition model may be, for example, a CNN (Convolutional Neural Network), an LSTM (Long Short-Term Memory network), or an MLP (Multilayer Perceptron); this disclosure places no particular limit here. Finally, the target images whose conversion effect type is the specified type can be taken as result images; there may be one result image or several.
For example, the pre-specified conversion effect types may comprise more than two types, in which case the recognition model can be understood as a multi-class network and the specified type may be any one (or more) of those types, such as a high quality type rather than a low quality type. The pre-specified conversion effect types may also comprise just a high quality type and a low quality type, in which case the recognition model can be understood as a binary classification network, and accordingly the specified type may be the high quality type (or, if desired, the low quality type). In this way, on the basis of the obtained target image set, the result images whose conversion effect type is the specified type can be screened out.
FIG. 3 is a flowchart illustrating another method of generating an image, according to an exemplary embodiment, as shown in FIG. 3, the implementation of step 102 may include:
in step 1021, histograms of the content image and each of the converted images on a plurality of color channels of the color space are respectively determined in a preset color space.
Step 1022, for each converted image, determining a covariance distance of the content image and the converted image on each color channel according to the histogram of the content image on each color channel and the histogram of the converted image on each color channel.
Step 1023, determining the similarity between the content image and the converted image according to the covariance distance between the content image and the converted image on each color channel.
For example, to determine the similarity between each conversion image and the content image, a color space may be preset, which fixes the color channels to be considered. Histograms of the content image over those channels are then determined, as are histograms of each conversion image. Taking the RGB color space as an example, the corresponding color channels are the red, green, and blue channels. The histograms of the content image on the red, green, and blue channels may be denoted H_c^R, H_c^G, and H_c^B, and the histograms of a conversion image on the same channels may be denoted H_cs^R, H_cs^G, and H_cs^B.
Then, for each conversion image, the covariance distance between the content image and the conversion image on each color channel may be determined from the histogram of the content image on that channel and the histogram of the conversion image on that channel. Taking the red channel as an example, the covariance distance may be determined by the following formula:

d_R = Cov(H_c^R, H_cs^R) / (σ_c^R · σ_cs^R) = (1/N) Σ_i (H_c^R(i) - μ_c^R)(H_cs^R(i) - μ_cs^R) / (σ_c^R · σ_cs^R)

where d_R is the covariance distance between the content image and the conversion image on the red channel, N is the number of histogram bins, μ_c^R and σ_c^R are the mean and standard deviation of the content image's histogram on the red channel, and μ_cs^R and σ_cs^R are the mean and standard deviation of the conversion image's histogram on the red channel.
Finally, the similarity between the content image and the conversion image may be determined from their covariance distances on each color channel. Specifically, the maximum covariance distance may be taken as the similarity. For the RGB example, step 1022 yields for each conversion image the covariance distances d_R, d_G, and d_B on the red, green, and blue channels, and the similarity Coe is then determined by:

Coe = max(d_R, d_G, d_B)
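The formulas above map directly to code. A small NumPy sketch follows, assuming normalized per-channel histograms as produced earlier; the population-statistics normalization (mean over bins) is an assumption, since the text fixes only the named quantities.

```python
import numpy as np

def covariance_distance(h_content, h_converted):
    """Normalized covariance of two channel histograms, per the formula above."""
    mu_c, mu_cs = h_content.mean(), h_converted.mean()
    sigma_c, sigma_cs = h_content.std(), h_converted.std()
    cov = np.mean((h_content - mu_c) * (h_converted - mu_cs))
    return cov / (sigma_c * sigma_cs)

def similarity(content_hists, converted_hists):
    """Coe = maximum of the per-channel covariance distances."""
    distances = [covariance_distance(hc, hcs)
                 for hc, hcs in zip(content_hists, converted_hists)]
    return max(distances)
```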
fig. 4 is a flowchart illustrating another image generation method according to an exemplary embodiment, and as shown in fig. 4, step 103 may include:
Step 1031, for each converted image, if the similarity between the converted image and the content image is greater than a preset similarity threshold, taking the converted image as a target image, and adding the target image to the target image set. Or,
step 1032, arranging each of the converted images in the converted image set in descending order of the corresponding similarity, taking the designated number of converted images in the forefront of the arrangement order as the target image, and adding the target image set.
For example, there are two ways to screen target images out of the conversion image set. One is to set a similarity threshold in advance and take every conversion image whose similarity exceeds that threshold as a target image, placing it into the target image set. The other is to arrange all the conversion images in descending order of similarity and take the specified number of images at the front of that order as target images, placing them into the target image set. The specified number may be, for example, 100.
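Both screening strategies are straightforward to implement. A sketch follows; the threshold and the specified number are caller-supplied placeholders, matching the document's examples rather than fixed values.

```python
def select_targets(converted_images, similarities, threshold=None, top_k=100):
    """Filter conversion images into the target set by either preset condition."""
    pairs = list(zip(converted_images, similarities))
    if threshold is not None:
        # Condition 1: keep images whose similarity exceeds the preset threshold.
        return [img for img, coe in pairs if coe > threshold]
    # Condition 2: sort by similarity in descending order and keep the top K.
    pairs.sort(key=lambda p: p[1], reverse=True)
    return [img for img, _ in pairs[:top_k]]
```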
Fig. 5 is a flowchart illustrating another image generation method according to an exemplary embodiment, and as shown in fig. 5, step 104 may be implemented by:
step 1041, for each target image, inputting the target image into a recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: high quality type and low quality type.
In step 1042, if the conversion effect type of the target image is high quality type, the target image is used as the result image.
By way of example, the pre-specified conversion effect types may comprise just the high quality type and the low quality type, in which case the recognition model may be a binary classification network. Each target image is input into the recognition model, whose output then has two possibilities, the high quality type or the low quality type; for example, the high quality type may be represented by 1 and the low quality type by 0. If the conversion effect type of a target image is the high quality type, that target image can be taken as a result image, i.e., an image that meets the aesthetic standard.
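A minimal inference sketch for this binary case, under the assumption that the recognition model is a PyTorch classifier emitting a single logit per image; the patent does not fix a framework or model interface.

```python
import torch

def keep_high_quality(model, target_images):
    """Run the recognition model on each target image; keep those labeled 1 (high quality)."""
    model.eval()
    results = []
    with torch.no_grad():
        for image in target_images:  # image: float tensor of shape (3, H, W)
            logit = model(image.unsqueeze(0))        # add a batch dimension
            label = int(torch.sigmoid(logit) > 0.5)  # 1 = high quality, 0 = low quality
            if label == 1:
                results.append(image)
    return results
```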
Fig. 6 is a flowchart illustrating another image generation method according to an exemplary embodiment, and as shown in fig. 6, step 101 may include:
in step 1011, for each style image, the content image and the style image are input into the coding layer in the pre-trained conversion model to obtain the content image feature corresponding to the content image and the style image feature corresponding to the style image.
Step 1012, inputting the content image features and the style image features into a conversion layer in the conversion model to obtain converted image features.
In step 1013, the converted image features are input to a decoding layer in the conversion model to obtain a converted image corresponding to the style image.
For example, the conversion image set may be obtained with a pre-trained conversion model whose structure, as shown in fig. 7, includes a coding layer, a conversion layer, and a decoding layer. The coding layer may be built from a pre-trained network, for example VGG (Visual Geometry Group network), ResNet, or similar models, and encodes the input content image and style image to obtain the content image features corresponding to the content image and the style image features corresponding to the style image.
The conversion layer may be a WCT (Whitening and Coloring Transform) module, which applies whitening and coloring transforms to the content image features and style image features output by the coding layer to obtain the converted image features. Finally, the decoding layer generates the conversion image from the converted image features; the decoding layer may be a reversed coding layer, i.e., its structure mirrors that of the coding layer.
Specifically, the content image may be denoted I_c and the style image I_s. After the coding layer, the content image features may be denoted f_c and the style image features f_s. The conversion layer whitens f_c to obtain a whitened feature f̂_c; the coloring transform matrix derived from the style image is then applied to f̂_c to obtain the converted image features f_cs. Finally, the decoding layer generates the conversion image I_cs from f_cs. Further, when training the conversion model, the total loss may comprise two parts: a content loss (which can be understood as a pixel-level loss) and a semantic loss (which can be understood as a loss in the style dimension). The total loss may be determined by the following equations:

L = L_c + λ·L_s

L_c = ||I_c - I_cs||²

L_s = ||φ(I_c) - φ(I_cs)||²

where L is the total loss, L_c the content loss, L_s the semantic loss, and λ a preset weight; φ(I_c) denotes the semantic features of the content image and φ(I_cs) the semantic features of the conversion image.
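The total loss above can be computed directly. The following is a minimal PyTorch sketch; PyTorch itself and the use of mean squared error in place of the raw squared norm (they differ only by a constant factor) are assumptions, as the disclosure does not name a framework.

```python
import torch.nn.functional as F

def conversion_loss(i_c, i_cs, phi, lam=1.0):
    """Total loss L = L_c + lambda * L_s from the equations above.

    i_c, i_cs: content image and conversion image tensors of shape (N, 3, H, W).
    phi: a semantic feature extractor (e.g., a frozen VGG sub-network).
    """
    l_content = F.mse_loss(i_cs, i_c)             # pixel-level content loss ||I_c - I_cs||²
    l_semantic = F.mse_loss(phi(i_cs), phi(i_c))  # style-dimension semantic loss
    return l_content + lam * l_semantic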
FIG. 8 is a flowchart illustrating a training of an identification model, as shown in FIG. 8, according to an exemplary embodiment, the identification model being trained as follows:
and A, generating a sample conversion image set according to the sample image and the plurality of style images, wherein the sample conversion image set comprises sample conversion images corresponding to each style image, and the sample conversion images accord with the styles of the corresponding style images and are the same as the content of the sample images.
And B, taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
For example, to train the recognition model, a sample input set is first acquired. The sample input set comprises a plurality of sample inputs, each of which may be one sample conversion image from the sample conversion image set; the sample output set comprises a sample output for each sample input, namely the actual effect type of the corresponding sample conversion image, which may be, for example, the high quality type or the low quality type. The sample conversion image set can be generated from a sample image and a plurality of style images, where the sample image may be an aesthetically pleasing image produced by an art producer. For example, the sample image and a style image may be formed into an image pair and fed to the style conversion model to obtain the sample conversion image corresponding to that style image. Repeating this process for every style image yields the sample conversion image set, which includes a sample conversion image for each style image; each sample conversion image conforms to the style of its corresponding style image and has the same content as the sample image.
When training the recognition model, the sample input set (i.e., the sample conversion image set) can be taken as the input of the recognition model and the sample output set as its target output, so that when the sample input set is fed in, the model's output matches the sample output set. For example, the loss may be determined from the output of the recognition model and the sample output set, and the neuron parameters in the recognition model, for example the weights and biases of the neurons, may be corrected with a back-propagation algorithm so as to reduce the loss. These steps are repeated until the loss satisfies a preset condition, for example falls below a preset loss threshold, at which point the recognition model is trained.
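A hedged sketch of this training loop, assuming PyTorch, the binary labeling scheme described earlier, and a DataLoader of (image, label) pairs; none of these specifics are fixed by the text.

```python
import torch
from torch import nn

def train_recognition_model(model, loader, epochs=10, lr=1e-4, loss_threshold=0.05):
    """Train the recognition model on (sample conversion image, actual effect type) pairs."""
    criterion = nn.BCEWithLogitsLoss()  # binary case: high quality vs. low quality
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in loader:  # labels: 1.0 = high quality, 0.0 = low quality
            optimizer.zero_grad()
            loss = criterion(model(images).squeeze(1), labels)
            loss.backward()   # back propagation corrects the neuron parameters
            optimizer.step()
        if loss.item() < loss_threshold:  # stop once the loss meets the preset condition
            break
    return model
```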
In summary, the present disclosure first generates a conversion image set according to an acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and the conversion image conforms to a style of the corresponding style image and is the same as a content of the content image. And then determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set, and taking the converted image with the similarity meeting the preset condition as a target image to obtain the target image set. According to the method and the device, a large number of conversion images are generated according to the content images and the style images, and target images meeting preset conditions are screened out of the conversion images, so that the efficiency and the accuracy of image generation can be improved.
Fig. 9 is a block diagram of an image generating apparatus according to an exemplary embodiment, and as shown in fig. 9, the apparatus 200 includes:
the conversion module 201 is configured to generate a conversion image set according to the acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and the conversion image conforms to the style of the corresponding style image and is the same as the content of the content image.
A determining module 202 is configured to determine a similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the set of converted images.
And the filtering module 203 is configured to use the converted images with the similarity satisfying the preset condition in the converted image set as target images, so as to obtain a target image set including at least one target image.
Fig. 10 is a block diagram of another image generation apparatus shown according to an exemplary embodiment, and as shown in fig. 10, the apparatus 200 may further include:
the recognition module 204 is configured to determine, after the converted images with the similarity satisfying the preset condition in the converted image set are used as target images, a conversion effect type of each target image based on the target image set through a recognition model trained in advance, and use a target image with the conversion effect type being a specified type as a result image.
Fig. 11 is a block diagram of another image generating apparatus according to an exemplary embodiment, and as shown in fig. 11, the determining module 202 may include:
a histogram determination submodule 2021 is configured to determine histograms of the content image and each converted image on a plurality of color channels of the color space, respectively, in a preset color space.
A covariance distance determination submodule 2022 for determining, for each converted image, a covariance distance of the content image from the histogram of the content image on each color channel and the histogram of the converted image on each color channel.
A similarity determination submodule 2023 is configured to determine a similarity of the content image and the converted image according to a covariance distance of the content image and the converted image on each color channel.
In one application scenario, the similarity determination submodule 2023 may be used to:
and taking the maximum covariance distance between the content image and the converted image as the similarity between the content image and the converted image.
In one application scenario, the screening module 203 may be configured to:
and for each converted image, if the similarity between the converted image and the content image is larger than a preset similarity threshold, taking the converted image as a target image, and adding the target image into a target image set. Or,
Each of the converted images in the converted image set is arranged in a corresponding descending order of similarity, and a specified number of the converted images in the forefront of the arrangement order are taken as target images and added to the target image set.
Fig. 12 is a block diagram of another image generation apparatus shown according to an exemplary embodiment, and as shown in fig. 12, the identification module 204 may include:
the recognition submodule 2041 is configured to input, for each target image, the target image into a recognition model to obtain a conversion effect type of the target image output by the recognition model, where the conversion effect type includes: high quality type and low quality type.
The processing submodule 2042 is configured to take the target image as a result image if the conversion effect type of the target image is a high-quality type.
Fig. 13 is a block diagram of another image generation apparatus shown according to an exemplary embodiment, and as shown in fig. 13, the conversion module 201 may include:
the encoding submodule 2011 is configured to input, for each style image, a content image and the style image into an encoding layer in a pre-trained conversion model, so as to obtain a content image feature corresponding to the content image and a style image feature corresponding to the style image.
A conversion submodule 2012 is used for inputting the content image characteristics and the style image characteristics into a conversion layer in the conversion model to obtain converted image characteristics.
And the decoding submodule 2013 is used for inputting the characteristics of the converted image into a decoding layer in the conversion model to obtain the converted image corresponding to the style image.
In one application scenario, the recognition model is trained by:
and A, generating a sample conversion image set according to the sample image and the plurality of style images, wherein the sample conversion image set comprises sample conversion images corresponding to each style image, and the sample conversion images accord with the styles of the corresponding style images and are the same as the content of the sample images.
And B, taking the sample conversion image set as the input of the recognition model, and taking the actual effect type corresponding to each sample conversion image as the output of the recognition model so as to train the recognition model.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the method embodiments and will not be repeated here.
In summary, the present disclosure first generates a conversion image set according to an acquired content image and a plurality of preset style images, where the conversion image set includes a conversion image corresponding to each style image, and the conversion image conforms to a style of the corresponding style image and is the same as a content of the content image. And then determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set, and taking the converted image with the similarity meeting the preset condition as a target image to obtain the target image set. According to the method and the device, a large number of conversion images are generated according to the content images and the style images, and target images meeting preset conditions are screened out of the conversion images, so that the efficiency and the accuracy of image generation can be improved.
Referring now to fig. 14, a schematic diagram of a configuration of an electronic device (which may be, for example, an execution body of an embodiment of the present disclosure, a terminal device or a server) 300 suitable for use in implementing an embodiment of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 14 is merely an example, and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 14, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the electronic apparatus 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 308 including, for example, magnetic tape, hard disk, etc.; and communication means 309. The communication means 309 may allow the electronic device 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 14 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via a communication device 309, or installed from a storage device 308, or installed from a ROM 302. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the terminal devices and servers may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected by digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images accord with the styles of the corresponding style images and are identical to the content of the content image; determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set; and taking the converted images with the similarity meeting the preset condition in the converted image set as target images to obtain a target image set comprising at least one target image.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including, but not limited to, an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented in software or hardware. In some cases, the name of a module does not constitute a limitation on the module itself; for example, the conversion module may also be described as "a module that generates a conversion image set".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, example 1 provides a method of generating an image, including: generating a conversion image set according to the acquired content image and a plurality of preset style images, wherein the conversion image set comprises conversion images corresponding to each style image, and the conversion images accord with the styles of the corresponding style images and are identical to the content of the content image; determining the similarity between each converted image and the content image according to the histogram of the content image and the histogram of each converted image in the converted image set; and taking the converted images with the similarity meeting the preset condition in the converted image set as target images to obtain a target image set comprising at least one target image.
According to one or more embodiments of the present disclosure, example 2 provides the method of example 1, wherein after the converting the converted image in which the similarity satisfies a preset condition is set as the target image, the method further includes: and determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target image with the conversion effect type of a specified type as a result image.
According to one or more embodiments of the present disclosure, example 3 provides the method of example 1, the determining a similarity of each of the converted images to the content image from the histogram of the content image and the histogram of each of the converted images in the converted image set, comprising: respectively determining histograms of the content image and each conversion image on a plurality of color channels of a preset color space; determining, for each of the converted images, a covariance distance of the content image and the converted image on each color channel according to the histogram of the content image on each color channel and the histogram of the converted image on each color channel; and determining the similarity of the content image and the conversion image according to the covariance distance of the content image and the conversion image on each color channel.
According to one or more embodiments of the present disclosure, example 4 provides the method of example 3, where determining the similarity from the covariance distances includes: taking the maximum of the covariance distances between the content image and the converted image over the color channels as the similarity between the content image and the converted image.
According to one or more embodiments of the present disclosure, example 5 provides the method of example 1, where taking the converted images whose similarity satisfies a preset condition as target images to obtain a target image set including at least one target image includes: for each converted image, if the similarity between the converted image and the content image is greater than a preset similarity threshold, taking the converted image as a target image and adding it to the target image set; or arranging the converted images in the converted image set in descending order of similarity, taking a specified number of converted images from the front of that order as target images, and adding them to the target image set.
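A minimal sketch of the two selection rules in example 5, assuming `converted_set` and `sims` are parallel lists produced by the earlier steps; both names and the parameter defaults are hypothetical.

```python
def select_targets(converted_set, sims, threshold=None, top_k=None):
    """Keep converted images by threshold, or keep the top-k most similar."""
    if threshold is not None:
        return [img for img, s in zip(converted_set, sims) if s > threshold]
    # Sort indices by similarity in descending order and keep the first top_k.
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return [converted_set[i] for i in order[:top_k]]
```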
According to one or more embodiments of the present disclosure, example 6 provides the method of example 2, where determining the conversion effect type of each target image through the pre-trained recognition model based on the target image set, and taking the target images whose conversion effect type is a specified type as result images, includes: for each target image, inputting the target image into the recognition model to obtain the conversion effect type of the target image output by the recognition model, the conversion effect types including a high-quality type and a low-quality type; and, if the conversion effect type of the target image is the high-quality type, taking the target image as a result image.
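A sketch of this screening step, assuming a `recognizer` callable that maps a target image to an effect-type label; the label values are placeholders, not defined by the patent.

```python
def filter_by_effect_type(target_set, recognizer, specified_type="high"):
    """Keep the target images whose conversion effect type is the
    specified type (assumed labels: "high" or "low")."""
    return [img for img in target_set if recognizer(img) == specified_type]
```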
According to one or more embodiments of the present disclosure, example 7 provides the method of example 1, where generating the converted image set from the acquired content image and the preset plurality of style images includes: for each style image, inputting the content image and the style image into a coding layer of a pre-trained conversion model to obtain content image features corresponding to the content image and style image features corresponding to the style image; inputting the content image features and the style image features into a conversion layer of the conversion model to obtain converted image features; and inputting the converted image features into a decoding layer of the conversion model to obtain the converted image corresponding to the style image.
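The patent fixes only the coding/conversion/decoding split, not an architecture. The PyTorch sketch below is one plausible instantiation under stated assumptions: the AdaIN-style conversion layer, the layer sizes, and the class name are all choices of this sketch, not the patent's.

```python
import torch
import torch.nn as nn

class ConversionModel(nn.Module):
    """Encoder (coding layer) -> conversion layer -> decoder (decoding layer)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='nearest'),
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1))

    @staticmethod
    def convert(content_feat, style_feat, eps=1e-5):
        # AdaIN-style conversion (an assumption): align the channel-wise
        # mean/std of the content features with those of the style features.
        c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
        c_std = ((content_feat - c_mean) ** 2).mean(dim=(2, 3), keepdim=True).sqrt() + eps
        s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
        s_std = ((style_feat - s_mean) ** 2).mean(dim=(2, 3), keepdim=True).sqrt() + eps
        return s_std * (content_feat - c_mean) / c_std + s_mean

    def forward(self, content_img, style_img):
        content_feat = self.encoder(content_img)      # content image features
        style_feat = self.encoder(style_img)          # style image features
        converted_feat = self.convert(content_feat, style_feat)
        return self.decoder(converted_feat)           # converted image
```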
Example 8 provides the method of example 2, according to one or more embodiments of the present disclosure, where the recognition model is trained by: generating a sample converted image set from a sample image and the plurality of style images, where the sample converted image set includes a sample converted image corresponding to each style image, and each sample converted image conforms to the style of its corresponding style image while having the same content as the sample image; and taking the sample converted image set as the input of the recognition model and the actual effect type corresponding to each sample converted image as the expected output of the recognition model, so as to train the recognition model.
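A minimal PyTorch training-loop sketch for the recognition model of example 8, assuming each sample converted image carries a hand-labeled actual effect type (0 = low quality, 1 = high quality); the architecture, optimizer, and hyperparameters are illustrative only.

```python
import torch
import torch.nn as nn

# Tiny binary quality classifier; any image classifier would do here.
recognizer = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(recognizer.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(sample_batch, effect_labels):
    """One gradient step: sample converted images in, actual effect types out."""
    logits = recognizer(sample_batch)       # (N, 2) scores for low/high quality
    loss = loss_fn(logits, effect_labels)   # labels: 0 = low, 1 = high
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```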
According to one or more embodiments of the present disclosure, example 9 provides an image generation apparatus, including: a conversion module configured to generate a converted image set from an acquired content image and a plurality of preset style images, where the converted image set includes a converted image corresponding to each style image, and each converted image conforms to the style of its corresponding style image while having the same content as the content image; a determining module configured to determine the similarity between each converted image and the content image from the histogram of the content image and the histogram of each converted image in the converted image set; and a screening module configured to take the converted images whose similarity satisfies a preset condition as target images, so as to obtain a target image set including at least one target image.
According to one or more embodiments of the present disclosure, example 10 provides a computer-readable medium having stored thereon a computer program which, when executed by a processing device, implements the steps of the methods described in examples 1 to 8.
Example 11 provides an electronic device according to one or more embodiments of the present disclosure, including: a storage device having a computer program stored thereon; and a processing device configured to execute the computer program in the storage device to implement the steps of the methods described in examples 1 to 8.
The foregoing description covers only the preferred embodiments of the present disclosure and an explanation of the technical principles employed. Persons skilled in the art will appreciate that the scope of this disclosure is not limited to the specific combinations of the features described above; it also covers other embodiments formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, embodiments formed by substituting the above features with (but not limited to) technical features having similar functions disclosed herein.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims. The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in connection with the method embodiments, and will not be repeated here.

Claims (10)

1. A method of generating an image, the method comprising:
generating a converted image set from an acquired content image and a plurality of preset style images, wherein the converted image set comprises a converted image corresponding to each style image, and each converted image conforms to the style of its corresponding style image while having the same content as the content image;
determining histograms of the content image and of each converted image on a plurality of color channels of a preset color space;
for each converted image, determining a covariance distance between the content image and the converted image on each color channel according to the histogram of the content image on that channel and the histogram of the converted image on that channel;
determining the similarity between the content image and the converted image according to the covariance distances between the content image and the converted image on the color channels; and
taking the converted images in the converted image set whose similarity satisfies a preset condition as target images, so as to obtain a target image set comprising at least one target image.
2. The method according to claim 1, wherein, after taking the converted images in the converted image set whose similarity satisfies the preset condition as target images, the method further comprises:
determining the conversion effect type of each target image through a pre-trained recognition model based on the target image set, and taking the target images whose conversion effect type is a specified type as result images.
3. The method according to claim 1, wherein determining the similarity between the content image and the converted image according to the covariance distances on the color channels comprises:
taking the maximum of the covariance distances between the content image and the converted image over the color channels as the similarity between the content image and the converted image.
4. The method according to claim 1, wherein taking the converted images in the converted image set whose similarity satisfies a preset condition as target images comprises:
for each converted image, if the similarity between the converted image and the content image is greater than a preset similarity threshold, taking the converted image as a target image and adding it to the target image set; or,
arranging the converted images in the converted image set in descending order of similarity, taking a specified number of converted images from the front of that order as target images, and adding them to the target image set.
5. The method according to claim 2, wherein determining the conversion effect type of each target image through the pre-trained recognition model based on the target image set, and taking the target images whose conversion effect type is a specified type as result images, comprises:
for each target image, inputting the target image into the recognition model to obtain the conversion effect type of the target image output by the recognition model, wherein the conversion effect types comprise a high-quality type and a low-quality type; and
if the conversion effect type of the target image is the high-quality type, taking the target image as a result image.
6. The method according to claim 1, wherein generating the converted image set from the acquired content image and the preset plurality of style images comprises:
for each style image, inputting the content image and the style image into a coding layer of a pre-trained conversion model to obtain content image features corresponding to the content image and style image features corresponding to the style image;
inputting the content image features and the style image features into a conversion layer of the conversion model to obtain converted image features; and
inputting the converted image features into a decoding layer of the conversion model to obtain the converted image corresponding to the style image.
7. The method according to claim 2, wherein the recognition model is trained by:
generating a sample converted image set from a sample image and a plurality of style images, wherein the sample converted image set comprises a sample converted image corresponding to each style image, and each sample converted image conforms to the style of its corresponding style image while having the same content as the sample image; and
taking the sample converted image set as the input of the recognition model, and the actual effect type corresponding to each sample converted image as the expected output of the recognition model, so as to train the recognition model.
8. An image generation apparatus, comprising:
a conversion module configured to generate a converted image set from an acquired content image and a plurality of preset style images, wherein the converted image set comprises a converted image corresponding to each style image, and each converted image conforms to the style of its corresponding style image while having the same content as the content image;
a determining module configured to determine histograms of the content image and of each converted image on a plurality of color channels of a preset color space; for each converted image, determine a covariance distance between the content image and the converted image on each color channel according to their histograms on that channel; and determine the similarity between the content image and the converted image according to the covariance distances on the color channels; and
a screening module configured to take the converted images in the converted image set whose similarity satisfies a preset condition as target images, so as to obtain a target image set comprising at least one target image.
9. A computer readable medium on which a computer program is stored, wherein the program, when executed by a processing device, carries out the steps of the method according to any one of claims 1-7.
10. An electronic device, comprising:
a storage device having a computer program stored thereon;
processing means for executing said computer program in said storage means to carry out the steps of the method according to any one of claims 1-7.
CN202210112948.0A 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment Active CN114429420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210112948.0A CN114429420B (en) 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114429420A CN114429420A (en) 2022-05-03
CN114429420B (en) 2023-11-28

Family

ID=81312835

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210112948.0A Active CN114429420B (en) 2022-01-29 2022-01-29 Image generation method and device, readable medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114429420B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146825A (en) * 2018-10-12 2019-01-04 深圳美图创新科技有限公司 Photography style conversion method, device and readable storage medium storing program for executing
CN111161132A (en) * 2019-11-15 2020-05-15 上海联影智能医疗科技有限公司 System and method for image style conversion
JP2020112907A (en) * 2019-01-09 2020-07-27 凸版印刷株式会社 Image style conversion device, image style conversion method and program
CN112102269A (en) * 2020-09-01 2020-12-18 浙江大学 Method and device for calculating similarity of style migration quality, computer equipment and storage medium
CN112232425A (en) * 2020-10-21 2021-01-15 腾讯科技(深圳)有限公司 Image processing method, image processing device, storage medium and electronic equipment
KR20210063171A (en) * 2019-11-22 2021-06-01 주식회사 엔씨소프트 Device and method for image translation
CN113506231A (en) * 2021-08-03 2021-10-15 泰康保险集团股份有限公司 Processing method, device, medium and electronic equipment for pixels in image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Image Style Transfer Based on VGG-16; Liu Minghao; Dianzi Zhizuo (Electronics Production), Issue 12, pp. 54-56 *

Similar Documents

Publication Publication Date Title
CN109816589B (en) Method and apparatus for generating cartoon style conversion model
CN111476871B (en) Method and device for generating video
WO2023030348A1 (en) Image generation method and apparatus, and device and storage medium
CN110648375A (en) Image colorization based on reference information
WO2023143178A1 (en) Object segmentation method and apparatus, device and storage medium
CN112990440B (en) Data quantization method for neural network model, readable medium and electronic device
CN109934142A (en) Method and apparatus for generating the feature vector of video
US9910633B2 (en) Scalable storage-loss optimized framework based method and system for color gamut mapping
CN113140012B (en) Image processing method, device, medium and electronic equipment
CN112785669B (en) Virtual image synthesis method, device, equipment and storage medium
CN112241761B (en) Model training method and device and electronic equipment
CN110619602B (en) Image generation method and device, electronic equipment and storage medium
CN110008926A (en) The method and apparatus at age for identification
CN113923378A (en) Video processing method, device, equipment and storage medium
CN114429420B (en) Image generation method and device, readable medium and electronic equipment
KR20220018633A (en) Image retrieval method and device
CN110689478A (en) Image stylization processing method and device, electronic equipment and readable medium
CN111737575B (en) Content distribution method, content distribution device, readable medium and electronic equipment
CN112801997B (en) Image enhancement quality evaluation method, device, electronic equipment and storage medium
CN116310615A (en) Image processing method, device, equipment and medium
CN111581455B (en) Text generation model generation method and device and electronic equipment
CN114418835A (en) Image processing method, apparatus, device and medium
CN113010728A (en) Song recommendation method, system, intelligent device and storage medium
CN113850716A (en) Model training method, image processing method, device, electronic device and medium
CN113240599A (en) Image toning method and device, computer-readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Tiktok vision (Beijing) Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.
Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant after: Douyin Vision Co.,Ltd.
Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.
Applicant before: Tiktok vision (Beijing) Co.,Ltd.
GR01 Patent grant