CN109949213B - Method and apparatus for generating image

Method and apparatus for generating image

Info

Publication number: CN109949213B (application CN201910199011.XA)
Authority: CN (China)
Prior art keywords: image, type information, facial organ, user, initial
Legal status: Active (the status listed is an assumption, not a legal conclusion)
Application number: CN201910199011.XA
Other languages: Chinese (zh)
Other versions: CN109949213A
Inventor: Tian Fei (田飞)
Current assignee: Suzhou Moxing Times Technology Co., Ltd. (the listed assignees may be inaccurate)
Original assignee: Beijing Baidu Netcom Science and Technology Co., Ltd.
Events
  • Application filed by Beijing Baidu Netcom Science and Technology Co., Ltd.
  • Priority claimed from CN201910199011.XA
  • Publication of CN109949213A
  • Application granted
  • Publication of CN109949213B
  • Legal status: Active

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present disclosure disclose methods and apparatus for generating images. One embodiment of the method comprises: acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds one-to-one to image generation models in a pre-trained image generation model set, and each image generation model is used for generating target images of the image type represented by its corresponding image type information; determining the image generation model corresponding to the acquired image type information from the image generation model set; and inputting the initial image into the determined image generation model to generate a target image. This embodiment enriches the ways in which images can be generated and can produce different types of images according to different user requirements.

Description

Method and apparatus for generating image
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for generating an image.
Background
In the prior art, because the needs of individual users often differ, the images they prefer also differ. However, some users cannot design satisfactory images by themselves. There is therefore a need in the art to convert the type of an initial image selected by a user.
Prior art schemes usually only apply image processing such as beautification and filtering to the initial image.
Disclosure of Invention
The present disclosure proposes methods and apparatus for generating images.
In a first aspect, embodiments of the present disclosure provide a method for generating an image, the method comprising: acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds one-to-one to image generation models in a pre-trained image generation model set, and each image generation model is used for generating target images of the image type represented by its corresponding image type information; determining the image generation model corresponding to the acquired image type information from the image generation model set; and inputting the initial image into the determined image generation model to generate a target image.
In some embodiments, for each piece of image type information in the image type information set, the image generation model corresponding to that image type information is trained by: acquiring a training sample set, wherein each training sample comprises a sample initial image and a corresponding sample target image, the sample target image being an image of the image type represented by the image type information; and training an image generation model using a machine learning algorithm, taking the sample initial images included in the training samples as input and the corresponding sample target images as expected output.
In some embodiments, the initial image is a facial image, and the method further comprises: determining a region size of at least one facial organ image region included in the initial image input by the user; and, for each of the at least one facial organ image region, performing deformation processing on the facial organ image located in that region in response to determining that the region size of the region is equal to or greater than a size threshold determined in advance for it.
In some embodiments, the method further comprises: acquiring facial organ information selected by a user from a predetermined facial organ information set and deformation information selected by the user from a deformation information set determined for the facial organ information set; determining a facial organ deformation model from a pre-trained facial organ deformation model set according to facial organ information and deformation information selected by a user; and inputting the generated target image into the determined facial organ deformation model to obtain a deformed image of the initial image.
In some embodiments, the image generation model is a generative adversarial network (GAN).
In a second aspect, embodiments of the present disclosure provide an apparatus for generating an image, the apparatus comprising: a first acquisition unit configured to acquire an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds one-to-one to image generation models in a pre-trained image generation model set, and each image generation model is used for generating target images of the image type characterized by its corresponding image type information; a first determination unit configured to determine the image generation model corresponding to the acquired image type information from the image generation model set; and a generation unit configured to input the initial image into the determined image generation model and generate a target image.
In some embodiments, for each piece of image type information in the image type information set, the image generation model corresponding to that image type information is trained by: acquiring a training sample set, wherein each training sample comprises a sample initial image and a corresponding sample target image, the sample target image being an image of the image type represented by the image type information; and training an image generation model using a machine learning algorithm, taking the sample initial images included in the training samples as input and the corresponding sample target images as expected output.
In some embodiments, the initial image is a facial image, and the apparatus further comprises: a second determination unit configured to determine a region size of at least one facial organ image region included in the initial image input by the user; and a processing unit configured to, for each of the at least one facial organ image region, perform deformation processing on the facial organ image located in that region in response to determining that the region size of the region is equal to or greater than a size threshold determined in advance for it.
In some embodiments, the apparatus further comprises: a second acquisition unit configured to acquire facial organ information selected by a user from a predetermined facial organ information set, and deformation information selected by the user from a deformation information set determined for the facial organ information set; a third determination unit configured to determine a facial organ deformation model from a pre-trained facial organ deformation model set based on facial organ information and deformation information selected by a user; and an input unit configured to input the generated target image to the determined facial organ deformation model, resulting in a deformed image of the initial image.
In some embodiments, the image generation model is a generative adversarial network.
In a third aspect, embodiments of the present disclosure provide an electronic device for generating an image, comprising: one or more processors; and a storage device having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement a method as in any of the embodiments of the method for generating an image described above.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium for generating an image, having stored thereon a computer program which, when executed by a processor, implements a method as in any of the embodiments of the method for generating an image described above.
In the method and apparatus for generating an image provided by the embodiments of the present disclosure, an initial image input by a user and image type information selected by the user from a predetermined image type information set are acquired, wherein the image type information in the set corresponds one-to-one to image generation models in a pre-trained image generation model set and each image generation model is used for generating target images of the image type represented by its corresponding image type information; the image generation model corresponding to the acquired image type information is then determined from the image generation model set; finally, the initial image is input into the determined image generation model to generate a target image. This enriches the ways in which images can be generated, and different types of images can be generated according to different user requirements.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method for generating an image according to the present disclosure;
FIG. 3 is a schematic illustration of one application scenario of a method for generating an image according to the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for generating an image according to the present disclosure;
FIG. 5 is a schematic structural view of one embodiment of an apparatus for generating an image according to the present disclosure;
fig. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of methods for generating images or apparatuses for generating images of embodiments of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as an image processing class application, a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices that have a display screen and support image processing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server providing various services, for example, a background server that processes an initial image transmitted by the terminal devices 101, 102, 103, such as by performing style migration (style transfer). The background server may perform style migration or other processing on received data such as the initial image, thereby generating a processing result (for example, an image obtained by performing style migration on the initial image).
It should be noted that the method for generating an image provided by the embodiment of the present disclosure may be performed by the server 105 or may be performed by the terminal devices 101, 102, 103. Accordingly, the means for generating an image may be provided in the server 105 or in the terminal devices 101, 102, 103. Optionally, the method for generating an image provided by the embodiment of the present disclosure may be further performed by the server and the terminal device in cooperation with each other, and each unit included in the apparatus for generating an image may be further disposed in the server and the terminal device, respectively.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation. When the electronic device on which the method for generating an image runs does not need to exchange data with other electronic devices, the system architecture may include only that electronic device.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for generating an image according to the present disclosure is shown. The method for generating an image comprises the steps of:
Step 201, acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set.
In this embodiment, the execution subject of the method for generating an image (e.g., the server shown in fig. 1) may acquire the initial image input by the user either locally or from another electronic device, through a wired or wireless connection. Similarly, the execution subject may acquire, locally or from another electronic device, the image type information selected by the user from the predetermined image type information set.
The image type information in the image type information set corresponds one-to-one to the image generation models in the pre-trained image generation model set. Each image generation model is used for generating target images of the image type characterized by its corresponding image type information.
Here, the above-described initial image may be an image on which style migration is to be performed. For example, the initial image may be, but is not limited to: a facial image, a body image, a scenery image, an animal image, a plant image, and the like. The image type information may be used to characterize an image type. By way of example, the image type information in the image type information set may characterize, but is not limited to, any of the following image types: cartoon style, oil painting style, abstract style, British style, and the like. It will be appreciated that style migration is a technique that converts the type of an image. For example, the initial image may be a selfie of the user; performing style migration on this image may yield a cartoon-style selfie, an oil-painting-style selfie, an abstract-style selfie, a British-style selfie, and the like.
In practice, the user may input an initial image to the terminal device he or she uses by photographing, downloading, or the like. Thus, when the execution subject is a terminal device, it can acquire the initial image input by the user locally; when the execution subject is a server, it may be communicatively connected to the terminal device used by the user and can acquire the initial image from that terminal device.
As an example, the execution subject may acquire the image type information selected by the user from the predetermined image type information set as follows: the terminal device used by the user presents each piece of image type information in the image type information set, and the execution subject detects the user's selection operation (for example, clicking, long-pressing, or entering a piece of image type information) to acquire the image type information selected by the user.
It is understood that the execution subject, or an electronic device communicatively connected to the execution subject, may store each piece of image type information in the image type information set in association with the corresponding image generation model in the image generation model set, thereby establishing the correspondence between them.
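For concreteness, this stored correspondence can be pictured as a lookup table keyed by the image type information. The following Python sketch is illustrative only and not part of the disclosure; the type keys, checkpoint paths, and use of TorchScript checkpoints are assumptions.

```python
import torch

# Hypothetical checkpoint paths: one pre-trained image generation model per
# piece of image type information (one-to-one correspondence).
MODEL_PATHS = {
    "cartoon": "models/cartoon_generator.pt",
    "oil_painting": "models/oil_painting_generator.pt",
    "abstract": "models/abstract_generator.pt",
}

# The registry realizes the association between image type information and
# image generation models described above.
MODEL_REGISTRY = {
    type_info: torch.jit.load(path) for type_info, path in MODEL_PATHS.items()
}
```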
In some optional implementations of this embodiment, for each piece of image type information in the image type information set, the image generation model corresponding to that image type information may be trained, by the execution subject or by an electronic device communicatively connected to the execution subject, through the following steps:
first, a training sample set is obtained. The training sample comprises a sample initial image and a sample target image corresponding to the sample initial image. The sample target image is an image of the image type characterized by the image type information.
Here, the sample initial image may be an image before style migration, used to train the image generation model corresponding to the image type information. The sample target image corresponding to a sample initial image may be the image obtained by migrating that sample initial image to the image type represented by the image type information. For example, if the image type characterized by the image type information is "animation style", the sample target image corresponding to a sample initial image may be an animation-style version of that sample initial image.
Second, using a machine learning algorithm, the sample initial images included in the training samples of the training sample set are taken as input, the sample target images corresponding to the input sample initial images are taken as expected output, and an image generation model is obtained through training. In practice, a sample target image may be drawn by an artist who renders the corresponding sample initial image in the target image type.
Specifically, the execution subject may take a sample initial image included in a training sample of the training sample set as the input of an initial model (for example, a convolutional neural network) and obtain the actual output corresponding to the input sample initial image. It is then determined whether the initial model satisfies a predetermined training end condition. If so, the initial model satisfying the condition is taken as the trained image generation model. If not, back propagation and gradient descent are used to adjust the parameters of the initial model based on the actual output and the expected output corresponding to the input sample initial image, and the adjusted initial model is used for the next round of training. The training end condition may include, but is not limited to, at least one of the following: the training time exceeds a preset duration; the number of training iterations exceeds a preset number; the value obtained by feeding the actual output and the expected output into a predetermined loss function is smaller than a preset threshold.
It can be appreciated that the present alternative implementation may employ a supervised training method, where for each image type information in the set of image type information, an image generation model is trained.
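As a sketch of this supervised procedure (not the patent's actual code), the loop below pairs each sample initial image with its sample target image and trains a placeholder convolutional model with back propagation and gradient descent; the architecture, the L1 loss, the stand-in random data, and the end-condition values are all assumptions.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder initial model: a small convolutional network standing in for
# whatever architecture is actually used.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)  # gradient descent
loss_fn = nn.L1Loss()  # stands in for the predetermined loss function

# Stand-in training samples: (sample initial image, sample target image) pairs.
pairs = TensorDataset(torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64))
loader = DataLoader(pairs, batch_size=4)

max_steps, loss_threshold = 1000, 0.01  # illustrative training end conditions
step, done = 0, False
while not done:
    for sample_initial, sample_target in loader:
        actual_output = model(sample_initial)         # actual output
        loss = loss_fn(actual_output, sample_target)  # vs. expected output
        optimizer.zero_grad()
        loss.backward()                               # back propagation
        optimizer.step()                              # parameter adjustment
        step += 1
        done = step >= max_steps or loss.item() < loss_threshold
        if done:
            break
```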
In some alternative implementations of this embodiment, the image generation model may also be a generative adversarial network (GAN). The generative adversarial network comprises a generation network and a discrimination network: the generation network characterizes the correspondence between initial images and target images, and the discrimination network determines whether an input target image is a generated target image or a real target image. As an example, the generative adversarial network may be a cycle-consistent generative adversarial network (CycleGAN). The target image is the initial image rendered in the image type represented by the image type information corresponding to the image generation model.
It will be appreciated that when the image generation model is a generative adversarial network, this alternative implementation may employ an unsupervised training method, where an image generation model is trained for each piece of image type information in the image type information set.
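To make the unsupervised variant concrete, the sketch below shows the kind of objective a CycleGAN-style generator optimizes on unpaired data: an adversarial term from the discrimination network plus a cycle-consistency term. The placeholder networks, the BCE adversarial loss, and the cycle weight 10.0 are assumptions for illustration, not the disclosed model.

```python
import torch
from torch import nn

def small_cnn(out_channels: int) -> nn.Module:
    # Placeholder network; real generators/discriminators are much deeper.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, out_channels, 3, padding=1),
    )

G = small_cnn(3)  # generation network: initial image -> target-style image
F = small_cnn(3)  # inverse generation network: target-style image -> initial image
D = nn.Sequential(small_cnn(1), nn.AdaptiveAvgPool2d(1), nn.Flatten())  # discrimination network

adv_loss = nn.BCEWithLogitsLoss()
cyc_loss = nn.L1Loss()

def generator_objective(initial_batch: torch.Tensor) -> torch.Tensor:
    fake_target = G(initial_batch)
    logits = D(fake_target)
    # Adversarial term: G tries to make the discrimination network judge
    # its generated target images as real target images.
    adversarial = adv_loss(logits, torch.ones_like(logits))
    # Cycle-consistency term: mapping the generated image back should
    # recover the initial image, which removes the need for paired samples.
    cycle = cyc_loss(F(fake_target), initial_batch)
    return adversarial + 10.0 * cycle
```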
Step 202, determining an image generation model corresponding to the acquired image type information from the image generation model set.
In this embodiment, the execution subject may determine an image generation model corresponding to the acquired image type information from the image generation model set.
It is to be understood that the execution subject may determine, from the image generation model set, an image generation model in which an association relationship is established in advance with the acquired image type information as the image generation model to which it corresponds.
Step 203, inputting the initial image into the determined image generation model to generate a target image.
In this embodiment, the execution subject may input the initial image acquired in step 201 to the image generation model determined in step 202, so as to generate the target image. The target image may be an image of an image type represented by the image type information corresponding to the image generation model.
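Taken together, steps 202 and 203 amount to a dictionary lookup followed by a single forward pass. The sketch below reuses the hypothetical MODEL_REGISTRY from the earlier sketch and assumes the models accept and return CHW float tensors.

```python
import torch

def generate_target_image(initial_image: torch.Tensor, type_info: str) -> torch.Tensor:
    """Step 202: select the model; step 203: run the initial image through it."""
    model = MODEL_REGISTRY[type_info]  # hypothetical registry from the earlier sketch
    model.eval()
    with torch.no_grad():              # inference only, no gradients needed
        return model(initial_image.unsqueeze(0)).squeeze(0)  # add/strip batch dim
```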
Thus, according to this embodiment, style migration can be performed on the initial image input by the user according to the user's requirements, yielding a target image (i.e., the image obtained after style migration of the initial image) that meets those requirements.
In some alternative implementations of the present embodiment, the initial image is a facial image. Thus, the execution body may further execute the steps of:
step one, determining a region size of at least one facial organ image region included in an initial image input by a user. Wherein the facial organ includes, but is not limited to, at least one of: eyes, nose, forehead, ears, mouth, eyebrows, eyelashes, cheeks, and the like.
As an example, the execution subject may determine, for each of at least one facial organ image area included in the initial image input by the user, an area size of the facial organ image area.
As yet another example, the execution subject may determine a region size of the facial organ image region selected by the user among at least one facial organ image region included in the initial image input by the user.
As an example, the above-mentioned region size may be characterized by the number of pixel points included in the facial organ image region, or may be characterized by the ratio of the area of the facial organ image region to the area of the above-mentioned initial image input by the user.
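Either characterization of the region size reduces to pixel counting once the organ region is known. A minimal sketch, assuming a boolean mask of the facial organ region is already available (how the mask is obtained, e.g. via landmarks or segmentation, is outside this sketch):

```python
import numpy as np

def region_size(organ_mask: np.ndarray, as_ratio: bool = True) -> float:
    """organ_mask: boolean array, True at pixels belonging to the facial organ region."""
    pixel_count = float(organ_mask.sum())
    if as_ratio:
        # Ratio of the organ region's area to the area of the initial image.
        return pixel_count / organ_mask.size
    return pixel_count  # raw number of pixel points in the region
```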
Second, for each of the at least one facial organ image region, in response to determining that the region size of that facial organ image region is greater than or equal to the size threshold determined in advance for it, deformation processing is performed on the facial organ image located in that region.
Here, the execution subject may apply a triangle affine transformation algorithm to deform the facial organ image, or may input the facial organ image into a pre-trained deformation model to deform the facial organ image. The deformation model can be used for performing deformation processing on an input image. For example, the deformation model may be a convolutional neural network model trained using a machine learning algorithm.
Wherein the technician may set a size threshold for each facial organ included in the face in advance. As an example, a technician may set the size threshold of the facial organ "eye" to the size of any one eye or the average of any multiple eye sizes. The deformation process may include, but is not limited to: zoom in, zoom out, pan, rotate, twist, etc.
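A sketch of the threshold check and deformation dispatch described above; the per-organ threshold values and the use of OpenCV's resize as the "enlarge" deformation are illustrative assumptions.

```python
import cv2
import numpy as np

# Hypothetical size thresholds per facial organ, e.g. averages measured in advance.
SIZE_THRESHOLDS = {"eye": 0.02, "nose": 0.05, "mouth": 0.04}

def enlarge(organ_image: np.ndarray, factor: float = 1.2) -> np.ndarray:
    """One possible deformation: scale the organ image up."""
    return cv2.resize(organ_image, None, fx=factor, fy=factor,
                      interpolation=cv2.INTER_LINEAR)

def maybe_deform(organ: str, size_ratio: float, organ_image: np.ndarray) -> np.ndarray:
    """Deform only when the region size reaches the organ's preset threshold."""
    if size_ratio >= SIZE_THRESHOLDS[organ]:
        return enlarge(organ_image)  # could equally shrink, rotate, twist, etc.
    return organ_image
```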
In some optional implementations of this embodiment, the foregoing execution body may further execute the following steps:
step one, facial organ information selected by a user from a predetermined facial organ information set is acquired. And acquiring deformation information selected by a user from the deformation information set determined for the facial organ information set.
Wherein the facial organ information is used to indicate facial organs. Here, the facial organ information may be characterized by words, for example, the facial organ information may be "nose". Alternatively, the facial organ information may be characterized by other forms, for example, the facial organ information may be characterized by an image. The deformation information may be used to indicate a particular deformation process. Here, the deformation information may be characterized by text, for example, the deformation information may be "enlarged". Alternatively, the deformation information may be characterized by other forms, for example, the deformation information may be characterized by an image (e.g., an enlarged image of the eye).
It should be noted that, a technician may set one or more pieces of deformation information corresponding to each piece of facial organ information in the facial organ information set, or may set the facial organ information set and the deformation information set respectively, so that any one piece of facial organ information in the facial organ information set corresponds to any one piece of deformation information in the deformation information set.
Here, when the execution subject is a terminal device, it may present each piece of facial organ information in the facial organ information set, and the user may select facial organ information from the presented set, so that the execution subject acquires the user's selection. When the execution subject is a server, it may send the facial organ information set to a communicatively connected terminal device, which presents each piece of facial organ information in the received set; the user may then select facial organ information from the set presented by the terminal device, after which the terminal device sends the selected facial organ information to the execution subject, so that the execution subject acquires the user's selection.
It will be appreciated that the execution subject may acquire the deformation information in a manner similar to that used for the facial organ information, which is not described in detail here.
And step two, determining a facial organ deformation model from a pre-trained facial organ deformation model set according to facial organ information and deformation information selected by a user.
The facial organ deformation model may be used to deform a facial organ in the initial image. As an example, the facial organ deformation model may be a neural network model trained by a machine learning algorithm, or a triangle affine transformation algorithm.
And thirdly, inputting the generated target image into the determined facial organ deformation model to obtain a deformed image of the initial image.
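These three steps amount to a lookup keyed by the pair (facial organ information, deformation information) followed by applying the selected model to the generated target image. In the sketch below the keys are hypothetical and identity functions stand in for the pre-trained facial organ deformation models:

```python
import numpy as np

def _identity(image: np.ndarray) -> np.ndarray:
    return image  # stand-in for a pre-trained facial organ deformation model

# Facial organ deformation model set, keyed by (organ information, deformation information).
DEFORMATION_MODELS = {
    ("nose", "enlarge"): _identity,
    ("eye", "enlarge"): _identity,
    ("mouth", "shrink"): _identity,
}

def deform(target_image: np.ndarray, organ_info: str, deformation_info: str) -> np.ndarray:
    """Select the model per the user's choices, then deform the generated target image."""
    model = DEFORMATION_MODELS[(organ_info, deformation_info)]
    return model(target_image)
```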
With continued reference to figs. 3A-3C, figs. 3A-3C are schematic diagrams of one application scenario of the method for generating an image according to this embodiment. In fig. 3A, a mobile phone obtains an initial image 301 entered by the user and image type information 3020 selected by the user from a predetermined image type information set 302. The image type information in the image type information set 302 corresponds one-to-one to the image generation models in the pre-trained image generation model set, and each image generation model is used for generating target images of the image type characterized by its corresponding image type information. Then, referring to fig. 3B, the mobile phone determines the image generation model 304 corresponding to the acquired image type information 3020 from the image generation model set (for example, a set including the image generation models corresponding to the image type information "oil painting", "cartoon", "British", "gothic", "sand painting", and "black and white"). Finally, the mobile phone inputs the initial image 301 into the determined image generation model 304 to generate the target image 303. Optionally, referring to fig. 3C, the mobile phone presents on its screen the target image 303 obtained by oil-painting style migration of the initial image 301.
In the prior art, because the needs of individual users often differ, the avatars they require are often different. However, most users cannot design satisfactory avatars by themselves. There is therefore a need in the art to convert the type of an initial image selected by a user.
In the method provided by the above embodiment of the present disclosure, the initial image input by the user and the image type information selected by the user from the predetermined image type information set are acquired; the image generation model corresponding to the acquired image type information is then determined from the image generation model set; finally, the initial image is input into the determined image generation model to generate the target image. Style migration can thus be performed on the initial image according to the user's requirements, yielding an image that meets those requirements and enriching the ways in which images can be generated.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating an image is shown. The flow 400 of the method for generating an image comprises the steps of:
Step 401, acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set.
In this embodiment, the execution subject of the method for generating an image (e.g., the server or a terminal device shown in fig. 1) may acquire the initial image input by the user either locally or from another electronic device, through a wired or wireless connection. Similarly, the execution subject may acquire, locally or from another electronic device, the image type information selected by the user from the predetermined image type information set. The initial image is a facial image.
The image type information in the image type information set corresponds one-to-one to the image generation models in the pre-trained image generation model set, and each image generation model is used for generating target images of the image type represented by its corresponding image type information.
Step 402, determining an image generation model corresponding to the acquired image type information from the image generation model set.
In this embodiment, the execution subject may determine an image generation model corresponding to the acquired image type information from the image generation model set.
Step 403, inputting the initial image into the determined image generation model to generate a target image.
In this embodiment, the execution subject may input the initial image acquired in step 401 to the image generation model determined in step 402, so as to generate the target image. The target image may be an image of an image type represented by the image type information corresponding to the image generation model.
Apart from the above, steps 401 to 403 are substantially identical to steps 201 to 203 in the embodiment corresponding to fig. 2 and are not described again here.
Step 404, determining a region size of at least one facial organ image region included in the initial image input by the user.
In the present embodiment, the above-described execution subject may determine the region size of at least one facial organ image region included in the initial image input by the user. Wherein the facial organ includes, but is not limited to, at least one of: eyes, nose, forehead, ears, mouth, eyebrows, eyelashes, cheeks, and the like.
As an example, the execution subject may determine, for each of at least one facial organ image area included in the initial image input by the user, an area size of the facial organ image area.
As yet another example, the execution subject may determine a region size of the facial organ image region selected by the user among at least one facial organ image region included in the initial image input by the user.
As an example, the above-mentioned region size may be characterized by the number of pixel points included in the facial organ image region, or may be characterized by the ratio of the area of the facial organ image region to the area of the above-mentioned initial image input by the user.
Step 405, for each of the at least one facial organ image region, in response to determining that the region size of that facial organ image region is equal to or greater than the size threshold determined in advance for it, performing deformation processing on the facial organ image located in that region.
In this embodiment, for each of the at least one facial organ image region, the execution subject may perform deformation processing on the facial organ image located in that region in response to determining that the region size of the region is equal to or greater than the size threshold determined in advance for it.
Here, the execution subject may apply a triangle affine transformation algorithm to deform the facial organ image, or may input the facial organ image into a pre-trained deformation model to deform the facial organ image. The deformation model can be used for performing deformation processing on an input image. For example, the deformation model may be a convolutional neural network model trained using a machine learning algorithm.
Wherein the technician may set a size threshold for each facial organ included in the face in advance. As an example, a technician may set the size threshold of the facial organ "eye" to the size of any one eye or the average of any multiple eye sizes. The deformation process may include, but is not limited to: zoom in, zoom out, pan, rotate, twist, etc.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for generating an image in this embodiment highlights the step of deforming facial organ images. The scheme described in this embodiment can therefore further deform facial organ images in a facial image on top of the style migration of that image. For example, if the size of the image region representing the nose in a facial image is greater than or equal to a predetermined nose size threshold (for example, the average nose size of a number of people), the scheme described in this embodiment can deform (for example, enlarge) the nose in the facial image, yielding a facial image that has been both stylized (for example, in cartoon style) and deformed (for example, enlarged). This further satisfies users' personalized requirements and further enriches the ways in which images can be generated.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an apparatus for generating an image, which corresponds to the method embodiment shown in fig. 2, and which may include the same or corresponding features as the method embodiment shown in fig. 2, except for the features described below. The device can be applied to various electronic equipment.
As shown in fig. 5, the apparatus 500 for generating an image of the present embodiment includes: a first acquisition unit 501, a first determination unit 502, and a generation unit 503. Wherein the first obtaining unit 501 is configured to obtain an initial image input by a user, and image type information selected by the user from a predetermined image type information set, where the image type information in the image type information set corresponds to image generation models in a pre-trained image generation model set one by one, and the image generation models are used for generating target images of image types characterized by the corresponding image type information; the first determining unit 502 is configured to determine an image generation model corresponding to the acquired image type information from the image generation model set; the generating unit 503 is configured to input an initial image to the determined image generation model, generating a target image.
In this embodiment, the first acquiring unit 501 of the apparatus 500 for generating an image may acquire an initial image input by a user from another electronic device or locally through a wired connection or a wireless connection. Similarly, the apparatus 500 may also obtain, from other electronic devices or locally, the image type information selected by the user from the predetermined set of image type information through a wired connection or a wireless connection.
The image type information in the image type information set corresponds one-to-one to the image generation models in the pre-trained image generation model set. Each image generation model is used for generating target images of the image type characterized by its corresponding image type information.
In this embodiment, the first determining unit 502 may determine an image generation model corresponding to the acquired image type information from the image generation model set.
In this embodiment, the generating unit 503 may input the initial image acquired by the first acquiring unit 501 to the image generating model determined by the first determining unit 502, thereby generating the target image. The target image may be an image of an image type represented by the image type information corresponding to the image generation model.
In some optional implementations of this embodiment, for the image type information in the image type information set, the image generation model corresponding to the image type information is trained by:
firstly, acquiring a training sample set, wherein the training sample comprises a sample initial image and a sample target image corresponding to the sample initial image, and the sample target image is an image of an image type represented by the image type information;
then, using a machine learning algorithm, taking a sample initial image included in a training sample in the training sample set as input, taking a sample target image corresponding to the input sample initial image as expected output, and training to obtain an image generation model.
In some alternative implementations of this embodiment, the initial image is a facial image, and the apparatus 500 further comprises: a second determination unit (not shown) configured to determine a region size of at least one facial organ image region included in the initial image input by the user; and a processing unit (not shown) configured to, for each of the at least one facial organ image region, perform deformation processing on the facial organ image located in that region in response to determining that the region size of the region is equal to or greater than a size threshold determined in advance for it.
In some optional implementations of this embodiment, the apparatus 500 further includes: the second acquisition unit (not shown in the figure) is configured to acquire facial organ information selected by the user from a predetermined set of facial organ information, and deformation information selected by the user from a set of deformation information determined for the set of facial organ information. The third determining unit (not shown in the figure) is configured to determine a facial organ deformation model from a pre-trained facial organ deformation model set based on the facial organ information and the deformation information selected by the user. An input unit (not shown in the figure) is configured to input the generated target image to the determined facial organ deformation model, resulting in a deformed image of the initial image.
In some alternative implementations of this embodiment, the image generation model is a generative adversarial network.
In the apparatus provided by the above embodiment of the present disclosure, the first acquisition unit 501 acquires the initial image input by the user and the image type information selected by the user from the predetermined image type information set, wherein the image type information in the set corresponds one-to-one to the image generation models in the pre-trained image generation model set and each image generation model is used for generating target images of the image type represented by its corresponding image type information; the first determination unit 502 then determines the image generation model corresponding to the acquired image type information from the image generation model set; finally, the generation unit 503 inputs the initial image into the determined image generation model to generate the target image. Style migration can thus be performed on the initial image according to the user's requirements, yielding an image that meets those requirements and enriching the ways in which images can be generated.
Referring now to FIG. 6, a schematic diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the present disclosure is shown. The electronic device shown in fig. 6 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the methods of the present disclosure are performed when the computer program is executed by a Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Python, Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a first acquisition unit, a first determination unit, and a generation unit. The names of these units do not constitute a limitation of the unit itself in some cases, and for example, the first determination unit may also be described as "a unit that determines an image generation model corresponding to the acquired image type information".
As another aspect, the present disclosure also provides a computer-readable medium that may be contained in the electronic device described in the above embodiments; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds to image generation models in a pre-trained image generation model set one by one, and the image generation models are used for generating target images of image types represented by the corresponding image type information; determining an image generation model corresponding to the acquired image type information from the image generation model set; and inputting the initial image into the determined image generation model to generate a target image.
The foregoing description covers only the preferred embodiments of the present disclosure and the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention referred to in this disclosure is not limited to the specific combination of features described above, but also encompasses other embodiments in which the features described above, or their equivalents, are combined in any way without departing from the spirit of the invention, for example, embodiments in which the features described above are interchanged with (but not limited to) technical features with similar functions disclosed in the present disclosure.

Claims (10)

1. A method for generating an image, comprising:
acquiring an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds one-to-one to image generation models in a pre-trained image generation model set, the image generation models are used for generating target images of the image types represented by the corresponding image type information, and the initial image is a face image;
determining an image generation model corresponding to the acquired image type information from the image generation model set;
inputting the initial image into the determined image generation model to generate a target image;
wherein the method further comprises: determining a region size of at least one facial organ image region comprised by the initial image input by the user, wherein the region size characterizes a ratio of an area of the facial organ image region to an area of the initial image;
and, for each of the at least one facial organ image region, performing deformation processing on the facial organ image located in that facial organ image region in response to determining that the region size of the facial organ image region is greater than or equal to a size threshold determined in advance for that region, wherein different facial organ images correspond to different thresholds.
2. The method according to claim 1, wherein, for the image type information in the image type information set, an image generation model corresponding to the image type information is trained by:
acquiring a training sample set, wherein each training sample comprises a sample initial image and a sample target image corresponding to the sample initial image, the sample target image being an image of the image type represented by the image type information;
and, using a machine learning algorithm, training an image generation model by taking the sample initial image included in a training sample in the training sample set as input and taking the sample target image corresponding to the input sample initial image as the expected output.
3. The method according to claim 1 or 2, wherein the method further comprises:
acquiring facial organ information selected by a user from a predetermined facial organ information set and deformation information selected by the user from a deformation information set determined for the facial organ information set;
determining a facial organ deformation model from a pre-trained facial organ deformation model set according to facial organ information and deformation information selected by a user;
and inputting the generated target image into the determined facial organ deformation model to obtain a deformed image of the initial image.
4. The method of claim 1, wherein the image generation model is a generative adversarial network.
5. An apparatus for generating an image, comprising:
a first acquisition unit configured to acquire an initial image input by a user and image type information selected by the user from a predetermined image type information set, wherein the image type information in the image type information set corresponds one-to-one with the image generation models in a pre-trained image generation model set, each image generation model is used for generating a target image of the image type represented by the corresponding image type information, and the initial image is a face image;
a first determination unit configured to determine an image generation model corresponding to the acquired image type information from the image generation model set;
a generation unit configured to input the initial image to the determined image generation model, generating a target image;
wherein the apparatus further comprises: a second determination unit configured to determine a region size of at least one facial organ image region included in the initial image input by the user, wherein the region size characterizes a ratio of an area of the facial organ image region to an area of the initial image;
a processing unit configured to, for the region size of the at least one facial organ image region, perform deformation processing on the facial organ image located in the facial organ image region in response to determining that the region size of the facial organ image region is greater than or equal to a size threshold determined in advance for that facial organ image region, wherein different facial organ images correspond to different thresholds.
6. The apparatus of claim 5, wherein, for the image type information in the image type information set, an image generation model corresponding to the image type information is trained by:
acquiring a training sample set, wherein each training sample comprises a sample initial image and a sample target image corresponding to the sample initial image, the sample target image being an image of the image type represented by the image type information;
and, using a machine learning algorithm, training an image generation model by taking the sample initial image included in a training sample in the training sample set as input and taking the sample target image corresponding to the input sample initial image as the expected output.
7. The apparatus of claim 5 or 6, wherein the apparatus further comprises:
a second acquisition unit configured to acquire facial organ information selected by a user from a predetermined facial organ information set, and deformation information selected by the user from a deformation information set determined for the facial organ information set;
a third determination unit configured to determine a facial organ deformation model from a pre-trained facial organ deformation model set based on facial organ information and deformation information selected by a user;
and an input unit configured to input the generated target image to the determined facial organ deformation model to obtain a deformed image of the initial image.
8. The apparatus of claim 5, wherein the image generation model is a generative adversarial network.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-4.
10. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-4.
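As a worked illustration of the region-size test recited in claims 1 and 5, the sketch below computes the ratio of a facial organ image region's area to the initial image's area and triggers deformation processing only when that ratio meets a per-organ threshold. The bounding boxes, threshold values, and helper names are hypothetical stand-ins; the claims do not prescribe any particular values or data structures.

```python
# Hypothetical sketch of the per-organ size-threshold check from claims 1 and 5.
# Different facial organ images correspond to different thresholds; the values
# below are invented for illustration.
SIZE_THRESHOLDS = {"eye": 0.02, "nose": 0.04, "mouth": 0.05}

def region_size(box, image_width, image_height):
    # The region size is the ratio of the facial organ region's area
    # to the area of the initial image.
    x0, y0, x1, y1 = box
    return ((x1 - x0) * (y1 - y0)) / float(image_width * image_height)

def maybe_deform(organ, box, image_width, image_height):
    # Deform the organ image only when its region size is greater than
    # or equal to the threshold determined in advance for that organ.
    size = region_size(box, image_width, image_height)
    if size >= SIZE_THRESHOLDS[organ]:
        return "deform %s (region size %.3f)" % (organ, size)
    return "skip %s (region size %.3f)" % (organ, size)

if __name__ == "__main__":
    # A 120x80 mouth region in a 640x480 face image has region size 0.031,
    # below the illustrative 0.05 threshold, so no deformation is applied.
    print(maybe_deform("mouth", (100, 300, 220, 380), 640, 480))
```

Keying the threshold to the organ reflects the claim language that different facial organ images correspond to different thresholds: a small organ such as an eye can trigger deformation at a smaller area ratio than a larger organ such as the mouth.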
CN201910199011.XA 2019-03-15 2019-03-15 Method and apparatus for generating image Active CN109949213B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910199011.XA CN109949213B (en) 2019-03-15 2019-03-15 Method and apparatus for generating image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910199011.XA CN109949213B (en) 2019-03-15 2019-03-15 Method and apparatus for generating image

Publications (2)

Publication Number Publication Date
CN109949213A CN109949213A (en) 2019-06-28
CN109949213B true CN109949213B (en) 2023-06-16

Family

ID=67010103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910199011.XA Active CN109949213B (en) 2019-03-15 2019-03-15 Method and apparatus for generating image

Country Status (1)

Country Link
CN (1) CN109949213B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110503703B (en) * 2019-08-27 2023-10-13 北京百度网讯科技有限公司 Method and apparatus for generating image
CN114331820A (en) * 2021-12-29 2022-04-12 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016051302A (en) * 2014-08-29 2016-04-11 カシオ計算機株式会社 Image processor, imaging device, image processing method, and program
CN107909540A (en) * 2017-10-26 2018-04-13 深圳天珑无线科技有限公司 Image processing method, device, mobile terminal and computer-readable recording medium
CN108259769A (en) * 2018-03-30 2018-07-06 广东欧珀移动通信有限公司 Image processing method, device, storage medium and electronic equipment
CN108960093A (en) * 2018-06-21 2018-12-07 阿里体育有限公司 The recognition methods and equipment of face's rotational angle
CN109376596A (en) * 2018-09-14 2019-02-22 广州杰赛科技股份有限公司 Face matching process, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009258224A (en) * 2008-04-14 2009-11-05 Canon Inc Image forming apparatus and image forming method
JP2015176272A (en) * 2014-03-14 2015-10-05 オムロン株式会社 Image processor, image processing method, and image processing program
CN108401112B (en) * 2018-04-23 2021-10-22 Oppo广东移动通信有限公司 Image processing method, device, terminal and storage medium
CN109215007B (en) * 2018-09-21 2022-04-12 维沃移动通信有限公司 Image generation method and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Facial feature localization methods; Lin Weixun, Pan Gang, Wu Zhaohui, Pan Yunhe; Journal of Image and Graphics (Issue 08); 11 *

Also Published As

Publication number Publication date
CN109949213A (en) 2019-06-28

Similar Documents

Publication Publication Date Title
CN107578017B (en) Method and apparatus for generating image
CN107633218B (en) Method and apparatus for generating image
CN108830235B (en) Method and apparatus for generating information
WO2020000879A1 (en) Image recognition method and apparatus
CN111476871B (en) Method and device for generating video
CN109993150B (en) Method and device for identifying age
CN107609506B (en) Method and apparatus for generating image
CN110827378A (en) Virtual image generation method, device, terminal and storage medium
CN109189544B (en) Method and device for generating dial plate
CN110298319B (en) Image synthesis method and device
CN109829432B (en) Method and apparatus for generating information
CN109981787B (en) Method and device for displaying information
CN110288705B (en) Method and device for generating three-dimensional model
CN112527115B (en) User image generation method, related device and computer program product
CN109977905B (en) Method and apparatus for processing fundus images
CN110046571B (en) Method and device for identifying age
JP2020013553A (en) Information generating method and apparatus applicable to terminal device
CN110837332A (en) Face image deformation method and device, electronic equipment and computer readable medium
CN109949213B (en) Method and apparatus for generating image
CN110570383A (en) image processing method and device, electronic equipment and storage medium
CN110288683B (en) Method and device for generating information
CN110008926B (en) Method and device for identifying age
CN111292333B (en) Method and apparatus for segmenting an image
CN109241930B (en) Method and apparatus for processing eyebrow image
CN111260756B (en) Method and device for transmitting information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231214

Address after: Building 3, No. 1 Yinzhu Road, Suzhou High tech Zone, Suzhou City, Jiangsu Province, 215011

Patentee after: Suzhou Moxing Times Technology Co.,Ltd.

Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Patentee before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.
