WO2023125374A1 - Image processing method and apparatus, electronic device, and storage medium - Google Patents

Image processing method and apparatus, electronic device, and storage medium Download PDF

Info

Publication number
WO2023125374A1
WO2023125374A1 · PCT/CN2022/141815 · CN2022141815W
Authority
WO
WIPO (PCT)
Prior art keywords
style
image
target
model
trained
Prior art date
Application number
PCT/CN2022/141815
Other languages
French (fr)
Chinese (zh)
Inventor
白须
Original Assignee
北京字跳网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司 filed Critical 北京字跳网络技术有限公司
Publication of WO2023125374A1 publication Critical patent/WO2023125374A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • Embodiments of the present disclosure relate to the technical field of image processing, for example, to an image processing method and apparatus, an electronic device, and a storage medium.
  • Image style transfer can be understood as rendering an image into an image with a specific artistic style.
  • In the related art, image style transfer is mostly implemented by texture synthesis.
  • Alternatively, a style transfer model is trained, and the image is converted into a certain style based on the trained style transfer model.
  • However, style transfer models in the related art cannot handle images whose subjects have different attributes, so the resulting style transfer images are of poor quality, which in turn degrades the user experience.
  • In view of this, the present disclosure provides an image processing method and apparatus, an electronic device, and a storage medium, so as to obtain a special effect image of a target style type and improve the richness of the displayed image content.
  • an embodiment of the present disclosure provides an image processing method, the method including: acquiring an image to be processed that includes a target subject; inputting the image to be processed and a subject attribute of the target subject into a target style conversion model, to obtain a target special effect image in which the target subject is converted into a target style type; and displaying the target special effect image in an image display area.
  • an embodiment of the present disclosure further provides an image processing device, which includes:
  • a to-be-processed image acquisition module, configured to acquire an image to be processed that includes a target subject;
  • a special effect image determination module, configured to input the image to be processed and a subject attribute of the target subject into a target style conversion model, to obtain a target special effect image in which the target subject is converted into a target style type;
  • an image display module, configured to display the target special effect image in an image display area.
  • an embodiment of the present disclosure further provides an electronic device, and the electronic device includes:
  • one or more processors;
  • a storage apparatus, configured to store one or more programs;
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any one of the embodiments of the present disclosure.
  • the embodiments of the present disclosure further provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the image processing method according to any one of the embodiments of the present disclosure.
  • FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a target model to be trained provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • the term “comprise” and its variants are open-ended, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments.” Relevant definitions of other terms will be given in the description below.
  • the disclosed technical solution can be applied to any image style conversion scenario, for example, converting a captured static image into an image of a certain theme style; the theme style may be a Japanese style, a Korean style, or any theme style designed by a designer. It can also be applied to special-effects video shooting, for example, converting a certain user in the captured frame, or all users in the entire frame, into a video of a certain theme style.
  • the style type may match the style of the user's makeup, and at the same time, the frame to which the user belongs may also be converted to match that makeup style.
  • users in each video frame to be processed can be displayed according to a corresponding style theme, or the entire video frame can be converted into a certain theme style.
  • Fig. 1 is a schematic flow chart of an image processing method provided by an embodiment of the present disclosure.
  • the embodiment of the present disclosure is applicable to the situation of converting an image frame into a target style type in any image display scene supported by the Internet.
  • the method can be executed by an image processing apparatus, which can be implemented in the form of software and/or hardware, for example by an electronic device such as a mobile terminal, a PC terminal, or a server.
  • the scene of arbitrary image display is usually implemented by the cooperation of the client and the server.
  • the method provided in this embodiment can be executed by the server, the client, or the cooperation of the client and the server.
  • the method includes:
  • the device for executing the image processing method provided by the embodiments of the present disclosure may be integrated into application software supporting image processing functions, and the software may be installed in electronic equipment, for example, the electronic equipment may be a mobile terminal or a PC terminal, etc.
  • the application software may be any type of image/video processing software; the specific applications are not enumerated here, as long as image/video processing can be realized. It may also be a specially developed application that implements the addition and display of special effects, or it may be integrated into a corresponding page, so that the user can add special effects through the page integrated in the PC terminal.
  • the image to be processed may be an image collected based on the application software.
  • the image including the target subject may be captured in real time based on the application software.
  • the associated video frames or images are processed into a style type consistent with the target style transfer type. It is also possible to set a corresponding special effect based on the style type, so that after the user is detected to trigger the special effect, all captured images are converted into the corresponding style type.
  • the scene of image shooting or video shooting there may be multiple subjects in the captured image.
  • all users captured in the frame may be used as target subjects. It is also possible to mark one or more users as the target subject before the special effect is added; correspondingly, when the collected image to be processed is determined to include the target subject, the image is processed.
  • the style theme conversion control can be triggered.
  • the image to be processed can be collected based on the camera device deployed on the terminal device.
  • the target subject may or may not be included in the image to be processed.
  • An image randomly obtained from a webpage may also be used as the image to be processed.
  • the image to be processed can be converted into a target special effect image consistent with the target style type.
  • the image to be processed is an image captured randomly or downloaded, and may also be an image captured in real time.
  • the target style conversion model is pre-trained and used to convert the image to be processed into a model of the corresponding style type.
  • the target style transfer model can be a GAN network based on a residual (res) structure.
  • the target special effect image is an image obtained after being processed by the target style transfer model.
  • in the target special effect image, the target subject may be converted into the target theme style, or the entire image to be processed may be converted into the corresponding target theme style type.
  • the image style output by the target style conversion model is consistent with the style type of the training samples used when training it. For example, if the style type of the style images in the training samples is style A, then the target style type is style A and the model outputs images of style A; if the style type of the style images in the training samples is style B, then the model outputs images of style B. That is, the style type output by the target style conversion model is consistent with the style type used during model training.
  • the subject attribute can be the gender attribute or style type attribute of the target subject.
  • the gender attribute can be male or female
  • the style type attribute can be a preset style type.
  • if the target style conversion model is expected to output images of one given style type from among multiple style types, the style type and gender type corresponding to different labels can be defined in the alpha channel.
  • the image to be processed may be processed into an image corresponding to a corresponding style type according to the gender attribute and/or style attribute of the image to be processed.
  • subject attributes are defined to avoid the related-art need to train a separate model for each gender, which consumes memory. Instead, after the image to be processed is acquired, the subject attribute of its target subject is identified and input into the target style conversion model as alpha-channel information, so that a single target style conversion model can perform style transfer on image content with different gender attributes across different images to be processed.
  • in the related art, implementing multiple style types requires training a separate model for each style type, and a single model may be unable to perform multiple style type conversions.
  • therefore, the input of the model to be trained must include not only the image whose style type is to be converted, but also the edited label value of the alpha channel, so that the conversion model can produce the target special effect image of this technical solution.
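As a minimal sketch of this idea, the snippet below appends a constant alpha channel whose value encodes a (gender, style) label to an RGB image. The label table, names, and values are illustrative assumptions, not taken from the disclosure.

```python
import numpy as np

# Hypothetical label table: each (gender, style) pair maps to one
# alpha-channel value. The concrete values are illustrative assumptions.
ATTRIBUTE_LABELS = {
    ("female", "style_a"): 0.25,
    ("male", "style_a"): 0.50,
    ("female", "style_b"): 0.75,
    ("male", "style_b"): 1.00,
}

def attach_attribute_channel(rgb, gender, style):
    """Append a constant alpha channel carrying the subject-attribute
    label, producing a 4-channel model input."""
    h, w, c = rgb.shape
    assert c == 3, "expected an RGB image"
    label = ATTRIBUTE_LABELS[(gender, style)]
    alpha = np.full((h, w, 1), label, dtype=rgb.dtype)
    return np.concatenate([rgb, alpha], axis=2)

rgba = attach_attribute_channel(
    np.zeros((4, 4, 3), dtype=np.float32), "male", "style_a")
```

A single model receiving such a 4-channel input can then condition its style transfer on the label plane without needing one network per gender.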
  • the subject attribute of the target subject in the image to be processed may be determined based on a subject attribute identification module deployed in the terminal. The subject attribute and the image to be processed are input into the target style conversion model to obtain a target special effect image in which the target subject is converted into the target style type, or in which the entire image to be processed is converted into the target style type.
  • before the image to be processed is input into the target style transfer model, the method further includes: scaling the image to be processed to an image of a fixed size, which may be 384*384 pixels.
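The scaling step can be sketched as follows. This pure-NumPy nearest-neighbour resize is only illustrative; a real pipeline would use an image library with bilinear or better resampling.

```python
import numpy as np

def resize_nearest(img, size=(384, 384)):
    """Nearest-neighbour resize to the fixed model input size."""
    h, w = img.shape[:2]
    th, tw = size
    rows = np.arange(th) * h // th   # source row for each target row
    cols = np.arange(tw) * w // tw   # source column for each target column
    return img[rows][:, cols]

# Example: shrink a 720p frame to the assumed 384x384 model input.
scaled = resize_nearest(np.zeros((720, 1280, 3), dtype=np.uint8))
```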
  • the above-mentioned processing method can be performed on the obtained samples.
  • the image display area can be understood as an area where the target special effect image is displayed.
  • the target image may be displayed on the display interface.
  • the image display area can be divided into two areas, split either left-right or top-bottom.
  • One area displays the target special effect image;
  • the other area displays the originally acquired image to be processed.
  • If a target video is to be formed, special effect processing can be performed on the images to be processed that are collected in real time, to obtain target special effect images.
  • a plurality of target special effect images are sequentially spliced to obtain a target video.
  • the target style type includes Japanese style, Korean style, ancient costume style, comic style, or at least one of multiple preset style types to be selected.
  • The ancient costume style can include styles from any dynasty.
  • in this embodiment, the subject attribute of the target subject in the image to be processed can be determined, and the subject attribute and the image to be processed can be used as input parameters of the pre-trained target style conversion model, to obtain a target special effect image in which the target subject in the image to be processed is converted into the target style type; the image to be processed can thereby be converted into that style type, improving the user experience.
  • Fig. 2 is a schematic flow chart of an image processing method provided by another embodiment of the present disclosure.
  • a style conversion model to be trained can be trained to obtain the target style conversion model.
  • As shown in Fig. 2, the method comprises:
  • the third original image including facial information may be collected in a real environment, downloaded from a webpage, or generated randomly based on a facial image generation model.
  • the image to be used is an image after processing the style of the third original image, for example, it may be an image designed by a designer and corresponding to a certain style type.
  • the third style image is an image in one of one or more styles hand-drawn from the original image. The style types can be various, and these style types serve as the candidate style types.
  • the image obtained after converting the third original image into a certain style is used as the image to be used. That is, the image to be used is an image of a certain style.
  • image cropping processing can be used.
  • the cropping process can be understood as aligning on the nose tip and the center of the eyes as reference points, or on the center of the chin and the center of the eyes as reference points. The display ratio of the facial image in the display interface is adjusted to expand the cropping range of the image, so that the entire head (including hair) and the face are shown on the display interface, together with the corresponding background information.
  • the image generation model to be trained may be a styleganv2 model, and the model parameters in the model are default values at this time.
  • the third output image is an image of a certain style randomly generated based on the image generation model to be trained. That is, the style type of the third output image is indeterminate.
  • Gaussian noise is randomly sampled Gaussian noise.
  • the image generation model to be trained processes the Gaussian noise to obtain third output images of different styles.
  • the image generation model to be trained is trained to obtain the target image generation model, which can generate sample data of different style types, so that a model capable of realizing conversion of different style types can be obtained based on the sample data training.
  • S230 Process the third output image and the third style image based on the first discriminator, determine a loss value, and correct model parameters in the image generation model to be trained based on the loss value.
  • the input of the first discriminator is the third output image and the third style image.
  • the first discriminator is configured to determine a loss value between the third output image and the third style image.
  • the model parameters in the image generation model to be trained can be corrected according to the loss value.
  • the target image generation model is an image generation model obtained through final training. Repeat the above steps based on multiple training samples until the convergence of the loss function is detected, and use the image generation model obtained in this case as the target image generation model.
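The adversarial loss that drives this parameter correction can be illustrated as follows. The logit values are made up, and the numerically stable binary cross-entropy form is a standard GAN choice rather than the disclosure's exact loss.

```python
import numpy as np

def bce_with_logits(logits, target):
    # Numerically stable binary cross-entropy on raw logits.
    return float(np.mean(np.maximum(logits, 0) - logits * target
                         + np.log1p(np.exp(-np.abs(logits)))))

# Illustrative logits: the discriminator scores hand-drawn style
# images (real) and generator outputs (fake). Values are invented.
real_logits = np.array([2.0, 1.5, 3.0])
fake_logits = np.array([-1.0, -0.5, 0.2])

# Discriminator loss: push real scores toward 1 and fake scores toward 0.
d_loss = bce_with_logits(real_logits, 1.0) + bce_with_logits(fake_logits, 0.0)
# Generator loss: make the discriminator score fakes as real.
g_loss = bce_with_logits(fake_logits, 1.0)
```

In training, `d_loss` updates the first discriminator and `g_loss` updates the image generation model, repeating until the loss converges.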
  • S250 Process Gaussian noise based on the target image generation model to obtain the second style image.
  • the randomly sampled Gaussian noise can be processed based on the target image generation model, so as to obtain the second style image for training the first style transfer model.
  • the number of second style images can be as large as possible.
  • style types of the second style images may be the same or different.
  • the style types of the second style images can be as many and rich as possible.
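Sampling second style images from Gaussian noise can be sketched as below. The fixed random linear map standing in for the generator is only a placeholder for the trained styleganv2-style network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder "generator": a fixed linear projection followed by tanh.
# The real target image generation model is a deep network.
PROJ = rng.standard_normal((64, 16 * 16 * 3))

def toy_generator(z):
    """Map a 64-dim Gaussian latent to a small image-shaped array."""
    return np.tanh(z @ PROJ).reshape(16, 16, 3)

# Sample Gaussian noise and generate a batch of second style images.
latents = rng.standard_normal((4, 64))
second_style_images = [toy_generator(z) for z in latents]
```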
  • the second style image can be further processed while the original second style image is retained.
  • the expression editing model may be a model for adding facial expressions to the target subject in the second style image.
  • the facial expression may be an open mouth, with different degrees of opening, or an expression such as smiling or laughing; the specific expressions are not limited in this embodiment.
  • while the original second style image is retained, the second style image can be input into the expression editing model to add expressions to the users in it, obtaining a second style image with changed expression content. Based on the original second style image and the expression-edited second style image, training samples for training the first style model can be obtained.
  • in this embodiment, the target image generation model obtained through training can process Gaussian noise to obtain second style images of various style types, and expressions can be added to the second style images based on the expression editing model to obtain second style images with changed expression content. The first style conversion model is then obtained by training on the second style images, and the training samples for training the target style conversion model are determined based on the first style conversion model, so that the target style conversion model can be obtained by training on those samples. This avoids the related-art problem that samples of uneven quality prevent effective style conversion of the original image; style conversion can be performed on the image to be processed, improving the match between the converted image and the user and improving the user experience.
  • Fig. 3 is a schematic flow chart of an image processing method provided by another embodiment of the present disclosure.
  • after the second style image is obtained based on the target image generation model, the first style conversion model can be trained based on the second style image and the second original image collected in a real environment.
  • As shown in Fig. 3, the method comprises:
  • S310 Determine at least one second style image.
  • S320 Construct a target model to be trained including a style processing model to be trained, a target discriminator, and a target style comparer.
  • target discriminator and target style comparer are pre-trained models.
  • the target model to be trained includes a style processing model to be trained, a target discriminator and a target style comparer.
  • the output of the style processing model to be trained is fed to the target discriminator and to the target style comparer; based on the outputs of the target discriminator and the target style comparer, the model parameters in the target model to be trained are corrected to obtain the target model to be used.
  • the style processing model to be trained includes: a style feature extractor to be trained, a content feature extractor to be trained, a feature fusion unit to be trained, and a compiler to be trained.
  • the style model to be trained is a GAN (Generative Adversarial Network) model based on the starganv2 structure. This model is primarily set up to generate batches of unpaired data, i.e., second style images.
  • the input of the target model to be trained is two images, one image is the image collected in the real environment, that is, the second original image; the other image is an image of a certain style generated based on the image generation model.
  • the style types of the multiple second style images may be the same or different.
  • the style type corresponding to the second style image may be used as the style type to be selected.
  • the target model to be used is a model obtained through training based on the second style image and the second original image.
  • the target model to be trained may be trained based on the second original image and the second style image as follows: the second style images and second original images are combined to obtain multiple second training samples, where each second training sample includes one second original image and one second style image. For each second training sample, the content splicing features of the second original image are obtained based on the content feature extractor to be trained, and the style splicing features of the second style image are obtained based on the style feature extractor to be trained; the content splicing features and the style splicing features are fused based on the feature fusion model to be trained to obtain fusion features, and the fusion features are input into the compiler to be trained to obtain an actual output image. The actual output image and the second style image are input into the target discriminator to determine a first loss value, and into the target style comparer to determine a style loss value; based on the first loss value and the style loss value, the model parameters in the style processing model to be trained are corrected.
  • the second style image and the second original image may be randomly combined to obtain a plurality of second training samples.
  • Each training sample includes a second style image and a second original image.
  • the processing method for each second training sample is the same, and the processing of one of the training samples is taken as an example for introduction.
  • For example, the second original image and the second style image are input into the target model to be trained.
  • the content mosaic feature and style mosaic feature are fused to obtain the fusion feature.
  • Compile and process the fusion features based on the compiler to be trained to obtain the actual output image.
  • the ideal actual output image should include the image content of the second original image and the style features of the second style image.
  • because the model parameters in the style processing model to be trained are initially default values, there are some differences between the obtained actual output image and the ideal image.
  • therefore, the actual output image is processed by the target discriminator and the target style comparer in the target model to be trained. The actual output image and its corresponding second style image are input into the target discriminator to obtain the first loss value; at the same time, the actual output image and its corresponding second style image are input into the target style comparer to determine the style loss value. Based on the first loss value and the style loss value, the model parameters in the style processing model to be trained can be corrected. With convergence of the loss function in the style processing model to be trained as the training target, the target model to be used is obtained.
  • the second original image is A
  • the second style image is B.
  • the image content of original image A is obtained based on the content feature extractor to be trained, and at the same time the style features of style image B are obtained based on the style feature extractor to be trained; the image content and the style features are spliced based on the feature fusion model to be trained to obtain the fusion features corresponding to actual output image C, and the fusion features are compiled based on the compiler to be trained to obtain actual output image C.
  • the model parameters in the style processing model to be trained in the target model to be trained are corrected until the loss function in the style processing model to be trained converges, and the target model to be used is obtained.
  • the target discriminator and the target style comparer in the target model to be used can be removed, that is, only the trained style processing model is retained, so as to obtain the style model to be used.
  • an original image and an image of a preferred style type can be input into the style model to be used, to obtain an image consistent with that style type; at this point, the content of the obtained image is consistent with the content of the original image.
  • S350 Determine a reference image of the target style type, and determine the first style conversion model based on the reference image and the style model to be used.
  • the target style type is the style type finally selected according to the preference of the user.
  • the reference image is an image consistent with the target style type.
  • the reference image can be bound with the style model to be used, so that after an original image is input, the style features of the reference image are extracted based on the style model to be used, and the image content of the first original image is fused with those style features to obtain an image matching the target style type.
  • the model after binding the reference image and the style model to be used is used as the first style conversion model.
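The binding step can be sketched as below: the reference image's style code is computed once at construction time and stored, so inference needs only the original image. The extractor and the fusion step here are placeholder assumptions, not the disclosure's networks.

```python
import numpy as np

rng = np.random.default_rng(2)
W_STYLE = rng.standard_normal((48, 8))  # stand-in style feature extractor

class FirstStyleConversionModel:
    """Binds one reference image's style code at construction, so that
    conversion afterwards takes only an original image."""
    def __init__(self, reference_image):
        self.style_code = reference_image.reshape(-1) @ W_STYLE
    def convert(self, original):
        # Placeholder fusion; a real model decodes content + style code.
        return np.tanh(original + self.style_code.mean())

model = FirstStyleConversionModel(rng.standard_normal((4, 4, 3)))
styled = model.convert(rng.standard_normal((4, 4, 3)))
```

Precomputing the style code is what makes the bound model behave as a single-input converter for one fixed target style.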
  • because the model structure of the first style conversion model is relatively complex, a mobile terminal device on which it is deployed may have insufficient computing power. Based on this, the first style conversion model can be deployed on the server, so that the server performs the style conversion processing on the image.
  • the technical solutions of the embodiments of the present disclosure can generate second style images of various styles based on the target image generation model, train the target model to be trained based on the second style images and the original images to obtain the target model to be used, and package the target model to be used with a pre-selected image of a certain style type to obtain the first style conversion model.
  • the first style conversion model can convert an input original image into a target special effect image consistent with the packaged style type, so that various collected images can be processed, improving the convenience of sample acquisition and image content processing.
  • Fig. 5 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure.
  • on the basis of the foregoing embodiments, the first training sample can be constructed based on the first style conversion model, and the style conversion model to be trained can then be trained based on the first training sample to obtain the target style conversion model.
  • for an example implementation, refer to the detailed description of this technical solution; technical terms that are the same as or correspond to those in the foregoing embodiments are not repeated here.
  • the method includes:
  • the first style conversion model cannot be directly deployed on terminal devices because of its high computing-power requirements. Based on this, corresponding training samples can be constructed based on the first style conversion model, and the target style conversion model trained on them can be deployed on the terminal device.
  • the first original image may be an image collected by a camera device in a real environment, or an image generated based on a certain image generation model. In order to improve the accuracy of model training, as many first original images as possible can be obtained.
  • the first original image may or may not have a corresponding style.
  • Brightness changes may be performed on the first original image, for example, brightness adjustment may be performed on the entire image, or only on the face region in the original image. Applying a random brightness correction makes the lighting conditions in the training samples more varied.
  • For example, the face pixels in the image can be extracted and their brightness increased.
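The face-only brightening step above can be sketched as follows. This is a minimal illustration, assuming the face region is supplied as a boolean mask from an external face-segmentation step; the patent does not specify the adjustment formula:

```python
import numpy as np

def brighten_face(image, face_mask, factor_range=(1.0, 1.3), rng=None):
    """Randomly brighten only the pixels covered by a face mask.

    image: H x W x 3 uint8 array; face_mask: H x W boolean array.
    The gain is drawn uniformly from factor_range (assumed form).
    """
    rng = np.random.default_rng(rng)
    factor = rng.uniform(*factor_range)   # random brightness gain
    out = image.astype(np.float32)
    out[face_mask] *= factor              # scale face pixels only
    return np.clip(out, 0, 255).astype(np.uint8)
```

Background pixels are left untouched, so only the lighting of the face region varies across augmented samples.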
  • the style type of the first style image is consistent with the target style type.
  • the first style image is generated based on the pre-trained first style transfer model.
  • The style type corresponding to the first style conversion model is the pre-bound image style type. That is, according to the user's preference among multiple style images, one of the image style types can be selected and bound to the pre-trained target model, so as to obtain the first style conversion model.
  • the first style conversion model can be deployed on the server, so that after receiving the image to be processed, the bound image and the image to be processed can be processed based on the first style conversion model to obtain the target special effect image of the target style type, and display on the client side.
  • the calculation amount of this model is very large, and it is not suitable for deployment on terminal devices. Therefore, corresponding training samples can be obtained based on the first style conversion model, and then the above-mentioned target style conversion model can be obtained by training based on the corresponding training samples.
  • the collected first original image is processed based on the pre-trained first style conversion model to obtain the first style image consistent with the style type of the first style conversion model. Based on this method, multiple training samples are obtained.
  • Obtaining the corresponding first style image based on the first original image may include: performing content extraction on the current first original image based on the content feature extractor to obtain image content features; performing style extraction, based on the style feature extractor, on a preset reference style image consistent with the target style type to obtain image style features; fusing the image content features and the image style features based on the feature fuser to obtain features to be compiled; and obtaining the first style image corresponding to the current first original image by processing the features to be compiled with the compiler.
  • the content of the first original image is obtained based on the content feature extractor
  • the style feature of the reference image is obtained based on the style feature extractor.
  • the image content and style features are fused to obtain the fused features.
  • the first original image may be subjected to style conversion processing to obtain a first style image of the first original image under the target style type. Based on the first original image and the corresponding first style image, a first training sample is determined.
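The extractor–fuser–compiler pipeline described above can be sketched as a simple composition of four stages. The functions below are toy stand-ins (channel statistics rather than learned networks) that illustrate only the data flow the patent describes, not its actual models:

```python
import numpy as np

def extract_content(image):
    # stand-in content feature extractor: per-channel mean
    return image.mean(axis=(0, 1))

def extract_style(reference):
    # stand-in style feature extractor: per-channel standard deviation
    return reference.std(axis=(0, 1))

def fuse(content_features, style_features):
    # stand-in feature fuser: concatenate the two feature vectors
    return np.concatenate([content_features, style_features])

def compile_features(fused, out_shape):
    # stand-in "compiler": decode the fused features back to image shape
    return np.broadcast_to(fused[:out_shape[-1]], out_shape).copy()

def first_style_image(original, reference):
    """Produce a first style image for one first original image."""
    fused = fuse(extract_content(original), extract_style(reference))
    return compile_features(fused, original.shape)
```

Running every collected first original image through `first_style_image` against one fixed reference image yields the paired samples used for training.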
  • the model parameters in the style transfer model to be trained are default values.
  • the object attribute may be the gender attribute of the target subject in the original image.
  • the style transfer model to be trained may be trained based on a plurality of first training samples to obtain a target style transfer model.
  • The processing manner for each first training sample is the same, so the processing of one training sample is used as an example for introduction.
  • the first training sample includes a corresponding original image and a first style image corresponding to the first original image.
  • The gender attribute of the target object in the first original image can be determined based on a corresponding algorithm, and the gender attribute and the first original image are input into the style conversion model to be trained.
  • The first style image corresponding to the first original image is used as the expected output of the style conversion model to be trained, and the style conversion model to be trained is trained to obtain the target style conversion model.
  • the size of the image needs to be scaled to a certain size, so as to improve the efficiency of the model for image processing.
  • the label whose gender attribute is female is set to 0
  • the label whose gender attribute is male is set to 1
  • Four input channels are constructed to train the target style conversion model, so that images with different gender attributes can be processed.
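One plausible reading of the four-channel construction (the patent does not spell out the layout) is the three RGB channels plus a constant plane carrying the 0/1 gender label described above:

```python
import numpy as np

GENDER_LABELS = {"female": 0.0, "male": 1.0}  # labels as set above

def build_model_input(rgb_image, gender):
    """Stack a constant gender-label plane onto an H x W x 3 image,
    giving the assumed four-channel input of the style conversion model."""
    height, width, _ = rgb_image.shape
    label_plane = np.full((height, width, 1), GENDER_LABELS[gender],
                          dtype=np.float32)
    return np.concatenate([rgb_image.astype(np.float32), label_plane], axis=2)
```

Encoding the attribute as an extra plane lets a single convolutional model condition on gender without any architectural change.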
  • the technical solution of the embodiment of the present disclosure can train the target style conversion model deployed on the terminal device, so that when the image to be processed is collected, the style type conversion can be performed on the image based on the client, which improves the convenience of image processing.
  • FIG. 6 is a schematic structural diagram of an image processing device provided by an embodiment of the present disclosure.
  • the device includes: an image acquisition module 510 to be processed, a special effect image determination module 520 and an image display module 530 .
  • the to-be-processed image acquisition module 510 is set to acquire the to-be-processed image including the target subject;
  • the special effect image determination module 520 is set to input the subject attribute of the to-be-processed image and the target subject into the target style conversion model , to obtain the target special effect image converted from the target subject into the target style type;
  • the image display module 530 is configured to display the target special effect image in the image display area.
  • the device includes:
  • The first training sample acquisition module is configured to acquire a plurality of first training samples; wherein the first training samples include a first original image and a first style image consistent with the target style type, and the first style image is generated by the first style conversion model;
  • The first training module is configured to, for each first training sample, use the first original image in the current first training sample and the object attribute of the object to be processed as the input of the style conversion model to be trained, and use the first style image in the current first training sample as the output of the style model to be trained, so as to obtain the target style conversion model through training; wherein the object attribute matches the subject attribute.
  • the first training sample acquisition module includes:
  • a first original image acquisition unit configured to acquire at least one first original image
  • the first style image acquisition unit is configured to stylize the current first original image based on the first style conversion model for the first original image, to obtain a first style image consistent with the target style type;
  • the first training sample acquisition unit is configured to determine the plurality of first training samples based on the first original image and the corresponding first style image.
  • the first style conversion model includes a style feature extractor, a content feature extractor, a feature fusion device and a compiler, and the first style image acquisition unit is set to:
  • the device includes:
  • the first model construction unit is configured to construct a target model to be trained including a style processing model to be trained, a target discriminator, and a target style comparer; wherein, the target discriminator and the target style comparer are pre-trained
  • The model determination unit to be used is configured to train the target model to be trained according to at least one second style image of at least one style type to be selected and at least one second original image, to obtain the target model to be used; wherein the second style image is determined based on the target image generation model. The to-be-used style model determining unit is configured to take the style processing model to be trained, as trained in the target model to be used, as the style model to be used;
  • The first style model determination unit is configured to determine a reference image corresponding to the target style type, and determine the first style conversion model based on the reference image and the style model to be used; wherein the target style type is one of the at least one style type to be selected.
  • The style processing model to be trained includes: a style feature extractor to be trained, a content feature extractor to be trained, a feature fusion device to be trained, and a compiler to be trained, and the model determination unit to be used is set to:
  • A plurality of second training samples are obtained by randomly combining the second style images and the second original images; wherein each second training sample includes a second original image and a second style image. For each second training sample, the content splicing features of the second original image are obtained based on the content feature extractor to be trained, and the style splicing features of the second style image are obtained based on the style feature extractor to be trained.
  • The feature fusion model to be trained fuses the content splicing features and the style splicing features to obtain fusion features, and the fusion features are input into the compiler to be trained to obtain an actual output image. The actual output image and the second style image are input into the target discriminator to determine a first loss value, and into the target style comparer to determine a style loss value. Based on the first loss value and the style loss value, the model parameters in the style processing model to be trained are corrected.
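How the first loss value and the style loss value might be combined to correct the style processing model can be sketched as follows. The particular loss forms and the weighting are assumptions for illustration, not taken from the patent:

```python
import math

def first_loss(fake_score):
    # adversarial loss from the target discriminator's score on the
    # actual output image (assumed non-saturating form)
    return -math.log(max(fake_score, 1e-8))

def style_loss(actual_stats, target_stats):
    # distance between style statistics from the target style comparer
    return sum((a - t) ** 2
               for a, t in zip(actual_stats, target_stats)) / len(actual_stats)

def combined_loss(fake_score, actual_stats, target_stats, style_weight=1.0):
    """Total loss used to correct the style processing model parameters."""
    return first_loss(fake_score) + style_weight * style_loss(
        actual_stats, target_stats)
```

A gradient step on `combined_loss` pushes the output both toward fooling the discriminator and toward matching the reference style statistics.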
  • The first style model determining unit is configured to determine a target style type from the at least one style type to be selected, obtain a reference image consistent with the target style type, and package the reference image with the style model to be used to obtain the first style conversion model.
  • the device also includes:
  • the third style image acquiring unit is configured to acquire a third original image in an image to be used under the style type to be selected, and cut the image to be used to obtain a third style image;
  • the third output image acquisition unit is configured to input Gaussian noise into the image generation model to be trained to obtain a third output image
  • A model parameter correction unit configured to process the third output image and the third style image based on the first discriminator to determine a loss value, and correct the model parameters in the image generation model to be trained based on the loss value;
  • the target image generation model determination unit is configured to use the convergence of the loss function in the image generation model to be trained as the training target to obtain the target image generation model.
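A minimal sketch of the adversarial loop described above, with a one-parameter toy generator standing in for the image generation model to be trained and a statistics-matching stand-in for the first discriminator (all functional forms here are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
w_gen = 0.0  # single parameter of the toy "image generation model to be trained"

def generator(noise, w):
    # maps Gaussian noise to a "third output image" (here a 1-D sample)
    return w * noise

def discriminator_loss(sample, target_mean=2.0):
    # stand-in first discriminator: distance from the real-style statistics
    return (sample.mean() - target_mean) ** 2

for _ in range(200):
    noise = rng.normal(1.0, 0.1, size=8)  # Gaussian noise input
    loss = discriminator_loss(generator(noise, w_gen))
    # finite-difference correction of the model parameter based on the loss
    grad = (discriminator_loss(generator(noise, w_gen + 1e-4)) - loss) / 1e-4
    w_gen -= 0.05 * grad
# training target reached: the loss converges as the generator's output
# statistics approach those of the real style images
```

In the patent's setting the same convergence criterion applies, with the discriminator loss driving the generator toward the distribution of the third style images.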
  • the device includes:
  • the second style image acquisition unit is configured to process Gaussian noise based on the target image generation model to obtain a second style image
  • the second style image updating unit is configured to add expressions to the second style image based on the expression editing model generated through pre-training, and update the second style image.
  • the target style type includes Japanese style, Korean style, ancient costume style, comic style or multiple preset style types to be selected.
  • The subject attribute of the target subject in the image to be processed can be determined, and the subject attribute and the image to be processed can be used as the input parameters of the pre-trained target style conversion model to obtain a target special effect image in which the target subject in the image to be processed is converted into the target style type. The image to be processed can thus be converted into the style type, improving the user experience.
  • the image processing device provided by the embodiment of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • The terminal equipment in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (such as car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 7 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.
  • An electronic device 500 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 501, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503.
  • The RAM 503 also stores various programs and data necessary for the operation of the electronic device 500.
  • the processing device 501, ROM 502, and RAM 503 are connected to each other through a bus 504.
  • An input/output (I/O) interface 505 is also connected to the bus 504.
  • The following devices can be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 508 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 509.
  • the communication means 509 may allow the electronic device 500 to perform wireless or wired communication with other devices to exchange data. While FIG. 7 shows electronic device 500 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer readable medium, where the computer program includes program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from a network via communication means 509, or from storage means 508, or from ROM 502.
  • When the computer program is executed by the processing device 501, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
  • The electronic device provided by the embodiment of the present disclosure belongs to the same inventive concept as the image processing method provided by the above embodiments; technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
  • An embodiment of the present disclosure provides a computer storage medium, on which a computer program is stored, and when the program is executed by a processor, the image processing method provided in the foregoing embodiments is implemented.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can transmit, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device .
  • Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • The client and the server can communicate using any currently known or future-developed network protocol, such as the Hypertext Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LANs"), wide area networks ("WANs"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: acquires an image to be processed including a target subject; inputs the image to be processed and the subject attribute of the target subject into a target style conversion model to obtain a target special effect image in which the target subject is converted into a target style type; and
  • the target special effect image is displayed in the image display area.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
  • For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logic Devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • Example 1 provides an image processing method, the method including:
  • an image to be processed including a target subject is acquired;
  • the image to be processed and a subject attribute of the target subject are input into a target style conversion model to obtain a target special effect image in which the target subject is converted into a target style type;
  • the target special effect image is displayed in the image display area.
  • Example 2 provides an image processing method, the method further includes:
  • wherein the first training samples include a first original image and a first style image consistent with the target style type; the first style image is generated by the first style conversion model;
  • the first original image in the current first training sample and the object attribute of the object to be processed are used as the input of the style conversion model to be trained, and the first style image in the current first training sample is used as the output of the style model to be trained, so as to train and obtain the target style conversion model;
  • the object attribute matches the subject attribute.
  • Example 3 provides an image processing method, wherein,
  • the acquisition of multiple first training samples includes:
  • For the first original image, the current first original image is stylized based on the first style conversion model to obtain a first style image consistent with the target style type;
  • the plurality of first training samples is determined based on the first original image and the corresponding first style image.
  • Example 4 provides an image processing method, wherein the first style conversion model includes a style feature extractor, a content feature extractor, a feature fusion unit, and a compiler, and the stylization processing of the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type includes:
  • Example 5 provides an image processing method, the method further includes:
  • constructing a target model to be trained including a style processing model to be trained, a target discriminator, and a target style comparer; wherein the target discriminator and the target style comparer are pre-trained;
  • the constructed target model to be trained is trained according to at least one second style image of at least one style type to be selected and at least one second original image, to obtain the target model to be used; wherein the second style image is determined based on the target image generation model;
  • the style processing model to be trained, as trained in the target model to be used, is taken as the style model to be used;
  • the target style type is one of the at least one style type to be selected.
  • Example 6 provides an image processing method, wherein the style processing model to be trained includes: a style feature extractor to be trained, a content feature extractor to be trained, a feature fusion device to be trained, and a compiler to be trained; and the training of the target model to be trained according to the plurality of second style images of at least one style type to be selected and the at least one second original image, to obtain the target model to be used, includes:
  • a plurality of second training samples are obtained by randomly combining the second style image and the second original image; wherein, the second training samples include a second original image and a second style image;
  • the content splicing features of the second original image are obtained based on the content feature extractor to be trained, and the style splicing features of the second style image are obtained based on the style feature extractor to be trained;
  • the feature fusion model to be trained fuses the content splicing features and the style splicing features to obtain fusion features, and the fusion features are input into the compiler to be trained to obtain the actual output image;
  • Example 7 provides an image processing method, wherein the determining a reference image corresponding to the target style type, and determining the first style conversion model based on the reference image and the style model to be used, includes:
  • Example 8 provides an image processing method, the method further includes:
  • Gaussian noise is input into the image generation model to be trained to obtain a third output image;
  • Example 9 provides an image processing method, the method further includes:
  • Example 10 provides an image processing method, wherein the target style type includes a Japanese style, a Korean style, an ancient costume style, a comic style, or multiple preset style types to be selected.
  • Example Eleven provides an image processing device, which includes:
  • the image acquisition module to be processed is configured to acquire the image to be processed including the target subject
  • the special effect image determination module is configured to input the image to be processed and the subject attributes of the target subject into the target style conversion model to obtain a target special effect image that converts the target subject into a target style type;
  • the image display module is configured to display the target special effect image in the image display area.
  • the technical solution of the embodiment of the present disclosure realizes the conversion of the image to be processed into an image with the target theme style, which improves the richness of the image display content, the novelty of the theme style, and the adaptability between the theme style and the user.

Abstract

An image processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring an image to be processed comprising a target subject (S110); inputting the image to be processed and a subject attribute of the target subject into a target style conversion model to obtain a target special effect image in which the target subject has been converted into a target style type (S120); and displaying the target special effect image in an image display area (S130).

Description

Image processing method, apparatus, electronic device and storage medium
This application claims priority to the Chinese patent application with application number 202111641158.3 filed with the China Patent Office on December 29, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to the technical field of image processing, for example, to an image processing method, apparatus, electronic device, and storage medium.
Background Art
Image style transfer can be understood as rendering an image into an image with a specific artistic style. In the related art, image style transfer is mostly implemented by texture synthesis; alternatively, a style transfer model is trained so that an image is converted into a certain style based on the model.
However, training a style transfer model requires a large amount of style data, which is difficult to collect in practice, and a model trained under such conditions cannot achieve a good style transfer effect. At the same time, the style transfer model in the related art cannot process images with different subject attributes, so the resulting style transfer images are of poor quality, which in turn degrades the user experience.
发明内容Contents of the invention
本公开提供一种图像处理方法、装置、电子设备及存储介质,以得到目标风格类型的特效图像,提高图像内容显示丰富性。The present disclosure provides an image processing method, device, electronic equipment, and storage medium, so as to obtain a special effect image of a target style type and improve the display richness of image content.
第一方面,本公开实施例提供了一种图像处理方法,该方法包括:In a first aspect, an embodiment of the present disclosure provides an image processing method, the method including:
获取包括目标主体的待处理图像;Obtain an image to be processed including a target subject;
将所述待处理图像和所述目标主体的主体属性,输入至目标风格转换模型中,得到将目标主体转换为目标风格类型的目标特效图像;Inputting the image to be processed and the subject attributes of the target subject into the target style conversion model to obtain a target special effect image that converts the target subject into a target style type;
将所述目标特效图像于图像展示区域中显示。The target special effect image is displayed in the image display area.
第二方面,本公开实施例还提供了一种图像处理装置,该装置包括:In a second aspect, an embodiment of the present disclosure further provides an image processing device, which includes:
待处理图像采集模块,设置为获取包括目标主体的待处理图像;The image acquisition module to be processed is configured to acquire the image to be processed including the target subject;
特效图像确定模块,设置为将所述待处理图像和所述目标主体的主体属性,输入至目标风格转换模型中,得到将目标主体转换为目标风格类型的目标特效图像;The special effect image determination module is configured to input the image to be processed and the subject attributes of the target subject into the target style conversion model to obtain a target special effect image that converts the target subject into a target style type;
图像显示模块,设置为将所述目标特效图像于图像展示区域中显示。The image display module is configured to display the target special effect image in the image display area.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processors; and
a storage apparatus configured to store one or more programs,
where, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the image processing method according to any embodiment of the present disclosure.
Brief Description of the Drawings
Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a target model to be trained provided by an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
Detailed Description of Embodiments
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. Furthermore, method embodiments may include additional steps and/or omit steps that are shown. The scope of the present disclosure is not limited in this respect.
As used herein, the term "include" and its variants are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one further embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
It should be noted that terms such as "first" and "second" mentioned in the present disclosure are only used to distinguish different apparatuses, modules, or units, and are not intended to limit the order of, or interdependence between, the functions performed by these apparatuses, modules, or units.
It should be noted that the modifiers "a/an" and "a plurality of" in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand them as "one or more" unless the context clearly indicates otherwise.
The names of messages or information exchanged between multiple apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Before the technical solution is introduced, its application scenarios are described by way of example. The technical solution of the present disclosure can be applied to any image style conversion scenario, for example, converting a captured still image into an image of a certain theme style, where the theme style may be a Japanese style, a Korean style, or any theme style created by a designer. It can also be applied to special-effect video shooting, for example, converting one user in the captured picture, or all users in the entire picture, into a video of a certain theme style. The style type may match a makeup style, and the picture the user appears in may likewise be converted to match that makeup style.
In this embodiment, the user in each video frame to be processed may be displayed in the corresponding style theme, or the entire video frame may be converted to a certain theme style.
FIG. 1 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. This embodiment is applicable to converting an image into a target style type in any Internet-supported image display scenario. The method may be performed by an image processing apparatus, which may be implemented in software and/or hardware, for example, by an electronic device such as a mobile terminal, a PC, or a server. An image display scenario is usually implemented by a client and a server in cooperation; the method provided in this embodiment may be performed by the server, by the client, or by the client and the server in cooperation.
As shown in FIG. 1, the method includes:
S110. Acquire an image to be processed that includes a target subject.
The apparatus performing the image processing method provided by the embodiments of the present disclosure may be integrated into application software that supports an image processing function, and the software may be installed on an electronic device such as a mobile terminal or a PC. The application software may be any software for image/video processing; the specific applications are not enumerated here, as long as image/video processing can be realized. It may also be a specially developed application for adding and displaying special effects, or a page integrated on a PC through which the user performs special-effect processing.
The image to be processed may be an image captured by the application software. In practice, an image including the target subject may be captured in real time, and style-type conversion may then be applied to the image the target subject belongs to; for example, the video frame or image containing the target subject is processed into the style type corresponding to the target style conversion. A special effect may also be configured for that style type, and once the user is detected triggering the effect, all captured pictures are converted into the corresponding style type.
In an image or video shooting scenario, there may be multiple subjects in the captured picture; for example, in a crowded scene, all users appearing in the picture may be taken as target subjects. Alternatively, one or more users may be marked as the target subject before the effect is added; correspondingly, when an image to be processed is captured and determined to include the target subject, it is processed.
In other words, if a target video corresponding to the target style type needs to be generated, a style-theme conversion control may be triggered. Meanwhile, the image to be processed may be captured by a camera deployed on the terminal device. The image to be processed may or may not include the target subject; an image randomly obtained from a webpage may also serve as the image to be processed. The image to be processed can then be converted into a target special-effect image consistent with the target style type.
S120. Input the image to be processed and the subject attribute of the target subject into the target style conversion model to obtain a target special-effect image in which the target subject is converted to the target style type.
The image to be processed may be a randomly captured or downloaded image, or an image captured in real time. The target style conversion model is a pre-trained model that converts the image to be processed into the corresponding style type; it may be a GAN based on a residual (res) structure. The target special-effect image is the image produced by the target style conversion model: in it, either the target subject alone, or the entire image to be processed, is rendered in the target theme style.
It should also be noted that the image style output by the target style conversion model is consistent with the style type of the training samples used to train it. For example, if the style images in the training samples are of style A, the target style type is style A and the model outputs style-A images; if they are of style B, the model outputs style-B images. That is, the style type the model outputs matches the style type used during training.
The subject attribute may be a gender attribute or a style-type attribute of the target subject; for example, the gender attribute may be male or female, and the style-type attribute may be a preset style type. If the target style conversion model is to output images of one style type among several, the style types and gender types corresponding to different labels can be defined in the alpha channel. After the image to be processed is acquired, it can be processed into the image of the matching style type according to its gender attribute and/or style attribute. The reason for defining subject attributes is to avoid the related-art need to train a separate model for each gender, which consumes memory. Instead, after the image to be processed is acquired, the subject attribute of the target subject is identified and fed into the target style conversion model as alpha-channel information, so that a single model performs style conversion on image content of different gender attributes across different images to be processed. Likewise, in the related art, supporting multiple style types requires training a model per style type, so a single model cannot perform multiple style-type conversions.
It should also be noted that, to convert to a certain style type, multiple style types may be displayed on the interface for the user to choose from; the label information in the alpha channel is determined according to the user's selection, so a single target style conversion model yields the target special-effect image of the corresponding style type.
It should also be noted that, to achieve the above effect, the input of the model during training must include not only the images whose style type is to be converted but also the edited alpha-channel label values, so that a target special-effect conversion model capable of executing this technical solution is obtained.
For example, after the image to be processed is captured, the subject attribute of the target object in it may be determined by a subject-attribute recognition module deployed on the terminal. The subject attribute and the image to be processed are then input together into the target style conversion model to obtain a target special-effect image in which the target subject, or the entire image to be processed, is converted to the target style type.
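The attribute-as-alpha-channel mechanism described above can be sketched in a few lines. The attribute names and label values below are illustrative assumptions only; the publication does not specify a concrete encoding:

```python
import numpy as np

# Hypothetical label table: the publication defines labels per
# gender/style combination in the alpha channel but gives no concrete
# values, so these codes are made up for illustration.
ATTRIBUTE_LABELS = {
    ("female", "style_a"): 0.25,
    ("male",   "style_a"): 0.50,
    ("female", "style_b"): 0.75,
    ("male",   "style_b"): 1.00,
}

def attach_attribute_channel(rgb, gender, style):
    """Append the subject attribute as a constant alpha channel (H, W, 4)."""
    h, w, _ = rgb.shape
    label = ATTRIBUTE_LABELS[(gender, style)]
    alpha = np.full((h, w, 1), label, dtype=rgb.dtype)
    return np.concatenate([rgb, alpha], axis=-1)

rgb = np.zeros((384, 384, 3), dtype=np.float32)
rgba = attach_attribute_channel(rgb, "female", "style_b")
print(rgba.shape)     # (384, 384, 4)
```

A single 4-channel input of this shape lets one model branch on subject attributes without maintaining per-gender or per-style networks, which is the memory saving the passage describes.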
On the basis of the above technical solution, before the image to be processed is input into the target style conversion model, the method further includes: scaling the image to be processed to a fixed size, which may be 384*384 pixels. Correspondingly, during model training, the same processing may be applied to the obtained samples. When the technical solution is deployed on a terminal device, the image to be processed can be responded to quickly and the corresponding target special-effect image obtained, which improves image processing efficiency.
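A minimal, dependency-free sketch of the 384*384 scaling step. Nearest-neighbour sampling is used here only to keep the example self-contained; a real deployment would more likely use a bilinear resize from an image library:

```python
import numpy as np

def resize_nearest(img, size=(384, 384)):
    """Nearest-neighbour resize to the model's fixed input size.
    Illustrative stand-in for a library resize (e.g. bilinear)."""
    h, w = img.shape[:2]
    th, tw = size
    rows = np.arange(th) * h // th   # source row for each target row
    cols = np.arange(tw) * w // tw   # source column for each target column
    return img[rows][:, cols]

frame = np.random.rand(720, 1280, 3).astype(np.float32)  # e.g. a camera frame
model_input = resize_nearest(frame)
print(model_input.shape)  # (384, 384, 3)
```

Applying the identical resize at training and inference time, as the passage recommends, keeps the deployed model's input distribution consistent with what it saw during training.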
S130. Display the target special-effect image in the image display area.
The image display area is the area in which the target special-effect image is displayed.
For example, after the target special-effect image is obtained, it may be presented on the display interface.
Correspondingly, to make the display more engaging and easier to compare, the image display area may be divided into two regions, side by side or top and bottom. One region displays the target special-effect image and the other displays the originally captured image to be processed. When the user triggers the split-screen control or a split-screen instruction, the images are displayed in this manner.
It should also be noted that, to form a target video, special-effect processing may be applied to the images to be processed that are captured in real time, and the resulting target special-effect images are stitched together in sequence to obtain the target video.
In this embodiment, the target style type includes at least one of a Japanese style, a Korean style, an ancient-costume style, a comic style, or multiple preset candidate style types. The ancient-costume style may include the style of any dynasty.
In the technical solution of this embodiment, when an image to be processed including a target subject is acquired, the subject attribute of the target subject is determined; the subject attribute and the image to be processed serve as input parameters of the pre-trained target style conversion model, yielding a target special-effect image in which the target subject is converted to the target style type. The image to be processed thus undergoes style-type conversion, improving the user experience.
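The frame-by-frame stitching into a target video can be illustrated with a toy accumulator; a real implementation would hand the styled frames to a video encoder, which is outside the scope of this sketch:

```python
import numpy as np

class EffectVideoBuilder:
    """Collect per-frame special-effect images in capture order and track
    the resulting clip length. Illustrative stand-in for a video encoder."""

    def __init__(self, fps=30):
        self.fps = fps
        self.frames = []

    def append(self, styled_frame):
        # Each styled_frame is one target special-effect image.
        self.frames.append(styled_frame)

    def duration_seconds(self):
        return len(self.frames) / self.fps

builder = EffectVideoBuilder(fps=30)
for _ in range(90):                     # 90 styled frames at 30 fps
    builder.append(np.zeros((384, 384, 3), dtype=np.uint8))
print(builder.duration_seconds())      # 3.0
```

Appending frames in capture order preserves the temporal ordering the passage requires when the special-effect images are concatenated into the target video.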
FIG. 2 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure. On the basis of the foregoing embodiment, this embodiment first introduces how to train the target image generation model, so that second training samples can be determined based on it and the first style conversion model trained; on the basis of the first style conversion model, the target style conversion model can then be trained. For an example implementation, see the detailed description of this technical solution. Technical terms identical or corresponding to those in the above embodiments are not repeated here.
As shown in FIG. 2, the method includes:
S210. Acquire an image to be used of the third original image under a candidate style type, and crop the image to be used to obtain a third style image.
The third original image including facial information may be captured in a real environment, downloaded from a webpage, or randomly generated by a facial image generation model. The image to be used is the third original image after style processing; for example, it may be an image created by a designer to correspond to a certain style type. The third style image is an image in which the original image is hand-drawn in one or more style types. The style types may vary widely, and each of them serves as a candidate style type. The image obtained by converting the third original image into a certain style type is taken as the image to be used; that is, the image to be used is an image of a certain style type.
Images in the related art mostly contain only facial information, so the trained model can only style-process facial images, which lowers image realism. For this reason, the image to be used may be cropped. Cropping can be understood as aligning on the nose tip and the eye centers as reference points, or on the chin center and the eye centers as reference points, and adjusting the display proportion of the facial image so as to enlarge the cropping range, keeping the entire head (including hair) and face in the frame together with the corresponding background information.
It should be noted that a limited number of third original images and corresponding images to be used are acquired, and the image generation model to be trained is trained on these limited third original images and corresponding images to be used to obtain the target image generation model.
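One way to realize the enlarged crop described above is to derive a square box from the facial reference points (eye centers and chin center). The expansion factor below is a made-up illustration, not a value from the publication:

```python
import numpy as np

def head_crop_box(left_eye, right_eye, chin, expand=2.2):
    """Square crop box anchored on the eye centres and chin centre,
    expanded so the crop keeps the whole head plus background.
    The expand factor is an illustrative assumption."""
    eyes = (np.asarray(left_eye, float) + np.asarray(right_eye, float)) / 2.0
    chin = np.asarray(chin, dtype=float)
    face_h = np.linalg.norm(chin - eyes)      # eye-to-chin distance
    half = face_h * expand                    # half-width of the crop
    cx, cy = (eyes + chin) / 2.0              # crop centre between eyes and chin
    return (cx - half, cy - half, cx + half, cy + half)

box = head_crop_box(left_eye=(150, 200), right_eye=(250, 200), chin=(200, 330))
print(tuple(round(v, 1) for v in box))
```

Because the box is scaled from the eye-chin distance rather than a tight face detector output, the crop naturally includes hair and background, which is what lets the downstream model learn more than bare facial style.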
S220. Input Gaussian noise into the image generation model to be trained to obtain a third output image.
The image generation model to be trained may be a styleganv2 model, whose parameters are at their default values at this point. The third output image is an image of some style randomly generated by the model; that is, its style type is not fixed. The Gaussian noise is randomly sampled, and by processing it the model produces third output images of different style types.
In this embodiment, the image generation model to be trained is trained to obtain the target image generation model, which can generate sample data of different style types; a model capable of converting between different style types can then be trained on this sample data.
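Sampling a Gaussian latent vector and decoding it into an image can be sketched with a toy linear generator; this stands in for the styleganv2 network, which is in reality a deep convolutional model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a styleganv2-like generator: a fixed random linear map
# from a 512-d Gaussian latent to a small RGB image (illustrative only).
W = rng.standard_normal((512, 16 * 16 * 3)).astype(np.float32) * 0.02

def generate(z):
    img = np.tanh(z @ W)          # squash to [-1, 1], like typical GAN output
    return img.reshape(16, 16, 3)

z = rng.standard_normal(512).astype(np.float32)  # randomly sampled Gaussian noise
sample = generate(z)
print(sample.shape)
```

Each fresh draw of `z` yields a different output image, which is why repeatedly sampling noise through the (eventually trained) generator produces style samples in arbitrary quantity.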
S230. Process the third output image and the third style image with a first discriminator, determine a loss value, and correct the model parameters of the image generation model to be trained based on the loss value.
The inputs of the first discriminator are the third output image and the third style image. The first discriminator is configured to determine a loss value between the third output image and the third style image, according to which the model parameters of the image generation model to be trained are corrected.
S240. Take convergence of the loss function of the image generation model to be trained as the training objective to obtain the target image generation model.
The target image generation model is the image generation model obtained at the end of training. The above steps are repeated over multiple training samples until the loss function is detected to converge, and the image generation model obtained at that point is taken as the target image generation model.
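A standard adversarial objective consistent with this discriminator setup is the binary cross-entropy GAN loss; the discriminator scores below are illustrative numbers, not outputs of the actual model:

```python
import numpy as np

def bce(pred, target):
    """Binary cross-entropy; pred are probabilities in (0, 1)."""
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

# Hypothetical discriminator probabilities for the two kinds of input.
d_fake = np.array([0.3, 0.4])   # scores on third output images (generated)
d_real = np.array([0.8, 0.9])   # scores on third style images (hand-crafted)

d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)   # discriminator objective
g_loss = bce(d_fake, 1.0)                       # drives the generator update
print(round(d_loss, 3), round(g_loss, 3))
```

`g_loss` is the loss value fed back to correct the generator's parameters: it is large while the discriminator confidently rejects generated images and shrinks as the generator's outputs approach the style distribution.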
S250. Process Gaussian noise with the target image generation model to obtain the second style images.
For example, randomly sampled Gaussian noise can be processed by the target image generation model to obtain the second style images used to train the first style conversion model. The number of second style images can be as large as desired.
It should be noted that the style types of the second style images may be the same or different. Of course, so that the trained target model can generate images of different style types, the second style images should cover as many style types as possible.
S260. Add expressions to the second style images based on a pre-trained expression editing model, and update the second style images.
To enrich the samples for training the first style conversion model, the second style images may be further processed while the originals are retained. The expression editing model is a model that adds facial expressions to the target subject in a second style image. The facial expression may be an open mouth (the degree of opening may vary), a smile, a laugh, and so on; the specific expressions are not limited in this embodiment.
In other words, after the second style images are obtained, each may be input into the expression editing model, while the original is retained, to add expressions to the user in the image, yielding second style images with varied expression content. From the original second style images and the expression-edited second style images, training samples for the first style model are obtained.
With the technical solution of this embodiment, the target image generation model obtained by training processes Gaussian noise to obtain second style images of multiple style types; expressions are then added based on the expression editing model to obtain second style images with varied expression content. The first style conversion model is trained on the second style images, the training samples for the target style conversion model are determined based on the first style conversion model, and the target style conversion model is obtained from those samples. This avoids the related-art situation in which samples of uneven quality prevent effective style conversion of the original image; the image to be processed can be style-converted, improving the match between the converted image and the user and thereby the user experience.
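The retain-and-augment step can be sketched as follows, with a trivial pixel perturbation standing in for the learned expression editing model (the edited region and strength are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def edit_expression(style_img, strength=0.3):
    """Stand-in for the expression editing model: brighten a rough mouth
    region. The real model is a learned editor; region/strength are
    made-up illustration values."""
    out = style_img.copy()
    h, w = out.shape[:2]
    out[int(h * 0.65):int(h * 0.85), int(w * 0.3):int(w * 0.7)] += strength
    return np.clip(out, 0.0, 1.0)

base_samples = [rng.random((64, 64, 3)).astype(np.float32) for _ in range(4)]
# Originals are retained alongside the expression-edited variants.
augmented = base_samples + [edit_expression(img) for img in base_samples]
print(len(augmented))  # 8
```

Keeping the originals while appending the edited variants doubles the sample count without discarding any style information, which is exactly the enrichment the passage is after.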
FIG. 3 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure. On the basis of the foregoing embodiments, after the second style images are obtained from the target image generation model, the first style conversion model can be trained on the second style images and second original images captured in a real environment. For an example implementation, see the detailed description of this technical solution. Technical terms identical or corresponding to those in the above embodiments are not repeated here.
As shown in FIG. 3, the method includes:
S310. Determine at least one second style image.
Second style images of multiple style types can be determined based on the foregoing embodiments.
S320. Construct a target model to be trained that includes a style processing model to be trained, a target discriminator, and a target style comparator.
Before the first style conversion model is obtained, the target model to be trained must first be constructed and trained to obtain the target model to be used. The first style conversion model is then obtained by processing the target model.
It should also be noted that the target discriminator and the target style comparator are pre-trained models.
The structures of the target model to be trained and of the style processing model to be trained can be understood with reference to FIG. 4. As shown in FIG. 4, the target model to be trained includes the style processing model to be trained, the target discriminator, and the target style comparator. The output of the style processing model to be trained is the input of both the target discriminator and the target style comparator, and their outputs are used to correct the model parameters of the target model to be trained so as to obtain the target model to be used. Still referring to FIG. 4, the style processing model to be trained includes: a style feature extractor to be trained, a content feature extractor to be trained, a feature fusion unit to be trained, and a compiler to be trained. The style model to be trained is a GAN (Generative Adversarial Network) model based on the starganv2 structure. This model is mainly configured to generate unpaired data in batches, i.e., the second style images.
S330、根据至少一种待选择风格类型的多幅第二风格图像,以及至少一幅第二原始图像,对构建的目标待训练模型进行训练,得到目标待使用模型。S330. Train the constructed target model to be trained according to multiple second style images of at least one style type to be selected and at least one second original image to obtain a target model to be used.
需要说明的是,目标待训练模型的输入为两幅图像,一幅图像为真实环境中采集的图像,即第二原始图像;另一幅图像为基于图像生成模型生成的某种风格的图像。多幅第二风格图像的风格类型可以相同可以不同。可以将第二风格图像对应的风格类型作为待选择风格类型。目标待使用模型为基于第二风格图像和第二原始图像,训练得到的模型。It should be noted that the input of the target model to be trained is two images, one image is the image collected in the real environment, that is, the second original image; the other image is an image of a certain style generated based on the image generation model. The style types of the multiple second style images may be the same or different. The style type corresponding to the second style image may be used as the style type to be selected. The target model to be used is a model obtained through training based on the second style image and the second original image.
In this embodiment, the target model to be trained may be trained on the second original images and the second style images as follows: a plurality of second training samples are obtained by combining the second style images and the second original images, where each second training sample includes one second original image and one second style image. For each second training sample, the content splicing features of the second original image are obtained by the content feature extractor to be trained, and the style splicing features of the second style image are obtained by the style feature extractor to be trained; the content splicing features and the style splicing features are fused by the feature fusion model to be trained to obtain fused features, and the fused features are input into the compiler to be trained to obtain an actual output image. The actual output image and the second style image are input into the target discriminator to determine a first loss value, and into the target style comparator to determine a style loss value. Based on the first loss value and the style loss value, the model parameters of the style processing model to be trained within the target model to be trained are corrected, and convergence of the loss function of the style processing model to be trained is taken as the training objective, so as to obtain the target model to be used.
The second style images and the second original images may be randomly combined to obtain a plurality of second training samples, each of which includes one second style image and one second original image. Since every second training sample is processed in the same way, the processing of one training sample is described as an example.
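The random combination step above can be sketched as follows. The seed-based sampling is an illustrative choice for reproducibility; the embodiment only requires that each sample pair one second original image with one second style image.

```python
import random

def build_second_training_samples(second_originals, second_style_images,
                                  n_samples, seed=0):
    """Randomly pair second original images with second style images
    to form second training samples (original, style) tuples."""
    rng = random.Random(seed)
    return [(rng.choice(second_originals), rng.choice(second_style_images))
            for _ in range(n_samples)]

samples = build_second_training_samples(["orig_a", "orig_b"],
                                        ["style_x", "style_y"], 4)
```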
For example, a second original image and a second style image are input into the target model to be trained. The content feature extractor to be trained obtains the image content of the second original image, i.e., the content splicing features; the style feature extractor to be trained obtains the style splicing features of the second style image. The feature fusion model to be trained fuses the content splicing features and the style splicing features to obtain fused features, and the compiler to be trained processes the fused features to obtain an actual output image. Ideally, the actual output image should contain the image content of the second original image together with the style features of the second style image. However, because the model parameters of the style processing model to be trained are initialized to default values, the actual output image differs to some extent from the ideal result. At this point, the target discriminator and the target style comparator in the target model to be trained are used: inputting the actual output image and its corresponding second style image into the target discriminator yields the first loss value, and inputting the actual output image and its corresponding second style image into the target style comparator yields the style loss value.
Based on the first loss value and the style loss value, the model parameters of the style processing model to be trained can be corrected. Convergence of the loss function of the style processing model to be trained is taken as the training objective, and the target model to be used is obtained.
Exemplarily, let the second original image be A and the second style image be B. After A and B are input into the target model to be trained, the content feature extractor to be trained obtains the image content of original image A, and the style feature extractor to be trained obtains the style features of style image B. The feature fusion model to be trained splices the image content and the style features to obtain the fused features corresponding to an actual output image C, and the compiler to be trained processes the fused features to produce the actual output image C. Inputting C and B into the target style comparator yields the style loss value, and inputting C and B into the target discriminator yields the first loss value. Based on the first loss value and the style loss value, the model parameters of the style processing model to be trained in the target model to be trained are corrected until the loss function of the style processing model to be trained converges, and the target model to be used is obtained.
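How the two loss values could be combined before the parameter correction can be sketched numerically. The non-saturating GAN loss for the first loss value and the mean-squared style-feature distance for the style loss are common choices and are assumptions here, not the formulas of this embodiment; the `style_weight` balancing factor is likewise hypothetical.

```python
import math

def first_loss_value(real_score, fake_score):
    """Hypothetical discriminator-based loss (non-saturating GAN form)."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return -math.log(sigmoid(real_score)) - math.log(1.0 - sigmoid(fake_score))

def style_loss_value(output_features, style_features):
    """Hypothetical style comparator: mean squared distance between
    the style features of image C and those of image B."""
    return sum((o - s) ** 2
               for o, s in zip(output_features, style_features)) / len(style_features)

def total_loss(first_loss, style_loss, style_weight=1.0):
    """Combine both loss values into the quantity driving the correction."""
    return first_loss + style_weight * style_loss
```

Training would repeat this computation over the second training samples and update the parameters of the style processing model to be trained until `total_loss` stops decreasing, i.e., the loss function converges.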
S340. Take the trained style processing model in the target model to be used as the style model to be used.
For example, after the target model to be used is obtained, the target discriminator and the target style comparator may be removed from it, i.e., only the trained style processing model is retained, thereby obtaining the style model to be used. In practical applications, after the style model to be used has been trained, an original image and an image of a preferred style type can be input into it to obtain an image consistent with that style type, and the content of the obtained image is consistent with the content of the original image.
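The removal step above can be sketched as follows. Modelling the target model to be used as a dict of named components is an assumption for illustration; in a real framework the training-only sub-modules would simply be dropped from the module graph.

```python
# The discriminator and style comparator are needed only during training;
# for deployment, keep just the trained style processing model.
TRAINING_ONLY = {"target_discriminator", "target_style_comparator"}

def extract_style_model(target_model_to_be_used):
    """Return only the components forming the style model to be used."""
    return {name: part for name, part in target_model_to_be_used.items()
            if name not in TRAINING_ONLY}

full_model = {
    "style_extractor": "...",
    "content_extractor": "...",
    "feature_fuser": "...",
    "compiler": "...",
    "target_discriminator": "...",
    "target_style_comparator": "...",
}
style_model_to_be_used = extract_style_model(full_model)
```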
S350. Determine a reference image of the target style type, and determine the first style conversion model based on the reference image and the style model to be used.
The target style type is the style type finally selected according to the user's preference, and accordingly the reference image is an image consistent with the target style type. The reference image may be bound to the style model to be used, so that after an original image is input, the style features of the reference image are extracted by the style model to be used and fused with the image content of the first original image, producing an image consistent with the target style type. The model obtained by binding the reference image to the style model to be used serves as the first style conversion model.
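One way the binding could work is sketched below: the reference image's style features are computed once when the model is bound, so that at inference only the content path runs per input. This one-time precomputation is a design inference, not something the embodiment mandates, and all components are stand-in lambdas.

```python
def make_first_style_conversion_model(style_extractor, content_extractor,
                                      feature_fuser, compiler, reference_image):
    """Bind a reference image to the style model to be used, yielding a
    one-argument conversion function (the first style conversion model)."""
    bound_style = style_extractor(reference_image)  # computed once at binding time
    def convert(original_image):
        return compiler(feature_fuser(content_extractor(original_image), bound_style))
    return convert

model = make_first_style_conversion_model(
    style_extractor=lambda img: [p * 0.5 for p in img],   # stand-in
    content_extractor=lambda img: list(img),              # stand-in
    feature_fuser=lambda c, s: [a + b for a, b in zip(c, s)],
    compiler=lambda f: f,
    reference_image=[0.2, 0.4],
)
styled = model([0.1, 0.1])
```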
It should be noted that because the structure of the first style conversion model is relatively complex, deploying it on a mobile terminal device may result in insufficient computing power. For this reason, the first style conversion model may be deployed on the server side, so that the server performs the style conversion processing on images.
According to the technical solution of this embodiment of the present disclosure, second style images of multiple style types can be generated based on the target image generation model; the target model to be trained is trained on the second style images and the original images to obtain the target model to be used; and the target model to be used is packaged with a pre-selected image of a certain style type to obtain the first style conversion model. The first style conversion model can convert an input original image into a target special-effect image consistent with the packaged style type, so that various captured images can be processed, improving the convenience of sample acquisition and of image content processing.
Fig. 5 is a schematic flowchart of an image processing method provided by another embodiment of the present disclosure. On the basis of the foregoing embodiments, after the first style conversion model is obtained, first training samples can be constructed based on the first style conversion model, and the style conversion model to be trained is then trained on the first training samples to obtain the target style conversion model. For an example implementation, refer to the detailed description of this technical solution. Technical terms that are the same as or correspond to those in the foregoing embodiments are not repeated here.
As shown in Fig. 5, the method includes:
It should be noted that, as described above, the first style conversion model cannot be deployed directly on a terminal device because of its high computing-power requirements. Accordingly, corresponding training samples can be constructed based on the first style conversion model, and a target style conversion model that can be deployed on the terminal device is obtained through training.
S410. Acquire at least one first original image.
The first original image may be an image captured by a camera device in a real environment, or an image generated by some image generation model. To improve the accuracy of model training, as many first original images as possible may be acquired. A first original image may or may not have a corresponding style.
It should also be noted that various brightness changes may be applied to the first original image: for example, the brightness of the entire image may be adjusted, or only the face region of the original image may be changed in brightness. Such random brightness correction makes the lighting conditions seen by the trained network more varied. Meanwhile, to emphasize the style conversion effect on the face, the facial pixels in the image can be extracted and brightened.
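A minimal sketch of this augmentation on a flat list of pixel intensities follows. The gain range `[0.8, 1.2]` and the extra face-brightening factor `1.1` are illustrative assumptions; the embodiment only specifies random whole-image brightness changes plus optional brightening of facial pixels.

```python
import random

def random_brightness(pixels, face_mask=None, low=0.8, high=1.2, seed=None):
    """Scale all pixels by a random gain; if a face mask is given,
    additionally brighten the facial pixels. Values are clamped to [0, 1]."""
    gain = random.Random(seed).uniform(low, high)
    out = [min(1.0, p * gain) for p in pixels]
    if face_mask is not None:
        out = [min(1.0, v * 1.1) if is_face else v
               for v, is_face in zip(out, face_mask)]
    return out
```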
S420. For each first original image, stylize the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type.
The style type of the first style image is consistent with the target style type, and the first style image is generated by the pre-trained first style conversion model. The style type corresponding to the first style conversion model is determined by the pre-bound image: according to the user's preference among multiple style images, an image of one style type is selected and bound to the pre-trained target model, thereby obtaining the first style conversion model. The first style conversion model may be deployed on the server side, so that after an image to be processed is received, the bound image and the image to be processed are processed by the first style conversion model to obtain a target special-effect image of the target style type, which is then displayed on the client. However, the computational cost of this model is very high, making it unsuitable for deployment on terminal devices; therefore, training samples can be derived from the first style conversion model, and the above target style conversion model is then trained on those samples.
For example, the acquired first original images are processed by the pre-trained first style conversion model to obtain first style images consistent with the style type of the first style conversion model. In this way, multiple training samples are acquired.
In this embodiment, obtaining the first style image corresponding to a first original image may proceed as follows: the content feature extractor performs content extraction on the current first original image to obtain image content features; the style feature extractor performs style feature extraction on a preset reference style image consistent with the target style type to obtain image style features; the feature fuser fuses the image content features and the image style features to obtain features to be compiled; and the compiler processes the features to be compiled to obtain the first style image corresponding to the current first original image.
For example, with reference to Fig. 4, after the first original image is input into the first style conversion model, the content feature extractor obtains the content of the first original image, the style feature extractor obtains the style features of the reference image, and the feature fuser fuses the image content and the style features to obtain fused features. The fused features are input into the compiler to obtain an image of the first original image in the target style type.
S430. Determine the plurality of first training samples based on the first original images and the corresponding first style images.
For example, the first style conversion model can perform style conversion on a first original image to obtain the first style image of the first original image in the target style type. A first training sample is determined from the first original image and the corresponding first style image.
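This sample-construction step is, in effect, a teacher-labelling loop: the server-side first style conversion model stylizes each first original image, and the (original, stylized) pair becomes one training sample for the lightweight on-device model. The conversion function below is a stand-in for the real model.

```python
def build_first_training_samples(first_originals, first_style_conversion_model):
    """Pair each first original image with the first style image produced
    for it by the first style conversion model."""
    return [(image, first_style_conversion_model(image))
            for image in first_originals]

# Stand-in for the real (server-side) first style conversion model.
fake_converter = lambda image: [1.0 - p for p in image]
first_samples = build_first_training_samples([[0.25], [0.75]], fake_converter)
```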
S440. For each first training sample, take the first original image in the current first training sample and the object attributes of the object to be processed in that original image as the input of the style conversion model to be trained, take the first style image in the current first training sample as the output of the style model to be trained, and train to obtain the target style conversion model.
The model parameters of the style conversion model to be trained are default values. The object attribute may be the gender attribute of the target subject in the original image. The style conversion model to be trained can be trained on the plurality of first training samples to obtain the target style conversion model.
It should be noted that every first training sample is processed in the same way, so the processing of one training sample is described as an example.
For example, a first training sample includes a first original image and the first style image corresponding to it. Before the first original image is input into the style conversion model to be trained, the gender attribute of the target object in the first original image can be determined by a corresponding algorithm. The gender attribute and the first original image are input into the style conversion model to be trained, the first style image corresponding to the first original image is taken as the output of the style conversion model to be trained, and the model is trained to obtain the target style conversion model.
For example, before model training based on the first training samples, the images also need to be scaled to a certain size, so as to improve the efficiency of image processing by the model. Meanwhile, the label for the female gender attribute is set to 0 and the label for the male gender attribute is set to 1, and a four-channel input is constructed, so that a single target style conversion model is trained that can process images of different gender attributes.
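The preprocessing above can be sketched as follows: the colour channels are resized to a fixed size and a constant label plane (0 = female, 1 = male) is appended as the fourth channel. Representing each channel as a 1-D list, the nearest-neighbour resize, and the exact channel layout are illustrative assumptions.

```python
def resize_nearest(channel, size):
    """Nearest-neighbour resize of a 1-D channel to `size` samples."""
    n = len(channel)
    return [channel[int(i * n / size)] for i in range(size)]

def build_four_channel_input(rgb_channels, gender_label, size):
    """Resize the three colour channels and append a constant gender-label
    plane, yielding the four-channel model input."""
    assert gender_label in (0, 1)  # 0 = female, 1 = male
    channels = [resize_nearest(c, size) for c in rgb_channels]
    channels.append([float(gender_label)] * size)  # label plane as 4th channel
    return channels
```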
S450. Acquire an image to be processed that includes the target subject.
S460. Input the image to be processed and the subject attribute of the target subject into the target style conversion model to obtain a target special-effect image in which the target subject is converted into the target style type.
S470. Display the target special-effect image in the image display area.
According to the technical solution of this embodiment of the present disclosure, a target style conversion model deployed on a terminal device can be obtained through training, so that when an image to be processed is captured, its style type can be converted on the client side, improving the convenience of image processing.
Fig. 6 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present disclosure. The apparatus includes: a to-be-processed image acquisition module 510, a special-effect image determination module 520, and an image display module 530.
The to-be-processed image acquisition module 510 is configured to acquire an image to be processed that includes a target subject; the special-effect image determination module 520 is configured to input the image to be processed and the subject attribute of the target subject into a target style conversion model to obtain a target special-effect image in which the target subject is converted into a target style type; and the image display module 530 is configured to display the target special-effect image in an image display area.
On the basis of the above technical solution, the apparatus includes:
a first training sample acquisition module, configured to acquire a plurality of first training samples, where each first training sample includes a first original image and a first style image consistent with the target style type, the first style image being generated by a first style conversion model;
a first training module, configured to, for each first training sample, take the first original image in the current first training sample and the object attributes of the object to be processed as the input of a style conversion model to be trained, take the first style image in the current first training sample as the output of the style model to be trained, and train to obtain the target style conversion model, where the object attributes match the subject attribute.
On the basis of the above technical solution, the first training sample acquisition module includes:
a first original image acquisition unit, configured to acquire at least one first original image;
a first style image acquisition unit, configured to, for each first original image, stylize the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type;
a first training sample acquisition unit, configured to determine the plurality of first training samples based on the first original images and the corresponding first style images.
On the basis of the above technical solution, the first style conversion model includes a style feature extractor, a content feature extractor, a feature fuser, and a compiler, and the first style image acquisition unit is configured to:
perform content extraction on the current first original image by the content feature extractor to obtain image content features; perform style feature extraction, by the style feature extractor, on a preset reference style image consistent with the target style type to obtain image style features; fuse the image content features and the image style features by the feature fuser to obtain features to be compiled; and process the features to be compiled by the compiler to obtain the first style image corresponding to the current first original image.
On the basis of the above technical solution, the apparatus includes:
a first model construction unit, configured to construct a target model to be trained that includes a style processing model to be trained, a target discriminator, and a target style comparator, where the target discriminator and the target style comparator are pre-trained; a to-be-used model determination unit, configured to train the constructed target model to be trained according to a plurality of second style images of at least one style type to be selected and at least one second original image to obtain a target model to be used, where the second style images are determined based on a target image generation model; a to-be-used style model determination unit, configured to take the trained style processing model in the target model to be used as the style model to be used; and a first style model determination unit, configured to determine a reference image corresponding to the target style type and to determine the first style conversion model based on the reference image and the style model to be used, where the target style type is one of the at least one style type to be selected.
On the basis of the above technical solution, the style processing model to be trained includes: a style feature extractor to be trained, a content feature extractor to be trained, a feature fuser to be trained, and a compiler to be trained, and the to-be-used model determination unit is configured to:
obtain a plurality of second training samples by randomly combining the second style images and the second original images, where each second training sample includes one second original image and one second style image; for each second training sample, obtain the content splicing features of the second original image by the content feature extractor to be trained and the style splicing features of the second style image by the style feature extractor to be trained, fuse the content splicing features and the style splicing features by the feature fusion model to be trained to obtain fused features, and input the fused features into the compiler to be trained to obtain an actual output image; input the actual output image and the second style image into the target discriminator to determine a first loss value, and input the actual output image and the second style image into the target style comparator to determine a style loss value; and based on the first loss value and the style loss value, correct the model parameters of the style processing model to be trained, taking convergence of the loss function of the style processing model to be trained as the training objective, to obtain the target model to be used.
On the basis of the above technical solution, the first style model determination unit is configured to determine the target style type from the at least one style type to be selected, acquire a reference image consistent with the target style type, and package the reference image with the style model to be used to obtain the first style conversion model.
On the basis of the above technical solution, the apparatus further includes:
a third style image acquisition unit, configured to acquire an image to be used of a third original image in a style type to be selected, and crop the image to be used to obtain a third style image;
a third output image acquisition unit, configured to input Gaussian noise into an image generation model to be trained to obtain a third output image;
a model parameter correction unit, configured to process the third output image and the third style image by a first discriminator, determine a loss value, and correct the model parameters of the image generation model to be trained based on the loss value;
a target image generation model determination unit, configured to take convergence of the loss function of the image generation model to be trained as the training objective to obtain the target image generation model.
On the basis of the above technical solution, the apparatus includes:
a second style image acquisition unit, configured to process Gaussian noise based on the target image generation model to obtain a second style image;
a second style image updating unit, configured to add an expression to the second style image based on a pre-trained expression editing model, and update the second style image.
On the basis of the above technical solution, the target style type includes a Japanese style, a Korean style, an ancient-costume style, a comic style, or multiple preset style types to be selected.
According to the technical solution of this embodiment of the present disclosure, when an image to be processed that includes a target subject is acquired, the subject attribute of the target subject in the image to be processed can be determined, and the subject attribute and the image to be processed are taken as input parameters of the pre-trained target style conversion model to obtain a target special-effect image in which the target subject in the image to be processed is converted into the target style type. The image to be processed can thus undergo style type conversion, improving the user experience.
The image processing apparatus provided by this embodiment of the present disclosure can execute the image processing method provided by any embodiment of the present disclosure, and has functional modules and beneficial effects corresponding to the executed method.
It is worth noting that the units and modules included in the above apparatus are divided only according to functional logic, and the division is not limited thereto, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the embodiments of the present disclosure.
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure. Referring now to FIG. 7, a schematic structural diagram of an electronic device 500 (such as the terminal device or server in FIG. 7) suitable for implementing an embodiment of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), and vehicle-mounted terminals (such as vehicle navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 7, the electronic device 500 may include a processing device 501 (such as a central processing unit or a graphics processing unit), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device 500. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
Typically, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 507 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 508 including, for example, a magnetic tape and a hard disk; and a communication device 509. The communication device 509 may allow the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows the electronic device 500 having various devices, it should be understood that it is not required to implement or have all of the devices shown; more or fewer devices may alternatively be implemented or provided.
According to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 509, installed from the storage device 508, or installed from the ROM 502. When the computer program is executed by the processing device 501, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are performed.
The names of the messages or information exchanged between multiple devices in the embodiments of the present disclosure are used for illustrative purposes only and are not intended to limit the scope of these messages or information.
The electronic device provided by this embodiment of the present disclosure belongs to the same concept as the image processing method provided by the above embodiments. Technical details not described in detail in this embodiment can be found in the above embodiments, and this embodiment has the same beneficial effects as the above embodiments.
An embodiment of the present disclosure provides a computer storage medium on which a computer program is stored. When the program is executed by a processor, the image processing method provided by the above embodiments is implemented.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: an electrical wire, an optical cable, RF (radio frequency), or any suitable combination of the above.
In some embodiments, the client and the server can communicate using any currently known or future-developed network protocol, such as the HyperText Transfer Protocol (HTTP), and can be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (for example, the Internet), and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or may exist independently without being assembled into the electronic device.
The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device is caused to:
acquire an image to be processed that includes a target subject;
input the image to be processed and subject attributes of the target subject into a target style conversion model to obtain a target special-effect image in which the target subject is converted into a target style type; and
display the target special-effect image in an image display area.
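The three steps above (acquire, convert, display) can be sketched as a minimal pure-Python data flow. Every name here (`acquire_image`, `StyleConversionModel`, `display`) is an illustrative stand-in, not an API from the disclosure, and the "model" simply tags the image rather than running a neural network.

```python
def acquire_image():
    """Stand-in for acquiring an image to be processed (a tiny 2x2 'image')."""
    return {"pixels": [[0.1, 0.2], [0.3, 0.4]], "subject": "face"}

class StyleConversionModel:
    """Stand-in for the target style conversion model: it receives the image
    and the subject attributes and returns a stylized result."""
    def __init__(self, style_type):
        self.style_type = style_type

    def convert(self, image, subject_attributes):
        # A real model would run a trained network; here we only tag the image.
        return {
            "pixels": image["pixels"],
            "style": self.style_type,
            "subject_attributes": subject_attributes,
        }

def display(special_effect_image):
    """Stand-in for rendering the result in the image display area."""
    return f"showing {special_effect_image['style']} image"

image = acquire_image()                # step 1: acquire image with target subject
model = StyleConversionModel("comic")  # target style type, e.g. comic style
result = model.convert(image, {"age": "adult", "gender": "female"})  # step 2
print(display(result))                 # step 3: display in the image area
```

The subject-attribute dictionary is purely hypothetical; the disclosure does not enumerate which attributes are passed to the model.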
Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or may be implemented by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself; for example, the first acquisition unit may also be described as "a unit for acquiring at least two Internet Protocol addresses".
The functions described herein above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of machine-readable storage media would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
According to one or more embodiments of the present disclosure, [Example 1] provides an image processing method, the method including:
acquiring an image to be processed that includes a target subject;
inputting the image to be processed and subject attributes of the target subject into a target style conversion model to obtain a target special-effect image in which the target subject is converted into a target style type; and
displaying the target special-effect image in an image display area.
According to one or more embodiments of the present disclosure, [Example 2] provides an image processing method, the method further including:
acquiring a plurality of first training samples, where each first training sample includes a first original image and a first style image consistent with the target style type, and the first style image is generated by a first style conversion model; and
for each first training sample, using the first original image in the current first training sample and object attributes of the object to be processed as an input of a style conversion model to be trained, and using the first style image in the current first training sample as an output of the style conversion model to be trained, to train and obtain the target style conversion model;
where the object attributes match the subject attributes.
According to one or more embodiments of the present disclosure, [Example 3] provides an image processing method, where
the acquiring a plurality of first training samples includes:
acquiring at least one first original image;
for each first original image, stylizing the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type; and
determining the plurality of first training samples based on the first original images and the corresponding first style images.
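The sample-construction step above can be sketched in a few lines: each first original image is run through a stand-in first style conversion model, and every original is paired with its generated style image to form a training sample. The model function and the image names are hypothetical placeholders.

```python
def first_style_conversion_model(original_image, target_style="comic"):
    # Hypothetical stand-in: a real model would stylize the pixels of the
    # original image into the target style.
    return {"style": target_style, "source": original_image}

first_original_images = ["img_a", "img_b", "img_c"]

# Pair each original with its generated style image to build the samples.
first_training_samples = [
    {"original": img, "style_image": first_style_conversion_model(img)}
    for img in first_original_images
]
```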
According to one or more embodiments of the present disclosure, [Example 4] provides an image processing method, where the first style conversion model includes a style feature extractor, a content feature extractor, a feature fuser, and a compiler, and the stylizing the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type includes:
performing content extraction on the current first original image based on the content feature extractor to obtain image content features;
performing style feature extraction, based on the style feature extractor, on a preset reference style image consistent with the target style type to obtain image style features;
fusing the image content features and the image style features based on the feature fuser to obtain features to be compiled; and
processing the features to be compiled based on the compiler to obtain a first style image corresponding to the current first original image.
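The four-stage pipeline above (content extraction, style extraction, fusion, compilation) can be sketched as a pure-Python data flow. The "features" are plain lists, and every function is a toy stand-in for a learned network component; the element-wise operations are illustrative assumptions, not the actual learned transforms.

```python
def content_feature_extractor(original_image):
    # Extract content features from the image to be stylized (flatten).
    return [p for row in original_image for p in row]

def style_feature_extractor(reference_style_image):
    # Extract style features from the preset reference image of the target style.
    return [p * 0.5 for row in reference_style_image for p in row]

def feature_fuser(content_features, style_features):
    # Fuse content and style features into the features to be compiled
    # (element-wise sum as a placeholder for a learned fusion).
    return [c + s for c, s in zip(content_features, style_features)]

def compiler(fused_features, width=2):
    # "Compile" the fused features back into an image grid.
    return [fused_features[i:i + width]
            for i in range(0, len(fused_features), width)]

original = [[0.2, 0.4], [0.6, 0.8]]   # current first original image
reference = [[1.0, 1.0], [1.0, 1.0]]  # reference image of the target style

content = content_feature_extractor(original)
style = style_feature_extractor(reference)
first_style_image = compiler(feature_fuser(content, style))
```

Because the reference image is fixed per target style type, its style features could be extracted once and reused for every original image, which is consistent with packaging the reference image together with the model (Example 7).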
According to one or more embodiments of the present disclosure, [Example 5] provides an image processing method, the method further including:
constructing a target model to be trained that includes a style processing model to be trained, a target discriminator, and a target style comparator, where the target discriminator and the target style comparator are pre-trained;
training the constructed target model to be trained according to a plurality of second style images of at least one style type to be selected and at least one second original image, to obtain a target model to be used, where the second style images are determined based on a target image generation model;
using the trained style processing model in the target model to be used as a style model to be used; and
determining a reference image corresponding to the target style type, and determining the first style conversion model based on the reference image and the style model to be used, where the target style type is one of the at least one style type to be selected.
According to one or more embodiments of the present disclosure, [Example 6] provides an image processing method, where the style processing model to be trained includes a style feature extractor to be trained, a content feature extractor to be trained, a feature fuser to be trained, and a compiler to be trained, and the training the constructed target model to be trained according to a plurality of second style images of at least one style type to be selected and at least one second original image, to obtain a target model to be used, includes:
obtaining a plurality of second training samples by randomly combining the second style images and the second original images, where each second training sample includes one second original image and one second style image;
for each second training sample, acquiring content splicing features of the second original image based on the content feature extractor to be trained, acquiring style splicing features of the second style image based on the style feature extractor to be trained, fusing the content splicing features and the style splicing features based on the feature fuser to be trained to obtain fused features, and inputting the fused features into the compiler to be trained to obtain an actual output image;
inputting the actual output image and the second style image into the target discriminator to determine a first loss value, and inputting the actual output image and the second style image into the target style comparator to determine a style loss value; and
correcting model parameters in the style processing model to be trained based on the first loss value and the style loss value, and taking the convergence of the loss function in the style processing model to be trained as the training objective, to obtain the target model to be used.
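The loss combination described above can be sketched as follows: a pre-trained discriminator yields the first loss, a pre-trained style comparator yields the style loss, and their sum drives the parameter correction. The specific loss formulas, the equal weighting, and the update rule are placeholder assumptions, not the actual training procedure of the disclosure.

```python
def target_discriminator(actual_output, second_style_image):
    # First loss value: mean absolute gap to the reference style image
    # (a crude stand-in for a real adversarial loss).
    return sum(abs(a - b) for a, b in
               zip(actual_output, second_style_image)) / len(actual_output)

def target_style_comparator(actual_output, second_style_image):
    # Style loss value: compare coarse statistics (here: means) of both images.
    mean = lambda xs: sum(xs) / len(xs)
    return abs(mean(actual_output) - mean(second_style_image))

def training_step(params, actual_output, second_style_image, lr=0.1):
    first_loss = target_discriminator(actual_output, second_style_image)
    style_loss = target_style_comparator(actual_output, second_style_image)
    total = first_loss + style_loss
    # Placeholder parameter correction: nudge every parameter by the total loss.
    new_params = [p - lr * total for p in params]
    return new_params, total

params = [0.5, 0.5]          # model parameters of the style processing model
actual = [0.2, 0.6]          # actual output image (flattened)
reference = [0.4, 0.8]       # second style image (flattened)
params, loss = training_step(params, actual, reference)
```

In practice the loop would repeat over all second training samples until the loss function converges, at which point the trained style processing model is extracted as the style model to be used.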
According to one or more embodiments of the present disclosure, [Example 7] provides an image processing method, where the determining a reference image corresponding to the target style type, and determining the first style conversion model based on the reference image and the style model to be used, includes:
determining the target style type from at least one style type to be selected; and
acquiring a reference image consistent with the target style type, and packaging the reference image with the style model to be used to obtain the first style conversion model.
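One way to read the "packaging" step above is as binding the reference image into the style model once, up front, so that the resulting first style conversion model needs only the original image at call time. The sketch below models this with `functools.partial`; the file names and the style-type table are hypothetical.

```python
from functools import partial

def style_model_to_be_used(original_image, reference_image):
    # Stand-in for the trained style model: combines both inputs.
    return f"{original_image} in the style of {reference_image}"

style_types = {"comic": "comic_ref.png", "ancient": "ancient_ref.png"}
target_style_type = "comic"               # chosen from the candidate types
reference_image = style_types[target_style_type]

# Packaging: bind the reference image into the model once, up front.
first_style_conversion_model = partial(style_model_to_be_used,
                                       reference_image=reference_image)
result = first_style_conversion_model("photo.png")
```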
According to one or more embodiments of the present disclosure, [Example 8] provides an image processing method, the method further including:
acquiring an image to be used of a third original image under a style type to be selected, and cropping the image to be used to obtain a third style image;
inputting Gaussian noise into an image generation model to be trained to obtain a third output image;
processing the third output image and the third style image based on a first discriminator to determine a loss value, and correcting model parameters in the image generation model to be trained based on the loss value; and
taking the convergence of the loss function in the image generation model to be trained as the training objective, to obtain the target image generation model.
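The generator-training loop above can be sketched with a toy stand-in: Gaussian noise is fed to an "image generation model", a first discriminator scores the output against the third style image, and the parameters are corrected each iteration until the loss converges. The affine generator, the MSE loss, and the hand-written update rule are all illustrative assumptions.

```python
import random

random.seed(0)

def gaussian_noise(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

def generate(params, noise):
    # Toy "image generation model": an affine map from noise to pixels.
    scale, bias = params
    return [scale * z + bias for z in noise]

def first_discriminator(output_image, third_style_image):
    # Loss value: mean squared error to the target-style image.
    return sum((o - t) ** 2 for o, t in
               zip(output_image, third_style_image)) / len(output_image)

third_style_image = [0.5, 0.5, 0.5, 0.5]  # cropped third style image (flattened)
params = [1.0, 0.0]                       # [scale, bias] of the toy generator

loss = float("inf")
for _ in range(200):                      # iterate until the loss converges
    noise = gaussian_noise(4)
    output = generate(params, noise)
    loss = first_discriminator(output, third_style_image)
    # Crude correction: shrink the noise scale, move the bias toward the target.
    params[0] *= 0.9
    params[1] += 0.1 * (sum(third_style_image) / 4 - params[1])
```

A real implementation would instead backpropagate the discriminator loss through a generator network; the disclosure specifies only that the loss corrects the model parameters until convergence.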
According to one or more embodiments of the present disclosure, [Example 9] provides an image processing method, the method further including:
processing Gaussian noise based on the target image generation model to obtain a second style image; and
adding an expression to the second style image based on a pre-trained expression editing model, to update the second style image.
According to one or more embodiments of the present disclosure, [Example 10] provides an image processing method, where the target style type includes a Japanese style, a Korean style, an ancient-costume style, a comic style, or a plurality of preset style types to be selected.
According to one or more embodiments of the present disclosure, [Example 11] provides an image processing apparatus, the apparatus including:
an image acquisition module configured to acquire an image to be processed that includes a target subject;
a special-effect image determination module configured to input the image to be processed and subject attributes of the target subject into a target style conversion model to obtain a target special-effect image in which the target subject is converted into a target style type; and
an image display module configured to display the target special-effect image in an image display area.
In addition, although various operations are depicted in a particular order, this should not be understood as requiring that these operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
The technical solutions of the embodiments of the present disclosure make it possible to convert an image to be processed into an image with a target theme style, improving the richness of the displayed image content, the novelty of the theme style, and the adaptability of the theme style to the user.

Claims (13)

1. An image processing method, comprising:
acquiring an image to be processed that includes a target subject;
inputting the image to be processed and subject attributes of the target subject into a target style conversion model to obtain a target special-effect image in which the target subject is converted into a target style type; and
displaying the target special-effect image in an image display area.
2. The method according to claim 1, further comprising:
acquiring a plurality of first training samples, wherein each first training sample includes a first original image and a first style image consistent with the target style type, and the first style image is generated by a first style conversion model; and
for each first training sample, using the first original image in the current first training sample and object attributes of the object to be processed as an input of a style conversion model to be trained, and using the first style image in the current first training sample as an output of the style conversion model to be trained, to train and obtain the target style conversion model;
wherein the object attributes are consistent with the subject attributes.
3. The method according to claim 2, wherein the acquiring a plurality of first training samples comprises:
acquiring at least one first original image;
for each first original image, stylizing the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type; and
determining the plurality of first training samples based on the first original images and the corresponding first style images.
4. The method according to claim 3, wherein the first style conversion model comprises a style feature extractor, a content feature extractor, a feature fuser, and a compiler, and the stylizing the current first original image based on the first style conversion model to obtain a first style image consistent with the target style type comprises:
performing content extraction on the current first original image based on the content feature extractor to obtain image content features;
performing style feature extraction, based on the style feature extractor, on a preset reference style image consistent with the target style type to obtain image style features;
fusing the image content features and the image style features based on the feature fuser to obtain features to be compiled; and
processing the features to be compiled based on the compiler to obtain a first style image corresponding to the current first original image.
5. The method according to claim 3, further comprising:
constructing a target model to be trained that includes a style processing model to be trained, a target discriminator, and a target style comparator, wherein the target discriminator and the target style comparator are pre-trained;
training the constructed target model to be trained according to a plurality of second style images of at least one style type to be selected and at least one second original image, to obtain a target model to be used, wherein the second style images are determined based on a target image generation model;
using the trained style processing model in the target model to be used as a style model to be used; and
determining a reference image corresponding to the target style type, and determining the first style conversion model based on the reference image and the style model to be used, wherein the target style type is one of the at least one style type to be selected.
  6. The method according to claim 5, wherein the style processing model to be trained comprises: a style feature extractor to be trained, a content feature extractor to be trained, a feature fuser to be trained, and a compiler to be trained; and wherein training the constructed target model to be trained according to the plurality of second style images of the at least one style type to be selected and the at least one second original image, to obtain the target model to be used, comprises:
    obtaining a plurality of second training samples by randomly combining the second style images and the second original images; wherein each second training sample comprises one second original image and one second style image;
    for each second training sample, obtaining content splicing features of the second original image based on the content feature extractor to be trained, obtaining style splicing features of the second style image based on the style feature extractor to be trained, fusing the content splicing features and the style splicing features based on the feature fuser to be trained to obtain fused features, and inputting the fused features into the compiler to be trained to obtain an actual output image;
    inputting the actual output image and the second style image into the target discriminator to determine a first loss value, and inputting the actual output image and the second style image into the target style comparator to determine a style loss value; and
    correcting model parameters in the style processing model to be trained based on the first loss value and the style loss value, and taking convergence of a loss function in the style processing model to be trained as the training objective, to obtain the target model to be used.
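For illustration only (not part of the claims), the training step recited in claim 6 can be sketched roughly as follows, assuming PyTorch and toy network shapes. All module names (`style_extractor`, `fuser`, `compiler`, and so on) are hypothetical stand-ins for the extractors, fuser, compiler, and discriminator named in the claim, and the style comparator is approximated here by comparing style-feature statistics:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the components named in claim 6 (shapes are illustrative only).
style_extractor = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
content_extractor = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
fuser = nn.Conv2d(16, 8, 1)               # fuses concatenated content + style features
compiler = nn.Conv2d(8, 3, 3, padding=1)  # "compiler" (decoder) producing the output image
discriminator = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.AdaptiveAvgPool2d(1))

params = (list(style_extractor.parameters()) + list(content_extractor.parameters())
          + list(fuser.parameters()) + list(compiler.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

def train_step(original, style):
    # Extract content features from the original image and style features from the style image.
    content_feat = content_extractor(original)
    style_feat = style_extractor(style)
    # Fuse the two feature maps and decode the fused features into the actual output image.
    fused = fuser(torch.cat([content_feat, style_feat], dim=1))
    output = compiler(fused)
    # First loss value: the discriminator should score the output as genuine style.
    adv_loss = (discriminator(output) - 1).pow(2).mean()
    # Style loss value: compare style-feature statistics of the output and the style image.
    style_loss = (style_extractor(output).mean() - style_feat.mean()).pow(2)
    loss = adv_loss + style_loss
    # Correct the model parameters based on the two losses.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

loss = train_step(torch.rand(1, 3, 16, 16), torch.rand(1, 3, 16, 16))
```

In practice this step would be repeated over the randomly combined training samples until the loss function converges, which is the stated training objective.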
  7. The method according to claim 5, wherein determining a reference image corresponding to the target style type, and determining the first style conversion model based on the reference image and the style model to be used, comprises:
    determining the target style type from the at least one style type to be selected; and
    acquiring a reference image consistent with the target style type, and encapsulating the reference image with the style model to be used to obtain the first style conversion model.
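As an illustrative sketch only (not part of the claims), the "encapsulating" step of claim 7 can be read as bundling the reference image and the style model into a single deployable artifact. The archive layout and file names below are assumptions for illustration:

```python
import io
import json
import zipfile

def package_style_model(model_bytes: bytes, reference_image: bytes,
                        target_style: str) -> bytes:
    """Bundle the style model, its reference image, and style metadata into one archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("style_model.bin", model_bytes)      # the style model to be used
        zf.writestr("reference.png", reference_image)    # the reference image
        zf.writestr("meta.json", json.dumps({"target_style_type": target_style}))
    return buf.getvalue()

# Hypothetical payloads stand in for real model weights and image pixels.
bundle = package_style_model(b"weights", b"pixels", "comic")
with zipfile.ZipFile(io.BytesIO(bundle)) as zf:
    names = sorted(zf.namelist())
```

The resulting bundle corresponds to the first style conversion model: one unit that carries both the reference image and the style model.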
  8. The method according to claim 5, further comprising:
    acquiring an image to be used of a third original image under a style type to be selected, and cropping the image to be used to obtain a third style image;
    inputting Gaussian noise into an image generation model to be trained to obtain a third output image;
    processing the third output image and the third style image based on a first discriminator to determine a loss value, and correcting model parameters in the image generation model to be trained based on the loss value; and
    taking convergence of a loss function in the image generation model to be trained as the training objective, to obtain the target image generation model.
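As an illustrative sketch only (not part of the claims), the adversarial training of claim 8 can be written as a standard GAN step, assuming PyTorch; layer sizes and all names are hypothetical:

```python
import torch
import torch.nn as nn

# Minimal stand-ins for the image generation model and the first discriminator of claim 8.
generator = nn.Sequential(nn.Linear(16, 3 * 8 * 8), nn.Tanh())
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

def gan_step(style_images):
    batch = style_images.shape[0]
    noise = torch.randn(batch, 16)                # Gaussian noise input
    fake = generator(noise).view(batch, 3, 8, 8)  # the "third output image"

    # Discriminator: separate real (third) style images from generated ones.
    d_loss = (bce(discriminator(style_images), torch.ones(batch, 1))
              + bce(discriminator(fake.detach()), torch.zeros(batch, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: correct its parameters so the discriminator scores fakes as real.
    g_loss = bce(discriminator(fake), torch.ones(batch, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()

d_loss, g_loss = gan_step(torch.rand(4, 3, 8, 8))
```

Iterating this step until the loss converges yields the target image generation model, which claim 9 then uses to synthesize second style images from fresh Gaussian noise.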
  9. The method according to claim 8, further comprising:
    processing Gaussian noise based on the target image generation model to obtain a second style image; and
    adding an expression to the second style image based on a pre-trained expression editing model, to update the second style image.
  10. The method according to any one of claims 1-9, wherein the target style type comprises a Japanese style, a Korean style, an ancient-costume style, a comic style, or one of a plurality of preset style types to be selected.
  11. An image processing apparatus, comprising:
    an image acquisition module configured to acquire an image to be processed that comprises a target subject;
    a special effect image determination module configured to input the image to be processed and subject attributes of the target subject into a target style conversion model, to obtain a target special effect image in which the target subject is converted into a target style type; and
    an image display module configured to display the target special effect image in an image display area.
  12. An electronic device, comprising:
    one or more processors; and
    a storage apparatus configured to store one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the image processing method according to any one of claims 1-10.
  13. A storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image processing method according to any one of claims 1-10.
PCT/CN2022/141815 2021-12-29 2022-12-26 Image processing method and apparatus, electronic device, and storage medium WO2023125374A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111641158.3 2021-12-29
CN202111641158.3A CN114331820A (en) 2021-12-29 2021-12-29 Image processing method, image processing device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023125374A1 true WO2023125374A1 (en) 2023-07-06

Family

ID=81017138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/141815 WO2023125374A1 (en) 2021-12-29 2022-12-26 Image processing method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN114331820A (en)
WO (1) WO2023125374A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117611434A (en) * 2024-01-17 2024-02-27 腾讯科技(深圳)有限公司 Model training method, image style conversion method and device and electronic equipment

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114331820A (en) * 2021-12-29 2022-04-12 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN114866706A (en) * 2022-06-01 2022-08-05 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN115145442A (en) * 2022-06-07 2022-10-04 杭州海康汽车软件有限公司 Environment image display method and device, vehicle-mounted terminal and storage medium
CN114926326A (en) * 2022-06-28 2022-08-19 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN116126182A (en) * 2022-09-08 2023-05-16 北京字跳网络技术有限公司 Special effect processing method and device, electronic equipment and storage medium
CN115249221A (en) * 2022-09-23 2022-10-28 阿里巴巴(中国)有限公司 Image processing method and device and cloud equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146825A * 2018-10-12 2019-01-04 深圳美图创新科技有限公司 Photography style conversion method, device and readable storage medium
CN110598781A (en) * 2019-09-05 2019-12-20 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and storage medium
US20200134797A1 (en) * 2018-10-31 2020-04-30 Boe Technology Group Co., Ltd. Image style conversion method, apparatus and device
CN111402112A (en) * 2020-03-09 2020-07-10 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
CN111784567A (en) * 2020-07-03 2020-10-16 北京字节跳动网络技术有限公司 Method, apparatus, electronic device, and computer-readable medium for converting an image
CN113780326A (en) * 2021-03-02 2021-12-10 北京沃东天骏信息技术有限公司 Image processing method and device, storage medium and electronic equipment
CN114331820A (en) * 2021-12-29 2022-04-12 北京字跳网络技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108205813B (en) * 2016-12-16 2022-06-03 微软技术许可有限责任公司 Learning network based image stylization
US10891723B1 (en) * 2017-09-29 2021-01-12 Snap Inc. Realistic neural network based image style transfer
CN108961198B (en) * 2018-07-09 2021-06-08 中国海洋大学 Underwater image synthesis method of multi-grid generation countermeasure network and application thereof
CN110830706A (en) * 2018-08-08 2020-02-21 Oppo广东移动通信有限公司 Image processing method and device, storage medium and electronic equipment
CN111583097A (en) * 2019-02-18 2020-08-25 北京三星通信技术研究有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN109949213B (en) * 2019-03-15 2023-06-16 百度在线网络技术(北京)有限公司 Method and apparatus for generating image
US11625576B2 (en) * 2019-11-15 2023-04-11 Shanghai United Imaging Intelligence Co., Ltd. Systems and methods for image style transformation
CN112669308A (en) * 2021-01-06 2021-04-16 携程旅游信息技术(上海)有限公司 Image generation method, system, device and storage medium based on style migration
CN113705302A (en) * 2021-03-17 2021-11-26 腾讯科技(深圳)有限公司 Training method and device for image generation model, computer equipment and storage medium
CN113850712A (en) * 2021-09-03 2021-12-28 北京达佳互联信息技术有限公司 Training method of image style conversion model, and image style conversion method and device

Also Published As

Publication number Publication date
CN114331820A (en) 2022-04-12

Similar Documents

Publication Publication Date Title
WO2023125374A1 (en) Image processing method and apparatus, electronic device, and storage medium
US20220239882A1 (en) Interactive information processing method, device and medium
WO2023125361A1 (en) Character generation method and apparatus, electronic device, and storage medium
CN111669502B (en) Target object display method and device and electronic equipment
CN111399729A (en) Image drawing method and device, readable medium and electronic equipment
US20230421716A1 (en) Video processing method and apparatus, electronic device and storage medium
WO2023125379A1 (en) Character generation method and apparatus, electronic device, and storage medium
WO2019227429A1 (en) Method, device, apparatus, terminal, server for generating multimedia content
WO2023093897A1 (en) Image processing method and apparatus, electronic device, and storage medium
WO2023083142A1 (en) Sentence segmentation method and apparatus, storage medium, and electronic device
WO2023138560A1 (en) Stylized image generation method and apparatus, electronic device, and storage medium
WO2023109842A1 (en) Image presentation method and apparatus, and electronic device and storage medium
WO2022142875A1 (en) Image processing method and apparatus, electronic device, and storage medium
WO2023045710A1 (en) Multimedia display and matching methods and apparatuses, device and medium
WO2023040749A1 (en) Image processing method and apparatus, electronic device, and storage medium
WO2022171024A1 (en) Image display method and apparatus, and device and medium
US20240119082A1 (en) Method, apparatus, device, readable storage medium and product for media content processing
US11818491B2 (en) Image special effect configuration method, image recognition method, apparatus and electronic device
WO2023109829A1 (en) Image processing method and apparatus, electronic device, and storage medium
WO2022233223A1 (en) Image splicing method and apparatus, and device and medium
WO2023138549A1 (en) Image processing method and apparatus, and electronic device and storage medium
WO2023138498A1 (en) Method and apparatus for generating stylized image, electronic device, and storage medium
CN112785669B (en) Virtual image synthesis method, device, equipment and storage medium
JP2023538825A (en) Methods, devices, equipment and storage media for picture to video conversion
WO2023202543A1 (en) Character processing method and apparatus, and electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22914628

Country of ref document: EP

Kind code of ref document: A1