CN113793403B - Text image synthesizing method for simulating painting process - Google Patents


Info

Publication number
CN113793403B
CN113793403B (application CN202110953553.9A)
Authority
CN
China
Prior art keywords
information
text
image
foreground
synthesizing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110953553.9A
Other languages
Chinese (zh)
Other versions
CN113793403A (en)
Inventor
俞文心
张志强
戚原瑞
吴筱迪
刘露
龚俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202110953553.9A
Publication of CN113793403A
Application granted
Publication of CN113793403B
Active legal status
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses a text-to-image synthesis method that simulates the painting process, comprising the steps of: inputting descriptive text information and synthesizing corresponding contour information based on the text information; synthesizing foreground information of the image from the synthesized contour information combined with the input text information; synthesizing corresponding background information from the foreground information combined with the originally input text information; and synthesizing a final image result from the obtained foreground information, background information and text information. The method can be used in any text-based image synthesis algorithm to further improve synthesis quality; it greatly improves the practicability of image synthesis technology, promotes the development of text-to-image synthesis, and aids the adoption of image synthesis software.

Description

Text image synthesizing method for simulating painting process
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a text image synthesizing method for simulating a painting process.
Background
Synthesizing images from text information has gained widespread attention in recent years in the image synthesis field of computer vision. The main reason is that text can describe the essential content of the image to be synthesized while also matching people's everyday input habits. Compared with image synthesis techniques that use simple class labels, text offers better flexibility. Image synthesis based on text information therefore pushes image synthesis software toward friendlier interfaces: a user can input text according to personal needs and synthesize an image that matches subjective intent. This strongly promotes both the practicability of image synthesis technology and the adoption of image synthesis software.
Among existing image synthesis techniques, those with excellent image quality are based on deep learning. Most existing methods, although good in synthesis quality, lack practicality. Some methods synthesize images from class labels, which improves practicality to some extent, but the limited information a category label provides leaves them short of practical use. The more practical existing approach is to synthesize images from text information. However, current text-to-image techniques generate foreground and background content directly in a single pass, lacking a reasonable step-by-step synthesis procedure. As a result, the quality of current text-synthesized images is mediocre, leaving large room for improvement.
Disclosure of Invention
In order to solve the above problems, the invention provides a text-to-image synthesis method that simulates the painting process. It can be used in any text-based image synthesis algorithm to further improve synthesis quality, thereby greatly improving the practicability of image synthesis technology, promoting the development of text-to-image synthesis, and aiding the adoption of image synthesis software.
In order to achieve the above purpose, the invention adopts the following technical scheme: a text-to-image method for simulating a painting process, comprising the steps of:
s10, inputting descriptive text information, and synthesizing corresponding contour information based on the text information;
s20, synthesizing foreground information of the image based on the synthesized contour information and the input text information;
s30, synthesizing corresponding background information from the synthesized foreground information combined with the originally input text information;
s40, synthesizing a final image result by using the obtained foreground information, background information and text information.
Further, in the step S10, for the processing of the input text information, the text information is encoded into corresponding text vectors using a text encoder, and then the text vectors are encoded into corresponding contour information using a continuous deconvolution operation.
Further, in the step S20, when the foreground information of the image is synthesized based on the synthesized contour information in combination with the input text information, the contour information is encoded into corresponding feature vectors through the convolutional neural network, and the feature vectors of the contour information and the text vectors are synthesized into the foreground information of the image through the deconvolution operation.
Further, in the step S30, when the foreground information is synthesized and the corresponding background information is synthesized by combining the text information input at the beginning, the foreground information is encoded into the corresponding feature vector by the convolutional neural network, and the feature vector and the text vector of the foreground information are inferred and synthesized to form the matched background information by the predictive synthesis operation.
Further, in the step S40, when the obtained foreground information, background information and text information are used to synthesize a final image result, the background information is encoded into corresponding feature vectors through the convolutional neural network, and the feature vectors of the foreground information, the feature vectors of the background information and the text vectors are synthesized into the final image result through deconvolution operation.
The beneficial effect of adopting this technical scheme is:
the whole image is sequentially de-synthesized from the outline to the foreground and then to the background in the whole process of the invention. This refinement of the task at each stage from a simple to a progressively more complex synthesis process allows each stage to focus more on its own tasks, so that the task performance at each stage can be better. In this case, it is possible to better ensure that the image result of high quality is finally synthesized. The text information participates in the whole process so as to ensure that the finally synthesized image can accord with the semantic information of the input text.
The invention first infers and synthesizes simple contour information from the text, then synthesizes the corresponding foreground from the synthesized contour, and finally synthesizes the matching background and the final image from the foreground content. This synthesis flow, from simple contour to foreground to final image, follows naturally and resembles the painting process, building up a more realistic and reliable image step by step. Such a reasonable synthesis flow better promotes the development of text-to-image technology and the adoption of image synthesis software.
The method can be used in all image synthesis algorithms based on text information to further improve the quality of image synthesis, greatly improves the practicability of the image synthesis technology, and can better promote the development of the image synthesis technology and the popularization of related image synthesis software.
Drawings
FIG. 1 is a schematic flow diagram of a method for synthesizing text images for simulating a painting process according to the present invention;
fig. 2 is a schematic diagram of a text-to-image method for simulating a painting process according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present invention clearer, the present invention will be further described below with reference to the accompanying drawings.
In this embodiment, referring to fig. 1 and 2, the present invention provides a method for synthesizing a text image for simulating a painting process, including the steps of:
s10, inputting descriptive text information, and synthesizing corresponding contour information based on the text information;
s20, synthesizing foreground information of the image based on the synthesized contour information and the input text information;
s30, synthesizing corresponding background information from the synthesized foreground information combined with the originally input text information;
s40, synthesizing a final image result by using the obtained foreground information, background information and text information.
As an optimization of the above embodiment, in the step S10, for the processing of the input text information, the text information is encoded into corresponding text vectors using a text encoder, and then the text vectors are encoded into corresponding contour information using a continuous deconvolution operation.
The specific process formula is as follows:
Text vector: s = text_encoder(T);
Contour information: I_c = deconvolution(s);
wherein T represents the input text information; text_encoder represents the text encoder; deconvolution represents a deconvolution operation.
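As an illustration only, the two operations above can be sketched in NumPy. Everything concrete here is an assumption made for the sketch: the hash-seeded `text_encoder`, the 16-dimensional text vector, and the nearest-neighbour upsampling standing in for a learned deconvolution stack — the patent's actual modules are trained networks.

```python
import zlib
import numpy as np

def text_encoder(text, dim=16):
    """Toy stand-in for a learned text encoder: a deterministic,
    hash-seeded embedding of the input string into a vector s."""
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    return rng.standard_normal(dim)

def deconvolution(s, out_size=64):
    """Stand-in for a stack of learned transposed convolutions:
    reshape the text vector into a coarse 4x4 map, then upsample
    it (nearest-neighbour) to the contour resolution."""
    base = s.reshape(4, 4)
    factor = out_size // 4
    return np.kron(base, np.ones((factor, factor)))

T = "a small red bird standing on a branch"   # hypothetical input text
s = text_encoder(T)       # s = text_encoder(T)
I_c = deconvolution(s)    # I_c = deconvolution(s)
```

In the real method, both modules would be learned jointly; the sketch only mirrors the data flow from text T to text vector s to contour map I_c.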
As an optimization scheme of the above embodiment, in the step S20, when the foreground information of the image is synthesized based on the synthesized contour information in combination with the input text information, the contour information is encoded into the corresponding feature vector by the convolutional neural network, and the feature vector and the text vector of the contour information are synthesized into the foreground information of the image by the deconvolution operation.
The specific process formula is as follows:
Feature vector of contour information: fea_c = CNN(I_c);
Foreground information: I_f = deconvolution(fea_c, s);
wherein CNN represents a convolutional neural network; deconvolution represents a deconvolution operation.
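A minimal NumPy sketch of this stage follows, under stated assumptions: block-mean pooling stands in for the CNN encoder, simple vector addition stands in for text conditioning, and nearest-neighbour upsampling stands in for the learned deconvolution — none of these are the patent's actual trained operations.

```python
import numpy as np

def cnn(image, grid=4):
    """Toy stand-in for a convolutional encoder: block-mean
    pooling that summarises the map as a 16-dim feature vector."""
    h = image.shape[0] // grid
    return image.reshape(grid, h, grid, h).mean(axis=(1, 3)).reshape(-1)

def deconvolution(fea_c, s, out_size=64):
    """Stand-in for text-conditioned deconvolution: fuse the
    contour features with the text vector, then upsample."""
    mixed = (fea_c + s).reshape(4, 4)
    return np.kron(mixed, np.ones((out_size // 4, out_size // 4)))

I_c = np.zeros((64, 64))       # contour map from step S10
s = np.ones(16)                # text vector from the text encoder
fea_c = cnn(I_c)               # fea_c = CNN(I_c)
I_f = deconvolution(fea_c, s)  # I_f = deconvolution(fea_c, s)
```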
As an optimization scheme of the above embodiment, in the step S30, when the foreground information is synthesized and the corresponding background information is synthesized by combining the text information input at the beginning, the foreground information is encoded into the corresponding feature vector by the convolutional neural network, and the feature vector and the text vector of the foreground information are inferred and synthesized to form the matched background information by the predictive synthesis operation.
The specific process formula is as follows:
Feature vector of foreground information: fea_f = CNN(I_f);
Background information: I_b = prediction(fea_f, s);
wherein CNN represents a convolutional neural network; prediction represents a predictive synthesis operation.
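The prediction step can be sketched as below; this is an illustrative assumption, not the patented network — a fixed-weight linear layer with a tanh nonlinearity merely shows a background map being inferred from the concatenated foreground features and text vector.

```python
import numpy as np

def cnn(image, grid=4):
    """Block-mean pooling standing in for a convolutional encoder."""
    h = image.shape[0] // grid
    return image.reshape(grid, h, grid, h).mean(axis=(1, 3)).reshape(-1)

def prediction(fea_f, s, out_size=64, seed=0):
    """Stand-in for the predictive synthesis operation: a tiny
    fixed-weight linear layer with tanh that maps the foreground
    features plus text vector to a background map."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([fea_f, s])                        # joint condition
    W = rng.standard_normal((out_size * out_size, x.size)) * 0.1
    return np.tanh(W @ x).reshape(out_size, out_size)

I_f = np.ones((64, 64))      # foreground from step S20
s = np.ones(16)              # text vector
fea_f = cnn(I_f)             # fea_f = CNN(I_f)
I_b = prediction(fea_f, s)   # I_b = prediction(fea_f, s)
```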
As an optimization scheme of the above embodiment, in the step S40, when the obtained foreground information, background information and text information are used to synthesize a final image result, the background information is encoded into corresponding feature vectors through a convolutional neural network, and the feature vectors of the foreground information, the feature vectors of the background information and the text vectors are synthesized into the final image result through deconvolution operation.
The specific process formula is as follows:
Feature vector of background information: fea_b = CNN(I_b);
Final image result: I_g = deconvolution(fea_f, fea_b, s);
wherein CNN represents a convolutional neural network; deconvolution represents a deconvolution operation.
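Putting the four stages together, steps S10 through S40 chain as follows. This toy NumPy sketch only mirrors the data flow (text vector, then contour, foreground, background, final image); every module body is an illustrative stand-in for a trained network, and all names and dimensions are assumptions.

```python
import zlib
import numpy as np

DIM, SIZE = 16, 64  # assumed vector dimension and image resolution

def text_encoder(text):
    """Hash-seeded embedding standing in for a learned text encoder."""
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    return rng.standard_normal(DIM)

def cnn(image):
    """Block-mean pooling standing in for a convolutional encoder."""
    h = image.shape[0] // 4
    return image.reshape(4, h, 4, h).mean(axis=(1, 3)).reshape(-1)

def deconvolution(*inputs):
    """Fuse all conditioning vectors, then upsample to SIZE x SIZE."""
    mixed = np.sum(inputs, axis=0).reshape(4, 4)
    return np.kron(mixed, np.ones((SIZE // 4, SIZE // 4)))

def prediction(fea_f, s, seed=0):
    """Fixed-weight layer standing in for predictive synthesis."""
    rng = np.random.default_rng(seed)
    x = np.concatenate([fea_f, s])
    W = rng.standard_normal((SIZE * SIZE, x.size)) * 0.1
    return np.tanh(W @ x).reshape(SIZE, SIZE)

T = "a small red bird standing on a branch"   # hypothetical input
s = text_encoder(T)                   # text vector
I_c = deconvolution(s)                # S10: contour
fea_c = cnn(I_c)
I_f = deconvolution(fea_c, s)         # S20: foreground
fea_f = cnn(I_f)
I_b = prediction(fea_f, s)            # S30: background
fea_b = cnn(I_b)
I_g = deconvolution(fea_f, fea_b, s)  # S40: final image
```

Note the design point the formulas encode: the text vector s conditions every stage, which is how the method keeps the final image consistent with the input text's semantics.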
Specific embodiments may employ:
1. text-to-image system
Providing a web page interface similar to Baidu Translate, allowing users to input text information in the interface and then click a synthesis button to generate the corresponding image result, thereby obtaining an image that matches the user's subjective intent.
2. Text composition image software
The software comprises two parts: synthesizing the image result, and displaying the staged results of the process.
The text-to-image software formed by the invention allows a user to input text information, after which the software automatically synthesizes the corresponding image. Meanwhile, the software can also display the staged results, including the contour information, foreground content and background content generated during synthesis. The software may be used in computer-aided structural design.
The foregoing has shown and described the basic principles and main features of the present invention and the advantages of the present invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A method for synthesizing an image from text for simulating a painting process, comprising the steps of:
s10, inputting descriptive text information, and synthesizing corresponding contour information based on the text information;
in the step S10, for the processing of the input text information, the text information is encoded into corresponding text vectors using a text encoder, and then the text vectors are encoded into corresponding contour information using a continuous deconvolution operation;
s20, synthesizing foreground information of the image based on the synthesized contour information and the input text information;
in the step S20, when the foreground information of the image is synthesized based on the synthesized contour information in combination with the input text information, the contour information is encoded into corresponding feature vectors through a convolutional neural network, and the feature vectors and the text vectors of the contour information are synthesized into the foreground information of the image through deconvolution operation;
s30, synthesizing corresponding background information from the synthesized foreground information combined with the originally input text information;
in the step S30, when the foreground information is synthesized and the corresponding background information is synthesized by combining the text information input at the beginning, the foreground information is encoded into the corresponding feature vector through the convolutional neural network, and the feature vector and the text vector of the foreground information are inferred and synthesized to form the matched background information through the predictive synthesis operation;
s40, synthesizing a final image result by using the obtained foreground information, background information and text information;
in the step S40, when the obtained foreground information, background information and text information are used to synthesize a final image result, the background information is encoded into corresponding feature vectors through the convolutional neural network, and the feature vectors of the foreground information, the feature vectors of the background information and the text vectors are synthesized into the final image result through deconvolution operation.
CN202110953553.9A 2021-08-19 2021-08-19 Text image synthesizing method for simulating painting process Active CN113793403B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110953553.9A CN113793403B (en) 2021-08-19 2021-08-19 Text image synthesizing method for simulating painting process

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110953553.9A CN113793403B (en) 2021-08-19 2021-08-19 Text image synthesizing method for simulating painting process

Publications (2)

Publication Number Publication Date
CN113793403A CN113793403A (en) 2021-12-14
CN113793403B true CN113793403B (en) 2023-09-22

Family

ID=79182069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110953553.9A Active CN113793403B (en) 2021-08-19 2021-08-19 Text image synthesizing method for simulating painting process

Country Status (1)

Country Link
CN (1) CN113793403B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011112522A2 (en) * 2010-03-10 2011-09-15 Microsoft Corporation Text enhancement of a textual image undergoing optical character recognition
CN102724554A (en) * 2012-07-02 2012-10-10 西南科技大学 Scene-segmentation-based semantic watermark embedding method for video resource
CN105184074A (en) * 2015-09-01 2015-12-23 哈尔滨工程大学 Multi-modal medical image data model based medical data extraction and parallel loading method
CN107305696A (en) * 2016-04-22 2017-10-31 阿里巴巴集团控股有限公司 A kind of image generating method and device
CN107895393A (en) * 2017-10-24 2018-04-10 天津大学 A kind of story image sequence generation method of comprehensive word and shape
CN111507328A (en) * 2020-04-13 2020-08-07 北京爱咔咔信息技术有限公司 Text recognition and model training method, system, equipment and readable storage medium
CN112734881A (en) * 2020-12-01 2021-04-30 北京交通大学 Text synthesis image method and system based on significance scene graph analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9319640B2 (en) * 2009-12-29 2016-04-19 Kodak Alaris Inc. Camera and display system interactivity
US20200285855A1 (en) * 2017-06-05 2020-09-10 Umajin Inc. Hub and spoke classification system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Text to Image Synthesis Using Two-Stage Generation and Two-Stage Discrimination; Zhiqiang Zhang et al.; International Conference on Knowledge Science, Engineering and Management; Vol. 11776; pp. 110-114, Figs. 1-2 *
Research on Deep-Learning-Based Text-to-Image Conversion Algorithms; Zhang Zhiqiang; China Master's Theses Full-text Database (Electronic Journal); I138-423 *

Also Published As

Publication number Publication date
CN113793403A (en) 2021-12-14

Similar Documents

Publication Publication Date Title
Su et al. Mangagan: Unpaired photo-to-manga translation based on the methodology of manga drawing
Ren et al. Two-stage sketch colorization with color parsing
Qiao et al. Efficient style-corpus constrained learning for photorealistic style transfer
Zhang et al. A survey on multimodal-guided visual content synthesis
Wu et al. Human–machine hybrid intelligence for the generation of car frontal forms
Liu et al. Decoupled representation learning for character glyph synthesis
Lyu et al. Dran: detailed region-adaptive normalization for conditional image synthesis
CN113793403B (en) Text image synthesizing method for simulating painting process
Liu et al. Any-to-any style transfer: Making picasso and da vinci collaborate
Zeng et al. An unsupervised font style transfer model based on generative adversarial networks
Rao et al. UMFA: a photorealistic style transfer method based on U-Net and multi-layer feature aggregation
Wang et al. Coloring anime line art videos with transformation region enhancement network
Liu et al. Bi-lstm sequence modeling for on-the-fly fine-grained sketch-based image retrieval
Li et al. A review on neural style transfer
Gao et al. Segmentation-based background-inference and small-person pose estimation
Jiang et al. Emotion recognition using brain-computer interfaces and advanced artificial intelligence
Chi et al. DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
Xing et al. Stylized Image Generation based on Music-image Synesthesia Emotional Style Transfer using CNN Network.
Liao et al. LOVECon: Text-driven Training-Free Long Video Editing with ControlNet
CN113793404B (en) Manual controllable image synthesis method based on text and contour
Song et al. Virtual Human Talking-Head Generation
Ye Generative adversarial networks
Wu et al. A text-driven image style transfer model based on CLIP and SCBAM
Liu et al. 3D shape completion via deep learning: a method survey
Guo et al. 3D face cartoonizer: Generating personalized 3D cartoon faces from 2D real photos with a hybrid dataset

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant