CN109671125A - A highly fused GAN network model and a method for realizing text-to-image generation - Google Patents

A highly fused GAN network model and a method for realizing text-to-image generation

Info

Publication number
CN109671125A
CN109671125A
Authority
CN
China
Prior art keywords
unit
image
text
convolution
block
Prior art date
Legal status
Granted
Application number
CN201811542578.4A
Other languages
Chinese (zh)
Other versions
CN109671125B (en)
Inventor
宋井宽
陈岱渊
高联丽
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201811542578.4A
Publication of CN109671125A
Application granted
Publication of CN109671125B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00: Image coding
    • G06T 9/001: Model-based coding, e.g. wire frame
    • G06T 9/002: Image coding using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The present invention relates to the field of deep learning. It discloses a highly fused GAN network model and a method for realizing text-to-image generation, which address the problems of small generated image size, low image quality, and unstable network training in traditional techniques, and effectively realize the generation of clear, high-quality semantic images from input text. The highly fused GAN network model of the present invention comprises: a text encoder, a condition augmentation module, a generator, and three independent discriminators. Based on this highly fused GAN network model, high-quality RGB images matching the text semantics are produced with only one generator and three independent discriminators. To further optimize the generator network structure and make full use of the feature maps of different sizes produced by the intermediate layers of the network, the generator not only uses the residual generation blocks of a residual network, but also uses a pyramid network structure to progressively grow low-dimensional 64*64 features into semantically rich high-dimensional 256*256 features.

Description

A highly fused GAN network model and a method for realizing text-to-image generation
Technical field
The present invention relates to the field of deep learning, and in particular to a highly fused GAN network model and a method for realizing text-to-image generation.
Background art
Although text-to-image generation has many application scenarios in real life, such as image editing and cross-modal data generation, there is relatively little research on this task at present. Early text-to-image methods used a GAN network as the basic network structure, and the generated images were small and of low quality; for example, GAN-INT-CLS [1] can only generate 64*64 images. To increase the image size, later methods train multiple GAN networks stage by stage, but these networks usually have complex structures and high demands on computing hardware, which makes the network training process complicated and time-consuming. For example, StackGAN [2], StackGAN++ [3], and AttnGAN [4] split the task into two steps and train two deep networks separately; they are not end-to-end networks, which increases complexity and makes the whole training process highly unstable.
Bibliography:
[1] Reed, Scott, et al. 2016. Generative adversarial text to image synthesis. arXiv preprint arXiv:1605.05396.
[2] Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Huang, X.; Wang, X.; and Metaxas, D. 2017a. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. arXiv preprint.
[3] Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Wang, X.; Huang, X.; and Metaxas, D. 2017b. StackGAN++: Realistic image synthesis with stacked generative adversarial networks. arXiv:1710.10916.
[4] Xu, T.; Zhang, P.; Huang, Q.; Zhang, H.; Gan, Z.; Huang, X.; and He, X. 2017. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks. arXiv preprint.
Summary of the invention
The technical problem to be solved by the present invention is to provide a highly fused GAN network model and a method for realizing text-to-image generation, which address the problems of small generated image size, low image quality, and unstable network training in traditional techniques, and effectively realize the generation of clear, high-quality semantic images from input text.
The technical solution adopted by the present invention to solve the above technical problem is as follows:
A highly fused GAN network model, comprising: a text encoder, a condition augmentation module, a generator, and three independent discriminators;
the text encoder is configured to output an encoded feature representation for the text input to the text encoder;
the condition augmentation module is configured to sample a condition feature representation of a certain dimension from the encoded feature representation output by the text encoder, concatenate it with noise along the channel dimension, and input the result into the generator network;
the generator comprises a fully connected layer, seven sequentially connected residual generation blocks connected to the fully connected layer, and three sequentially connected accumulation generation blocks connected in one-to-one correspondence with the last three residual generation blocks;
the fully connected layer is configured to up-dimension the feature output by the condition augmentation module and reshape it into a 4-dimensional feature;
the residual generation blocks are configured to generate features of different sizes;
the accumulation generation blocks are configured to fuse features of different sizes using a pyramid network structure, thereby generating RGB images of different sizes;
the three independent discriminators are connected in one-to-one correspondence with the three accumulation generation blocks of the generator, and are configured to judge the quality of the RGB images of different sizes output by the generator and return the judgment results to the generator.
As a further optimization, a perceptual loss function is imposed on the generator to improve the semantic consistency and diversity of the generated images.
As a further optimization, each discriminator is provided with a matching loss function for judging whether the generated image semantically matches the input text and a local image loss function for judging whether the generated image is locally realistic, and the last discriminator is additionally provided with a classification information loss function for classifying the generated image.
As a further optimization, the residual generation block comprises an up-sampling block, two 3*3 convolution units, and an accumulator; the input of the up-sampling block is connected to the output of the previous residual generation block; the output of the up-sampling block is connected to one input of the accumulator, and it is also passed sequentially through the convolution operations of the two 3*3 convolution units before being connected to the other input of the accumulator.
As a further optimization, the accumulation generation block comprises a 1*1 convolution unit, an up-sampling block, two 3*3 convolution units, and an accumulator; the input of the 1*1 convolution unit is connected to the output of the residual generation block; the output of the 1*1 convolution unit is connected to the input of the up-sampling block; the output of the up-sampling block and the output of the previous accumulation generation block are connected to the two inputs of the accumulator; the output of the accumulator passes through one 3*3 convolution unit to output a higher-level feature, and the output of that 3*3 convolution unit is connected to the input of the other 3*3 convolution unit, which outputs the RGB image.
As a further optimization, the three independent discriminators include a first discriminator, a second discriminator, and a third discriminator.
As a further optimization, the first discriminator and the second discriminator each comprise: a multilayer convolutional network unit, a first 4*4 convolution unit, a second 4*4 convolution unit, a first fully connected layer, a spatial replication unit, a channel concatenation unit, and a 1*1 convolution unit; the input of the multilayer convolutional network unit is connected to the RGB image output by the accumulation generation block; the output of the multilayer convolutional network unit is connected to one input of the channel concatenation unit and is also connected to the local image loss function through the first 4*4 convolution unit; the input of the first fully connected layer is connected to the text feature representation output by the text encoder; the output of the first fully connected layer is spatially replicated by the spatial replication unit and then connected to the other input of the channel concatenation unit; the output of the channel concatenation unit passes sequentially through the convolution operations of the 1*1 convolution unit and the second 4*4 convolution unit and is connected to the matching loss function.
As a further optimization, the third discriminator comprises: a multilayer convolutional network unit, a first 4*4 convolution unit, a second 4*4 convolution unit, a third 4*4 convolution unit, a first fully connected layer, a second fully connected layer, a spatial replication unit, a channel concatenation unit, and a 1*1 convolution unit; the input of the multilayer convolutional network unit is connected to the RGB image output by the accumulation generation block; the output of the multilayer convolutional network unit is connected to one input of the channel concatenation unit and is also connected to the local image loss function through the first 4*4 convolution unit; the input of the first fully connected layer is connected to the text feature representation output by the text encoder; the output of the first fully connected layer is spatially replicated by the spatial replication unit and then connected to the other input of the channel concatenation unit; the output of the channel concatenation unit passes sequentially through the convolution operations of the 1*1 convolution unit and the second 4*4 convolution unit and is connected to the matching loss function; and the output of the 1*1 convolution unit is connected to the classification information loss function through the third 4*4 convolution unit and the second fully connected layer.
In addition, the present invention also provides a method for realizing text-to-image generation based on the above highly fused GAN network model, comprising the following steps:
inputting text into the trained text encoder and outputting an encoded feature representation;
sampling a condition feature representation of a certain dimension using the condition augmentation module, then concatenating it with noise along the channel dimension and inputting the result into the generator network;
in the generator network, up-dimensioning the feature through the fully connected layer and reshaping it into a 4-dimensional feature, then inputting it into the seven consecutive residual generation blocks; further inputting the features of different dimensions output by the last three residual generation blocks into the corresponding accumulation generation blocks, and outputting RGB images of different sizes through the convolution operations of the accumulation generation blocks;
inputting the generated RGB images of different sizes into the corresponding independent discriminators, judging the quality of the images through the loss functions imposed on the independent discriminators, computing the image gradients, back-propagating them into the whole generator network, and updating the parameters of the independent discriminators and of the whole generator network.
As a further optimization, judging the quality of the images through the loss functions imposed on the independent discriminators specifically includes: judging whether the generated image semantically matches the input text through the matching loss functions imposed on the three independent discriminators, and judging whether the generated image is locally realistic through the imposed local image loss functions; in addition, for the last of the three independent discriminators, the generated image is also classified through the imposed classification information loss function.
The beneficial effects of the present invention are:
1) By borrowing the pyramid network structure for feature fusion, the intermediate features generated inside the deep network are effectively utilized to produce high-quality image features that better match the text semantics.
2) The perceptual loss function is effectively utilized to optimize the generator structure of the GAN network and enrich the semantic information of the image features.
3) Multiple discrimination loss functions, such as the matching loss function, the local image loss function, and the classification loss function, are effectively utilized to optimize the discriminator structure of the GAN network and improve its discrimination ability, further improving the quality of the generated images.
4) Using the GAN network architecture proposed by the present invention, the training process can be stabilized and the training time reduced.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structure of the highly fused GAN network in the embodiment of the present invention;
Fig. 2 is a schematic diagram of the structure of a residual generation block;
Fig. 3 is a schematic diagram of the structure of an accumulation generation block;
Fig. 4 is a schematic diagram of the structure of a discriminator.
Specific embodiment
The present invention aims to provide a highly fused GAN network model and a method for realizing text-to-image generation, which address the problems of small generated image size, low image quality, and unstable network training in traditional techniques, and effectively realize the generation of clear, high-quality semantic images from input text.
The core idea of the present invention is: to reduce the training cost as much as possible, a highly unified, structured GAN network model is designed that can still produce high-quality RGB images matching the text semantics with only one generator and three independent discriminators. To further optimize the generator network structure and make full use of the feature maps of different sizes produced by the intermediate layers of the network, the generator not only uses the residual generation blocks of a residual network, but also uses a pyramid network structure to progressively grow low-dimensional 64*64 features into semantically rich high-dimensional 256*256 features.
Embodiment:
As shown in Fig. 1, the highly fused GAN network model in this embodiment comprises: a text encoder, a condition augmentation module, a generator, and three independent discriminators;
the text encoder is configured to output an encoded feature representation for the text input to the text encoder;
the condition augmentation module is configured to sample a condition feature representation of a certain dimension from the encoded feature representation output by the text encoder, concatenate it with noise along the channel dimension, and input the result into the generator network;
the generator comprises a fully connected layer, seven sequentially connected residual generation blocks connected to the fully connected layer, and three sequentially connected accumulation generation blocks connected in one-to-one correspondence with the last three residual generation blocks;
the fully connected layer is configured to up-dimension the feature output by the condition augmentation module and reshape it into a 4-dimensional feature;
the residual generation blocks are configured to generate features of different sizes;
the accumulation generation blocks are configured to fuse features of different sizes using a pyramid network structure, thereby generating RGB images of different sizes;
the three independent discriminators are connected in one-to-one correspondence with the three accumulation generation blocks of the generator, and are configured to judge the quality of the RGB images of different sizes output by the generator and return the judgment results to the generator.
In specific implementation, the features of each size are first generated by the residual generation blocks. As shown in Fig. 2, a residual generation block comprises an up-sampling block, two 3*3 convolution units, and an accumulator. The input of the up-sampling block is connected to the output of the previous residual generation block; the output of the up-sampling block is connected to one input of the accumulator, and it is also passed sequentially through the convolution operations of the two 3*3 convolution units before being connected to the other input of the accumulator.
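For illustration, a minimal PyTorch sketch of the residual generation block just described is given below. The block topology (the up-sampling block feeding both the accumulator and the two 3*3 convolution units) follows the description; the channel counts, normalization layers, and activations are assumptions not specified by the patent.

import torch
import torch.nn as nn

class ResidualGenerationBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # "up-sampling block": nearest-neighbour upsample plus a channel projection
        # (the 1*1 projection is an assumption made to keep channel counts consistent)
        self.upsample = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
        )
        # the two 3*3 convolution units on the residual branch
        self.residual = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        up = self.upsample(x)           # fed directly to the accumulator...
        return up + self.residual(up)   # ...and added to the two-convolution branch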
To enrich the feature representation at each size, the present invention proposes an accumulation generation block, which uses a pyramid network structure to fuse features of different sizes. As shown in Fig. 3, an accumulation generation block comprises a 1*1 convolution unit, an up-sampling block, two 3*3 convolution units, and an accumulator. The input of the 1*1 convolution unit is connected to the output of the residual generation block; the output of the 1*1 convolution unit is connected to the input of the up-sampling block; the output of the up-sampling block and the output of the previous accumulation generation block are connected to the two inputs of the accumulator; the output of the accumulator passes through one 3*3 convolution unit to output a higher-level feature, and the output of that 3*3 convolution unit is connected to the input of the other 3*3 convolution unit, which outputs the RGB image.
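A minimal PyTorch sketch of the accumulation generation block follows. The translated description is ambiguous about which branch passes through the up-sampling block; in this sketch it is the previous level's accumulated feature that is upsampled, so that the three RGB outputs come out at the 64*64, 128*128, and 256*256 resolutions stated in the embodiment. That reading, the handling of the first block in the chain (which has no previous accumulated feature), and the channel counts are all assumptions.

import torch
import torch.nn as nn

class AccumulationGenerationBlock(nn.Module):
    def __init__(self, in_ch, fuse_ch):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, fuse_ch, kernel_size=1)           # 1*1 convolution unit
        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")       # up-sampling block
        self.refine = nn.Conv2d(fuse_ch, fuse_ch, kernel_size=3, padding=1)  # first 3*3 conv: higher-level feature
        self.to_rgb = nn.Conv2d(fuse_ch, 3, kernel_size=3, padding=1)         # second 3*3 conv: RGB image

    def forward(self, residual_feat, prev_accum=None):
        x = self.reduce(residual_feat)
        if prev_accum is not None:                 # accumulator fusing the lower pyramid level
            x = x + self.upsample(prev_accum)
        fused = torch.relu(self.refine(x))
        rgb = torch.tanh(self.to_rgb(fused))
        return fused, rgb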
The structure of the discriminators in the present invention is shown in Fig. 4, where the dashed box is the part exclusive to the third discriminator and the remaining parts are shared by all three discriminators. Each discriminator comprises: a multilayer convolutional network unit, a first 4*4 convolution unit, a second 4*4 convolution unit, a first fully connected layer, a spatial replication unit, a channel concatenation unit, and a 1*1 convolution unit. The input of the multilayer convolutional network unit is connected to the RGB image output by the accumulation generation block; the output of the multilayer convolutional network unit is connected to one input of the channel concatenation unit and is also connected to the local image loss function through the first 4*4 convolution unit; the input of the first fully connected layer is connected to the text feature representation output by the text encoder; the output of the first fully connected layer is spatially replicated by the spatial replication unit and then connected to the other input of the channel concatenation unit; the output of the channel concatenation unit passes sequentially through the convolution operations of the 1*1 convolution unit and the second 4*4 convolution unit and is connected to the matching loss function.
During training, it was observed that the generated images showed little distinction between objects of different classes. To further improve the degree of discrimination, in this embodiment a classification information loss function is imposed only on the discriminator for the large 256*256 images (the third discriminator). The structure of the third discriminator therefore further includes, on top of the above structure, a second fully connected layer and a third 4*4 convolution unit; the output of the 1*1 convolution unit is connected to the classification information loss function through the third 4*4 convolution unit and the second fully connected layer.
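The following PyTorch sketch mirrors this wiring: a multilayer convolutional backbone, a first 4*4 convolution for the local image score, spatial replication and channel concatenation of the text feature, a 1*1 convolution followed by a second 4*4 convolution for the matching score, and an optional class branch (third 4*4 convolution and second fully connected layer) carried only by the third discriminator. The backbone depth, channel schedule, and the 4*4 bottleneck size are assumptions for illustration.

import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, img_size, text_dim, base_ch=64, num_classes=None):
        super().__init__()
        layers, ch, size = [], 3, img_size
        while size > 4:                                   # multilayer convolutional network unit
            layers += [nn.Conv2d(ch, base_ch, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch, base_ch, size = base_ch, min(base_ch * 2, 512), size // 2
        self.backbone = nn.Sequential(*layers)
        self.local_head = nn.Conv2d(ch, 1, 4)             # first 4*4 conv -> local image loss
        self.text_fc = nn.Linear(text_dim, 128)           # first fully connected layer
        self.joint = nn.Sequential(
            nn.Conv2d(ch + 128, ch, 1),                   # 1*1 convolution unit
            nn.LeakyReLU(0.2, inplace=True),
        )
        self.match_head = nn.Conv2d(ch, 1, 4)             # second 4*4 conv -> matching loss
        self.class_head = None
        if num_classes is not None:                       # third discriminator only
            self.class_head = nn.Sequential(
                nn.Conv2d(ch, ch, 4),                     # third 4*4 convolution unit
                nn.Flatten(),
                nn.Linear(ch, num_classes),               # second fully connected layer
            )

    def forward(self, img, text_feat):
        f = self.backbone(img)                            # B x ch x 4 x 4
        local_score = self.local_head(f)                  # local realism score
        t = self.text_fc(text_feat)
        t = t.view(t.size(0), -1, 1, 1).expand(-1, -1, f.size(2), f.size(3))  # spatial replication unit
        j = self.joint(torch.cat([f, t], dim=1))          # channel concatenation unit
        match_score = self.match_head(j)                  # text-image matching score
        class_logits = self.class_head(j) if self.class_head is not None else None
        return local_score, match_score, class_logits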
In addition, a perceptual loss function is imposed on the generator to improve the semantic consistency and diversity of the generated images.
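The patent does not specify how the perceptual loss is computed. A common instantiation, assumed here purely for illustration, compares intermediate features of a pretrained VGG-16 network between the generated image and the real image:

import torch
import torch.nn as nn
import torchvision.models as models

class PerceptualLoss(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(pretrained=True).features[:16]   # up to relu3_3 (an assumption)
        for p in vgg.parameters():
            p.requires_grad = False                          # the feature extractor is frozen
        self.vgg = vgg.eval()
        self.criterion = nn.L1Loss()

    def forward(self, fake, real):
        # normalization to ImageNet statistics is omitted for brevity
        return self.criterion(self.vgg(fake), self.vgg(real))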
Based on the highly fused GAN network model of the above embodiment, the present invention also provides a method for realizing text-to-image generation using this model, comprising the following implementation steps:
Step 1: Input text into the trained text encoder and output an encoded feature representation. Because the dimension of this feature is high, it is not conducive to the network learning an accurate mapping, so the condition augmentation module is used to sample a condition feature representation of an appropriate dimension, which is then concatenated with noise along the channel dimension and input into the generator network. The condition augmentation module is built on the theory of the Variational Auto-Encoder (VAE); in order to make the random distribution constructed from the condition variable sufficiently close to a standard Gaussian distribution, a KL divergence loss function is imposed on the condition augmentation module.
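A minimal sketch of the condition augmentation step is given below: the encoded text feature is mapped to a mean and log-variance, a lower-dimensional condition vector is sampled with the reparameterization trick, and a KL divergence term keeps the constructed distribution close to a standard Gaussian. The text feature, condition, and noise dimensions are assumptions.

import torch
import torch.nn as nn

class ConditionAugmentation(nn.Module):
    def __init__(self, text_dim=1024, cond_dim=128):
        super().__init__()
        self.fc = nn.Linear(text_dim, cond_dim * 2)

    def forward(self, text_feat):
        mu, logvar = self.fc(text_feat).chunk(2, dim=1)
        cond = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)       # reparameterization
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())    # KL divergence loss
        return cond, kl

# Usage: the sampled condition is concatenated with noise along the channel dimension.
# ca = ConditionAugmentation()
# cond, kl_loss = ca(text_feature)                       # text_feature: B x 1024
# z = torch.cat([cond, torch.randn(cond.size(0), 100)], dim=1)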
Step 2: Up-dimension the feature through the fully connected layer and reshape it into a 4-dimensional feature, then input it into the seven consecutive residual generation blocks. To fully enhance the feature representations of different dimensions, the present invention further inputs the features of the three different dimensions 64*64, 128*128, and 256*256 into the accumulation generation blocks, and then outputs RGB images of different sizes through convolution operations.
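The sketch below ties the pieces of this step together, reusing the ResidualGenerationBlock and AccumulationGenerationBlock classes sketched earlier. The starting 2*2 spatial size and the channel schedule are assumptions; the patent fixes only the counts (one fully connected layer, seven residual generation blocks, three accumulation generation blocks) and the 64*64/128*128/256*256 output resolutions.

import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, in_dim=228):
        super().__init__()
        chans = [512, 512, 256, 256, 128, 128, 64, 64]          # assumed channel schedule
        self.fc = nn.Linear(in_dim, chans[0] * 2 * 2)            # feature up-dimensioning
        self.res_blocks = nn.ModuleList(
            [ResidualGenerationBlock(chans[i], chans[i + 1]) for i in range(7)]
        )
        # accumulation generation blocks attached to the last three residual blocks
        self.acc_blocks = nn.ModuleList(
            [AccumulationGenerationBlock(c, 64) for c in chans[-3:]]
        )

    def forward(self, cond_noise):
        x = self.fc(cond_noise).view(cond_noise.size(0), -1, 2, 2)   # reshape into a 4-D feature
        feats = []
        for block in self.res_blocks:
            x = block(x)
            feats.append(x)
        images, prev = [], None
        for feat, acc in zip(feats[-3:], self.acc_blocks):       # 64, 128, 256 features
            prev, rgb = acc(feat, prev)
            images.append(rgb)                                   # 64*64, 128*128, 256*256 RGB
        return images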
A perceptual loss function is imposed on the 256*256 image; its gradient is computed and the whole generator network is updated through back-propagation.
Step 3: To guarantee the quality of the image at each size, each size is followed by an independent discriminator. A matching loss function and a local image loss function are imposed for all sizes, and a classification information loss function is additionally imposed for the 256*256 image. During forward propagation, the present invention generates the three RGB images of different sizes in a single pass and outputs them to the corresponding independent discriminators, which judge whether the generated image semantically matches the input text based on the matching loss function, judge whether the generated image is locally realistic based on the local image loss function, and classify the generated image based on the classification information loss function. During back-propagation, the three discriminators compute their gradients, which are propagated back into the whole generator, and the parameters of the three independent discriminators and of the whole generator network are updated.
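Finally, a compressed sketch of one training iteration in this step, reusing the sketches above. The adversarial criterion (binary cross-entropy here), the loss weights, and the optimizers are assumptions; the patent specifies only which loss terms are imposed on which discriminator and that the classification term applies to the 256*256 discriminator alone. The generator optimizer opt_g is assumed to cover both the generator and the condition augmentation parameters.

import torch
import torch.nn.functional as F

def train_step(text_feat, real_imgs, labels, ca, G, Ds, perc, opt_g, opt_ds):
    # real_imgs: real images at 64, 128, 256; Ds: the three independent discriminators
    cond, kl = ca(text_feat)
    z = torch.cat([cond, torch.randn(cond.size(0), 100)], dim=1)
    fakes = G(z)                                          # three RGB images in a single pass

    # update the three independent discriminators
    for D, opt_d, real, fake in zip(Ds, opt_ds, real_imgs, fakes):
        local_r, match_r, cls_r = D(real, text_feat)
        local_f, match_f, cls_f = D(fake.detach(), text_feat)
        d_loss = (F.binary_cross_entropy_with_logits(match_r, torch.ones_like(match_r)) +
                  F.binary_cross_entropy_with_logits(match_f, torch.zeros_like(match_f)) +
                  F.binary_cross_entropy_with_logits(local_r, torch.ones_like(local_r)) +
                  F.binary_cross_entropy_with_logits(local_f, torch.zeros_like(local_f)))
        if cls_r is not None:                             # classification loss, third discriminator only
            d_loss = d_loss + F.cross_entropy(cls_r, labels)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # update the generator through all three discriminators
    g_loss = kl                                           # KL divergence term from step 1
    for D, fake in zip(Ds, fakes):
        local_f, match_f, cls_f = D(fake, text_feat)
        g_loss = g_loss + F.binary_cross_entropy_with_logits(match_f, torch.ones_like(match_f))
        g_loss = g_loss + F.binary_cross_entropy_with_logits(local_f, torch.ones_like(local_f))
        if cls_f is not None:
            g_loss = g_loss + F.cross_entropy(cls_f, labels)
    g_loss = g_loss + perc(fakes[-1], real_imgs[-1])      # perceptual loss on the 256*256 image
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()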

Claims (10)

1. A highly fused GAN network model, characterized by comprising: a text encoder, a condition augmentation module, a generator, and three independent discriminators;
the text encoder is configured to output an encoded feature representation for the text input to the text encoder;
the condition augmentation module is configured to sample a condition feature representation of a certain dimension from the encoded feature representation output by the text encoder, concatenate it with noise along the channel dimension, and input the result into the generator network;
the generator comprises a fully connected layer, seven sequentially connected residual generation blocks connected to the fully connected layer, and three sequentially connected accumulation generation blocks connected in one-to-one correspondence with the last three residual generation blocks;
the fully connected layer is configured to up-dimension the feature output by the condition augmentation module and reshape it into a 4-dimensional feature;
the residual generation blocks are configured to generate features of different sizes;
the accumulation generation blocks are configured to fuse features of different sizes using a pyramid network structure, thereby generating RGB images of different sizes;
the three independent discriminators are connected in one-to-one correspondence with the three accumulation generation blocks of the generator, and are configured to judge the quality of the RGB images of different sizes output by the generator and return the judgment results to the generator.
2. The highly fused GAN network model of claim 1, characterized in that a perceptual loss function is imposed on the generator to improve the semantic consistency and diversity of the generated images.
3. The highly fused GAN network model of claim 1, characterized in that each discriminator is provided with a matching loss function for judging whether the generated image semantically matches the input text and a local image loss function for judging whether the generated image is locally realistic, and the last discriminator is additionally provided with a classification information loss function for classifying the generated image.
4. The highly fused GAN network model of claim 1, characterized in that the residual generation block comprises an up-sampling block, two 3*3 convolution units, and an accumulator; the input of the up-sampling block is connected to the output of the previous residual generation block; the output of the up-sampling block is connected to one input of the accumulator, and it is also passed sequentially through the convolution operations of the two 3*3 convolution units before being connected to the other input of the accumulator.
5. The highly fused GAN network model of claim 1, characterized in that the accumulation generation block comprises a 1*1 convolution unit, an up-sampling block, two 3*3 convolution units, and an accumulator; the input of the 1*1 convolution unit is connected to the output of the residual generation block; the output of the 1*1 convolution unit is connected to the input of the up-sampling block; the output of the up-sampling block and the output of the previous accumulation generation block are connected to the two inputs of the accumulator; the output of the accumulator passes through one 3*3 convolution unit to output a higher-level feature, and the output of that 3*3 convolution unit is connected to the input of the other 3*3 convolution unit, which outputs the RGB image.
6. The highly fused GAN network model of claim 1, characterized in that the three independent discriminators include a first discriminator, a second discriminator, and a third discriminator.
7. The highly fused GAN network model of claim 6, characterized in that the first discriminator and the second discriminator each comprise: a multilayer convolutional network unit, a first 4*4 convolution unit, a second 4*4 convolution unit, a first fully connected layer, a spatial replication unit, a channel concatenation unit, and a 1*1 convolution unit; the input of the multilayer convolutional network unit is connected to the RGB image output by the accumulation generation block; the output of the multilayer convolutional network unit is connected to one input of the channel concatenation unit and is also connected to the local image loss function through the first 4*4 convolution unit; the input of the first fully connected layer is connected to the text feature representation output by the text encoder; the output of the first fully connected layer is spatially replicated by the spatial replication unit and then connected to the other input of the channel concatenation unit; the output of the channel concatenation unit passes sequentially through the convolution operations of the 1*1 convolution unit and the second 4*4 convolution unit and is connected to the matching loss function.
8. The highly fused GAN network model of claim 6, characterized in that the third discriminator comprises: a multilayer convolutional network unit, a first 4*4 convolution unit, a second 4*4 convolution unit, a third 4*4 convolution unit, a first fully connected layer, a second fully connected layer, a spatial replication unit, a channel concatenation unit, and a 1*1 convolution unit; the input of the multilayer convolutional network unit is connected to the RGB image output by the accumulation generation block; the output of the multilayer convolutional network unit is connected to one input of the channel concatenation unit and is also connected to the local image loss function through the first 4*4 convolution unit; the input of the first fully connected layer is connected to the text feature representation output by the text encoder; the output of the first fully connected layer is spatially replicated by the spatial replication unit and then connected to the other input of the channel concatenation unit; the output of the channel concatenation unit passes sequentially through the convolution operations of the 1*1 convolution unit and the second 4*4 convolution unit and is connected to the matching loss function; and the output of the 1*1 convolution unit is connected to the classification information loss function through the third 4*4 convolution unit and the second fully connected layer.
9. A method for realizing text-to-image generation, characterized in that text is processed using the highly fused GAN network model of any one of claims 1-8, comprising the following steps:
inputting text into the trained text encoder and outputting an encoded feature representation;
sampling a condition feature representation of a certain dimension using the condition augmentation module, then concatenating it with noise along the channel dimension and inputting the result into the generator network;
in the generator network, up-dimensioning the feature through the fully connected layer and reshaping it into a 4-dimensional feature, then inputting it into the seven consecutive residual generation blocks; further inputting the features of different dimensions output by the last three residual generation blocks into the corresponding accumulation generation blocks, and outputting RGB images of different sizes through the convolution operations of the accumulation generation blocks;
inputting the generated RGB images of different sizes into the corresponding independent discriminators, judging the quality of the images through the loss functions imposed on the independent discriminators, computing the image gradients, back-propagating them into the whole generator network, and updating the parameters of the independent discriminators and of the whole generator network.
10. The method for realizing text-to-image generation of claim 9, characterized in that judging the quality of the images through the loss functions imposed on the independent discriminators specifically includes: judging whether the generated image semantically matches the input text through the matching loss functions imposed on the three independent discriminators, and judging whether the generated image is locally realistic through the imposed local image loss functions; in addition, for the last of the three independent discriminators, the generated image is also classified through the imposed classification information loss function.
CN201811542578.4A 2018-12-17 2018-12-17 Highly-integrated GAN network device and method for realizing text image generation Active CN109671125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811542578.4A CN109671125B (en) 2018-12-17 2018-12-17 Highly-integrated GAN network device and method for realizing text image generation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811542578.4A CN109671125B (en) 2018-12-17 2018-12-17 Highly-integrated GAN network device and method for realizing text image generation

Publications (2)

Publication Number Publication Date
CN109671125A true CN109671125A (en) 2019-04-23
CN109671125B CN109671125B (en) 2023-04-07

Family

ID=66144473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811542578.4A Active CN109671125B (en) 2018-12-17 2018-12-17 Highly-integrated GAN network device and method for realizing text image generation

Country Status (1)

Country Link
CN (1) CN109671125B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163267A (en) * 2019-05-09 2019-08-23 厦门美图之家科技有限公司 A kind of method that image generates the training method of model and generates image
CN110335212A (en) * 2019-06-28 2019-10-15 西安理工大学 Defect ancient books Chinese character restorative procedure based on condition confrontation network
CN110572696A (en) * 2019-08-12 2019-12-13 浙江大学 variational self-encoder and video generation method combining generation countermeasure network
CN110717555A (en) * 2019-12-12 2020-01-21 江苏联著实业股份有限公司 Picture generation system and device based on natural language and generation countermeasure network
CN110909181A (en) * 2019-09-30 2020-03-24 中国海洋大学 Cross-modal retrieval method and system for multi-type ocean data
CN110930469A (en) * 2019-10-25 2020-03-27 北京大学 Text image generation method and system based on transition space mapping
CN111858882A (en) * 2020-06-24 2020-10-30 贵州大学 Text visual question-answering system and method based on concept interaction and associated semantics
CN111898461A (en) * 2020-07-08 2020-11-06 贵州大学 Time sequence behavior segment generation method
CN113140019A (en) * 2021-05-13 2021-07-20 电子科技大学 Method for generating text-generated image of confrontation network based on fusion compensation

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017101166A4 (en) * 2017-08-25 2017-11-02 Lai, Haodong MR A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
CN107392973A (en) * 2017-06-06 2017-11-24 中国科学院自动化研究所 Pixel-level handwritten Chinese character automatic generation method, storage device, processing unit
US20170351935A1 (en) * 2016-06-01 2017-12-07 Mitsubishi Electric Research Laboratories, Inc Method and System for Generating Multimodal Digital Images
CN107862377A (en) * 2017-11-14 2018-03-30 华南理工大学 A kind of packet convolution method that confrontation network model is generated based on text image
CN108416752A (en) * 2018-03-12 2018-08-17 中山大学 A method of image is carried out based on production confrontation network and removes motion blur
CN108460812A (en) * 2018-04-04 2018-08-28 北京红云智胜科技有限公司 A kind of expression packet generation system and method based on deep learning
CN108460717A (en) * 2018-03-14 2018-08-28 儒安科技有限公司 A kind of image generating method of the generation confrontation network based on double arbiters
CN108508629A (en) * 2017-02-26 2018-09-07 瑞云诺伐有限公司 Intelligent contact eyeglass and method with eyes driving control system
CN108510532A (en) * 2018-03-30 2018-09-07 西安电子科技大学 Optics and SAR image registration method based on depth convolution GAN
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
CN108537742A (en) * 2018-03-09 2018-09-14 天津大学 A kind of panchromatic sharpening method of remote sensing images based on generation confrontation network
CN108596265A (en) * 2018-05-02 2018-09-28 中山大学 Model is generated based on text description information and the video for generating confrontation network
CN108765319A (en) * 2018-05-09 2018-11-06 大连理工大学 A kind of image de-noising method based on generation confrontation network

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351935A1 (en) * 2016-06-01 2017-12-07 Mitsubishi Electric Research Laboratories, Inc Method and System for Generating Multimodal Digital Images
CN108508629A (en) * 2017-02-26 2018-09-07 瑞云诺伐有限公司 Intelligent contact eyeglass and method with eyes driving control system
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
CN107392973A (en) * 2017-06-06 2017-11-24 中国科学院自动化研究所 Pixel-level handwritten Chinese character automatic generation method, storage device, processing unit
AU2017101166A4 (en) * 2017-08-25 2017-11-02 Lai, Haodong MR A Method For Real-Time Image Style Transfer Based On Conditional Generative Adversarial Networks
CN107862377A (en) * 2017-11-14 2018-03-30 华南理工大学 A kind of packet convolution method that confrontation network model is generated based on text image
CN108537742A (en) * 2018-03-09 2018-09-14 天津大学 A kind of panchromatic sharpening method of remote sensing images based on generation confrontation network
CN108416752A (en) * 2018-03-12 2018-08-17 中山大学 A method of image is carried out based on production confrontation network and removes motion blur
CN108460717A (en) * 2018-03-14 2018-08-28 儒安科技有限公司 A kind of image generating method of the generation confrontation network based on double arbiters
CN108510532A (en) * 2018-03-30 2018-09-07 西安电子科技大学 Optics and SAR image registration method based on depth convolution GAN
CN108460812A (en) * 2018-04-04 2018-08-28 北京红云智胜科技有限公司 A kind of expression packet generation system and method based on deep learning
CN108596265A (en) * 2018-05-02 2018-09-28 中山大学 Model is generated based on text description information and the video for generating confrontation network
CN108765319A (en) * 2018-05-09 2018-11-06 大连理工大学 A kind of image de-noising method based on generation confrontation network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HUANG, Hualong, et al.: "Stereo Matching Using Conditional Adversarial Networks", Lecture Notes in Computer Science *
HENSMAN, Paulina, et al.: "cGAN-Based Manga Colorization Using a Single Training Image", 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) *
ZHANG, Yuqing, et al.: "Current Status, Trends and Prospects of Deep Learning Applied to Cyberspace Security", Journal of Computer Research and Development *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163267A (en) * 2019-05-09 2019-08-23 厦门美图之家科技有限公司 A kind of method that image generates the training method of model and generates image
CN110335212B (en) * 2019-06-28 2021-01-15 西安理工大学 Defect ancient book Chinese character repairing method based on condition confrontation network
CN110335212A (en) * 2019-06-28 2019-10-15 西安理工大学 Defect ancient books Chinese character restorative procedure based on condition confrontation network
CN110572696A (en) * 2019-08-12 2019-12-13 浙江大学 variational self-encoder and video generation method combining generation countermeasure network
CN110572696B (en) * 2019-08-12 2021-04-20 浙江大学 Variational self-encoder and video generation method combining generation countermeasure network
CN110909181A (en) * 2019-09-30 2020-03-24 中国海洋大学 Cross-modal retrieval method and system for multi-type ocean data
CN110930469A (en) * 2019-10-25 2020-03-27 北京大学 Text image generation method and system based on transition space mapping
CN110930469B (en) * 2019-10-25 2021-11-16 北京大学 Text image generation method and system based on transition space mapping
CN110717555A (en) * 2019-12-12 2020-01-21 江苏联著实业股份有限公司 Picture generation system and device based on natural language and generation countermeasure network
CN111858882A (en) * 2020-06-24 2020-10-30 贵州大学 Text visual question-answering system and method based on concept interaction and associated semantics
CN111858882B (en) * 2020-06-24 2022-08-09 贵州大学 Text visual question-answering system and method based on concept interaction and associated semantics
CN111898461A (en) * 2020-07-08 2020-11-06 贵州大学 Time sequence behavior segment generation method
CN113140019A (en) * 2021-05-13 2021-07-20 电子科技大学 Method for generating text-generated image of confrontation network based on fusion compensation
CN113140019B (en) * 2021-05-13 2022-05-31 电子科技大学 Method for generating text-generated image of confrontation network based on fusion compensation

Also Published As

Publication number Publication date
CN109671125B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN109671125A (en) A kind of GAN network model that height merges and the method for realizing text generation image
CN107480206B (en) Multi-mode low-rank bilinear pooling-based image content question-answering method
CN110097550B (en) Medical image segmentation method and system based on deep learning
CN111260740B (en) Text-to-image generation method based on generation countermeasure network
CN109903223A (en) A kind of image super-resolution method based on dense connection network and production confrontation network
CN110377686A (en) A kind of address information Feature Extraction Method based on deep neural network model
CN109543502A (en) A kind of semantic segmentation method based on the multiple dimensioned neural network of depth
CN107679491A (en) A kind of 3D convolutional neural networks sign Language Recognition Methods for merging multi-modal data
CN108710906B (en) Real-time point cloud model classification method based on lightweight network LightPointNet
CN110070107A (en) Object identification method and device
CN108765512B (en) Confrontation image generation method based on multi-level features
CN109978021B (en) Double-flow video generation method based on different feature spaces of text
CN113255813B (en) Multi-style image generation method based on feature fusion
CN110020681A (en) Point cloud feature extracting method based on spatial attention mechanism
CN109508400A (en) Picture and text abstraction generating method
CN110443173A (en) A kind of instance of video dividing method and system based on inter-frame relation
CN111797814A (en) Unsupervised cross-domain action recognition method based on channel fusion and classifier confrontation
CN108664885A (en) Human body critical point detection method based on multiple dimensioned Cascade H ourGlass networks
CN107145514A (en) Chinese sentence pattern sorting technique based on decision tree and SVM mixed models
CN114118012A (en) Method for generating personalized fonts based on cycleGAN
CN108154156A (en) Image Ensemble classifier method and device based on neural topic model
CN110516724A (en) Visualize the high-performance multilayer dictionary learning characteristic image processing method of operation scene
Liang Intelligent emotion evaluation method of classroom teaching based on expression recognition
CN110210419A (en) The scene Recognition system and model generating method of high-resolution remote sensing image
CN108932484A (en) A kind of facial expression recognizing method based on Capsule Net

Legal Events

Date Code Title Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant