CN107330444A - Automatic image text annotation method based on generative adversarial network - Google Patents

Automatic image text annotation method based on generative adversarial network

Info

Publication number
CN107330444A
Authority
CN
China
Prior art keywords
sentence
generation
image
discriminator
generator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710396148.5A
Other languages
Chinese (zh)
Inventor
胡伏原
吕凡
沈军宇
孙钰
李林燕
李宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University of Science and Technology
Original Assignee
Suzhou University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University of Science and Technology
Priority to CN201710396148.5A
Publication of CN107330444A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Library & Information Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic image text annotation method based on a generative adversarial network, comprising the following steps: fake sentences are produced by a generator, and a discriminator is newly constructed; the generated sentences and real sentences are input for training until the discriminator can no longer distinguish real sentences from generated ones. The invention addresses the stiff, inflexible sentences produced by CNN-RNN automatic image sentence annotation and makes the generated sentences more accurate, natural, and diverse. The generated sentences can cope with the more complex scenes found in practice, conform more closely to the way humans describe images in language, and are more widely applicable in practice.

Description

Automatic image text annotation method based on generative adversarial network
Technical field
The present invention relates to the field of image sentence annotation, and in particular to an automatic image text annotation method based on a generative adversarial network.
Background art
In recent years, the problem of automatic sentence annotation of images has been widely studied. Because it involves not only object recognition in the image itself but also natural language processing, the main current approaches can be summarized as the following three kinds:
Semantic template filling: This method obtains the specific objects in an image and puts the category text representing those objects into a fixed natural language generation template to generate a sentence automatically. Some methods use the object recognition results to form a simple sentence containing three fixed semantic elements. Other methods also put the relations between the recognized objects into the same template, forming a sentence with richer semantics.
Feature space matching: This method constructs a large number of sentences in advance, projects both the image and the constructed sentences into a high-dimensional feature space, and finds the matching sentence whose features are closest. Some methods construct multiple kernels and compare the data of each data space by ranking to find the relation between them. Other methods analyze the noisy titles, tags, or descriptions that may accompany a picture to supply more useful information for this feature space mapping.
CNN-RNN: This method extracts image features with a CNN (convolutional neural network) and feeds them into an RNN [29] (recurrent neural network). Using NLP (natural language processing) training methods, it trains a sentence generation module and allows end-to-end training. Some methods feed the extracted image features directly into the recurrent module, passing them to an LSTM recurrent neural network to obtain the annotation result; this model performs comparatively well.
Although these conventional methods solve the annotation problem to some extent, each still has shortcomings:
Semantic template filling: This template-based automatic image text annotation algorithm can construct sentences that fit the template to a certain extent, but in practice its expressive power is very weak and the scenes it can handle are quite limited.
Feature space matching: This approach requires a large amount of sentence data, and in essence it does not generate sentences but matches existing ones, so it cannot cope with the complex scenes found in practice.
CNN-RNN: Although this method overcomes the defects of the two preceding methods to some extent, it is trained by maximum likelihood estimation, so the generated annotations are very close to the sample sentences but still fall somewhat short of real usage. The generated sentences lack vivid, natural expression and appear stiff and inflexible compared with human language.
In recent years, the generative adversarial network (GAN, Generative Adversarial Networks) has received great attention from academia and industry and has become one of the most popular research fields of the past two years. Unlike traditional machine learning methods, the most distinctive feature of a GAN is that it introduces an adversarial mechanism and can be used to model and generate the real data distribution. GAN models have attracted a large number of researchers and have been extended in many directions. Most existing GAN methods, however, target a single data domain. GAN is therefore expected to solve the problem of stiff generated sentences in CNN-RNN methods.
Summary of the invention
The object of the present invention is to overcome the above problems of the prior art by providing an automatic image text annotation method based on a generative adversarial network. Building on deep neural networks, optical imaging, natural language processing, and related fields, the invention reviews traditional solutions for automatic image sentence annotation and, on the basis of generative adversarial networks, studies and designs an automatic image sentence annotation method and its applications.
To achieve the above technical purpose and technical effect, the present invention is realized through the following technical solution:
An automatic image text annotation method based on a generative adversarial network, comprising the following steps:
S101: the CNN multi-label classification module and the LSTM sentence generation module are designated as the generator, and the LSTM sentence feature extraction module and the classifier are designated as the discriminator;
S102: the CNN multi-label classification module extracts the information of the picture, and the LSTM sentence generation module then generates a sentence; the generated sentence is the fake sentence produced by the generator;
S103: the generated sentence and the real sentence are input for training; the LSTM sentence feature extraction module is trained on the generated sentences and the real sentences until the discriminator cannot distinguish real sentences from generated ones.
Further, S103 also includes a method by which the discriminator determines whether the sentence generated by the generator describes the picture, comprising the following steps:
S201: the sentence generated by the generator is denoted S_fake, the real sentence is denoted S_real, and a training picture is denoted I_match; a non-matching picture is introduced and denoted I_mismatch;
S202: features of the generated sentence S_fake and the real sentence S_real are extracted by the LSTM sentence feature extraction module; the extracted features are combined with the Match image features and the Mismatch image features to obtain a sentence feature set;
S203: the classifier performs real/fake discrimination on the sentence features in the sentence feature set, determining whether the generated sentence belongs to the training image.
Further, in S203, the classifier considers the following combinations when determining whether the generated sentence belongs to the training image:
S_fake with I_mismatch: cannot pass the discriminator;
S_fake with I_match: half passes the discriminator, obtaining score s_f;
S_real with I_mismatch: half passes the discriminator, obtaining score s_w;
S_real with I_match: passes the discriminator, obtaining score s_r.
Further, the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture. The loss function of the discriminator is expressed as:
Further, the generator uses the multi-label automatic image sentence annotation model to generate sentences that approximate real sentences. The loss function of the generator is expressed as:
The beneficial effects of the present invention are:
1. The method of the present invention overcomes the insufficient expressive power of the results produced by traditional automatic image sentence annotation methods and constructs an automatic image text annotation model based on a generative adversarial network. The model can be applied in many areas of deep learning: helping disabled people understand their surroundings, describing web pictures effectively for easier search, helping to generate news picture captions quickly, and so on.
2. By incorporating the GAN structure, the present invention addresses the stiff, inflexible sentences generated in CNN-RNN automatic image sentence annotation and makes the generated sentences more accurate, natural, and diverse. The generated sentences can cope with the more complex scenes found in practice, conform more closely to the way humans describe images in language, and are more widely applicable in practice.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and practiced according to the specification, preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. Specific embodiments of the invention are given in detail by the following examples and their drawings.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structure diagram of a traditional generative adversarial network;
Fig. 2 is a structure diagram of the LSTM unit;
Fig. 3 is a structure diagram of the automatic image sentence annotation of the present invention based on a generative adversarial network;
Fig. 4 is a structure diagram of the improved discriminator.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
This embodiment introduces a generative adversarial network on top of the traditional CNN-RNN method and proposes a GAN-based algorithm for automatic image sentence annotation, overcoming the problems of traditional automatic image sentence annotation.
Referring to Fig. 1, a traditional generative adversarial network consists of a generator G and a discriminator D. The generator G receives noise data z as input and generates simulated data G(z). The discriminator D takes real data x or generated data G(z) as input and determines whether its input comes from the real data distribution p_data(x). The adversarial model trains the discriminator D to maximize its accuracy in telling real data from generated data, while training the generator G to minimize the discriminator's accuracy. This objective is reached by solving the following saddle-point problem.
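The saddle-point formula is not reproduced in this text. As a hedged reconstruction, assuming the passage refers to the standard GAN objective, it can be written as below; the noise prior p_z(z) is a symbol introduced here for illustration and is not defined in the surrounding text.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
  \tag{1}
```

The first term rewards D for scoring real data x highly; the second rewards D for scoring generated data G(z) low, while G is trained to push that same term down.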
The model can be regarded as a zero-sum game. In actual training, the discriminator is usually expected to perform as well as possible, so that it can supervise the generator. If the discriminator is poor and judges generated fake data to be real, the overall result will be poor. During training, the discriminator is therefore typically trained several times first, and then the generator is trained.
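The schedule described above (several discriminator updates, then one generator update) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `alternate_training`, the parameter `d_steps`, and the two callbacks are hypothetical, and the actual update rules are left to the caller.

```python
from typing import Callable, List


def alternate_training(
    num_rounds: int,
    d_steps: int,
    train_discriminator: Callable[[], None],
    train_generator: Callable[[], None],
) -> List[str]:
    """Run `d_steps` discriminator updates before each generator update.

    Returns the order of updates ("D" or "G") so the schedule can be inspected.
    """
    schedule: List[str] = []
    for _ in range(num_rounds):
        # Train the discriminator several times first ...
        for _ in range(d_steps):
            train_discriminator()
            schedule.append("D")
        # ... then train the generator once.
        train_generator()
        schedule.append("G")
    return schedule
```

The value of `d_steps` is a tuning choice; the text only says the discriminator is trained "several times" before the generator.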
The LSTM is a recurrent neural network with a special structure, shown in Fig. 2. It contains three gates: the forget gate, the input gate, and the output gate. The computation of the whole LSTM unit is given by formulas (2)~(8).
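$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{2}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{3}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{4}$$
$$g_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{5}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot g_t \tag{6}$$
$$h_t = o_t \odot \tanh(c_t) \tag{7}$$
$$p_{t+1} = \mathrm{Softmax}(h_t) \tag{8}$$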
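One time step of formulas (2)~(8) can be sketched in plain Python as below. This is an illustrative sketch only: the parameter dictionary keys (`W_xi`, `W_hi`, `b_i`, and so on) are naming assumptions chosen to mirror the symbols in the formulas, and the softmax is applied directly to h_t as formula (8) is written, with no extra projection layer.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, v: Vector) -> Vector:
    """Matrix-vector product W v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def softmax(v: Vector) -> Vector:
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def affine(W_x: Matrix, W_h: Matrix, b: Vector, x_t: Vector, h_prev: Vector) -> Vector:
    """Compute W_x x_t + W_h h_{t-1} + b, the pre-activation shared by (2)-(5)."""
    return [a + c + d for a, c, d in zip(matvec(W_x, x_t), matvec(W_h, h_prev), b)]


def lstm_step(
    x_t: Vector, h_prev: Vector, c_prev: Vector, p: Dict[str, list]
) -> Tuple[Vector, Vector, Vector]:
    """One LSTM step; returns (h_t, c_t, p_{t+1})."""
    i_t = [sigmoid(a) for a in affine(p["W_xi"], p["W_hi"], p["b_i"], x_t, h_prev)]    # (2) input gate
    f_t = [sigmoid(a) for a in affine(p["W_xf"], p["W_hf"], p["b_f"], x_t, h_prev)]    # (3) forget gate
    o_t = [sigmoid(a) for a in affine(p["W_xo"], p["W_ho"], p["b_o"], x_t, h_prev)]    # (4) output gate
    g_t = [math.tanh(a) for a in affine(p["W_xc"], p["W_hc"], p["b_c"], x_t, h_prev)]  # (5) candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]                 # (6) cell state
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                                 # (7) hidden state
    p_next = softmax(h_t)                                                              # (8) distribution
    return h_t, c_t, p_next
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate g_t to 0, so c_t is simply half of c_{t-1}; this makes a convenient sanity check.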
In this embodiment, as shown in Fig. 3, the automatic image text annotation method based on a generative adversarial network comprises the following steps:
S101: the CNN multi-label classification module and the LSTM sentence generation module are designated as the generator, and the LSTM sentence feature extraction module and the classifier are designated as the discriminator;
S102: the CNN multi-label classification module extracts the information of the picture, and the LSTM sentence generation module then generates a sentence; the generated sentence is the fake sentence produced by the generator;
S103: the generated sentence and the real sentence are input for training; the LSTM sentence feature extraction module is trained on the generated sentences and the real sentences until the discriminator cannot distinguish real sentences from generated ones.
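The module composition in S101 to S103 can be sketched as two small wrappers. Everything below is a hedged illustration: the class names, the callable signatures, the tolerance-based stopping test `cannot_distinguish`, and the toy stand-ins are assumptions made for this sketch, not components specified by the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[float]
Features = List[float]
Sentence = List[str]


@dataclass
class Generator:
    """S101/S102: CNN multi-label classification followed by LSTM sentence generation."""
    cnn_multilabel: Callable[[Image], Features]
    lstm_generate: Callable[[Features], Sentence]

    def generate(self, image: Image) -> Sentence:
        # The output is the "fake" sentence fed to the discriminator.
        return self.lstm_generate(self.cnn_multilabel(image))


@dataclass
class Discriminator:
    """S101/S103: LSTM sentence feature extraction followed by a classifier."""
    lstm_features: Callable[[Sentence], Features]
    classifier: Callable[[Features], float]

    def score(self, sentence: Sentence) -> float:
        return self.classifier(self.lstm_features(sentence))


def cannot_distinguish(d: Discriminator, real: Sentence, fake: Sentence, tol: float = 0.05) -> bool:
    """Assumed stopping test for S103: real and fake scores differ by at most `tol`."""
    return abs(d.score(real) - d.score(fake)) <= tol


# Toy stand-ins (hypothetical) so the sketch runs without any trained model.
VOCAB = ["dog", "grass", "ball"]
toy_generator = Generator(
    cnn_multilabel=lambda image: [1.0 if v > 0.5 else 0.0 for v in image],
    lstm_generate=lambda labels: [w for w, on in zip(VOCAB, labels) if on],
)
toy_discriminator = Discriminator(
    lstm_features=lambda sentence: [float(len(sentence))],
    classifier=lambda feats: min(feats[0] / 3.0, 1.0),
)
```

In a real system the four callables would be trained networks; keeping them as injected callables makes the generator/discriminator split of S101 explicit.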
Specifically, as shown in Fig. 4, S103 also includes a method by which the discriminator determines whether the sentence generated by the generator describes the picture, comprising the following steps:
S201: the sentence generated by the generator is denoted S_fake, the real sentence is denoted S_real, and a training picture is denoted I_match; a non-matching picture is introduced and denoted I_mismatch;
S202: features of the generated sentence S_fake and the real sentence S_real are extracted by the LSTM sentence feature extraction module; the extracted features are combined with the Match image features and the Mismatch image features to obtain a sentence feature set;
S203: the classifier performs real/fake discrimination on the sentence features in the sentence feature set, determining whether the generated sentence belongs to the training image.
Further, in S203, the classifier considers the following combinations when determining whether the generated sentence belongs to the training image:
S_fake with I_mismatch: cannot pass the discriminator;
S_fake with I_match: half passes the discriminator, obtaining score s_f;
S_real with I_mismatch: half passes the discriminator, obtaining score s_w;
S_real with I_match: passes the discriminator, obtaining score s_r.
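The four sentence/image combinations above can be enumerated as in the following sketch. It is illustrative only: the text says the features are "combined" without specifying how, so plain concatenation is an assumption here, and the names `PairCase` and `build_pair_cases` are hypothetical. No numeric targets are assigned because the loss functions are not given in this text.

```python
from typing import List, NamedTuple, Optional

Features = List[float]


class PairCase(NamedTuple):
    sentence: str              # "S_fake" or "S_real"
    image: str                 # "I_match" or "I_mismatch"
    features: Features         # combined sentence + image features (S202)
    outcome: str               # qualitative result stated in the text
    score_name: Optional[str]  # score symbol from the text, if any


def build_pair_cases(
    f_fake: Features, f_real: Features, f_match: Features, f_mismatch: Features
) -> List[PairCase]:
    """Build the four combinations the classifier sees in S203."""

    def combine(sentence_feats: Features, image_feats: Features) -> Features:
        # Assumption: feature combination is simple concatenation.
        return sentence_feats + image_feats

    return [
        PairCase("S_fake", "I_mismatch", combine(f_fake, f_mismatch), "cannot pass", None),
        PairCase("S_fake", "I_match", combine(f_fake, f_match), "half pass", "s_f"),
        PairCase("S_real", "I_mismatch", combine(f_real, f_mismatch), "half pass", "s_w"),
        PairCase("S_real", "I_match", combine(f_real, f_match), "pass", "s_r"),
    ]
```

Introducing I_mismatch is what lets the discriminator penalize a fluent sentence that does not describe the picture, which a sentence-only real/fake test cannot do.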
Further, the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture. The loss function of the discriminator is expressed as:
Further, the generator uses the multi-label automatic image sentence annotation model to generate sentences that approximate real sentences. The loss function of the generator is expressed as:
In this embodiment, GAN training yields a better generator and discriminator, thereby improving the quality of automatic image sentence annotation.
The principle of the GAN-based automatic image text annotation method of this embodiment is as follows. A traditional generative adversarial network is characterized by its ability to generate high-quality data. The original multi-label automatic image sentence annotation model is used as the generator, which produces fake sentences, and a discriminator is newly constructed. The generated sentences and real sentences are input for training until the discriminator cannot distinguish real sentences from generated ones. On its own, such a discriminator only judges whether a sentence produced by the generator belongs to the original data distribution; it cannot judge whether the sentence describes the given picture. Therefore, the sentence generated by the generator is denoted S_fake, the real sentence is denoted S_real, a training picture is denoted I_match, and a non-matching picture is introduced and denoted I_mismatch. Features of the generated sentence S_fake and the real sentence S_real are extracted by the LSTM sentence feature extraction module, and the extracted features are combined with the Match image features and the Mismatch image features to obtain a sentence feature set. The classifier performs real/fake discrimination on the sentence features in the sentence feature set, determining whether the generated sentence belongs to the training image.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. An automatic image text annotation method based on a generative adversarial network, characterized by comprising the following steps:
S101: the CNN multi-label classification module and the LSTM sentence generation module are designated as the generator, and the LSTM sentence feature extraction module and the classifier are designated as the discriminator;
S102: the CNN multi-label classification module extracts the information of the picture, and the LSTM sentence generation module then generates a sentence; the generated sentence is the fake sentence produced by the generator;
S103: the generated sentence and the real sentence are input for training; the LSTM sentence feature extraction module is trained on the generated sentences and the real sentences until the discriminator cannot distinguish real sentences from generated ones.
2. The automatic image text annotation method based on a generative adversarial network according to claim 1, characterized in that S103 further includes a method by which the discriminator determines whether the sentence generated by the generator describes the picture, comprising the following steps:
S201: the sentence generated by the generator is denoted S_fake, the real sentence is denoted S_real, and a training picture is denoted I_match; a non-matching picture is introduced and denoted I_mismatch;
S202: features of the generated sentence S_fake and the real sentence S_real are extracted by the LSTM sentence feature extraction module; the extracted features are combined with the Match image features and the Mismatch image features to obtain a sentence feature set;
S203: the classifier performs real/fake discrimination on the sentence features in the sentence feature set, determining whether the generated sentence belongs to the training image.
3. The automatic image text annotation method based on a generative adversarial network according to claim 2, characterized in that, in S203, the classifier considers the following combinations when determining whether the generated sentence belongs to the training image:
S_fake with I_mismatch: cannot pass the discriminator;
S_fake with I_match: half passes the discriminator, obtaining score s_f;
S_real with I_mismatch: half passes the discriminator, obtaining score s_w;
S_real with I_match: passes the discriminator, obtaining score s_r.
4. The automatic image text annotation method based on a generative adversarial network according to any one of claims 1-3, characterized in that the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture, and the loss function of the discriminator is expressed as:
5. The automatic image text annotation method based on a generative adversarial network according to any one of claims 1-3, characterized in that the generator uses the multi-label automatic image sentence annotation model to generate sentences that approximate real sentences, and the loss function of the generator is expressed as:
CN201710396148.5A 2017-05-27 2017-05-27 Automatic image text annotation method based on generative adversarial network Pending CN107330444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710396148.5A CN107330444A (en) 2017-05-27 2017-05-27 Automatic image text annotation method based on generative adversarial network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710396148.5A CN107330444A (en) 2017-05-27 2017-05-27 Automatic image text annotation method based on generative adversarial network

Publications (1)

Publication Number Publication Date
CN107330444A true CN107330444A (en) 2017-11-07

Family

ID=60193180

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710396148.5A Pending CN107330444A (en) 2017-05-27 2017-05-27 A kind of image autotext mask method based on generation confrontation network

Country Status (1)

Country Link
CN (1) CN107330444A (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944358A (en) * 2017-11-14 2018-04-20 华南理工大学 A kind of human face generating method based on depth convolution confrontation network model
CN107968962A (en) * 2017-12-12 2018-04-27 华中科技大学 A kind of video generation method of the non-conterminous image of two frames based on deep learning
KR101894278B1 (en) * 2018-01-18 2018-09-04 주식회사 뷰노 Method for reconstructing a series of slice images and apparatus using the same
CN108520282A (en) * 2018-04-13 2018-09-11 湘潭大学 A kind of sorting technique based on Triple-GAN
CN108664924A (en) * 2018-05-10 2018-10-16 东南大学 A kind of multi-tag object identification method based on convolutional neural networks
CN108710892A (en) * 2018-04-04 2018-10-26 浙江工业大学 Synergetic immunity defence method towards a variety of confrontation picture attacks
CN109242090A (en) * 2018-08-28 2019-01-18 电子科技大学 A kind of video presentation and description consistency discrimination method based on GAN network
CN109255047A (en) * 2018-07-18 2019-01-22 西安电子科技大学 Based on the complementary semantic mutual search method of image-text being aligned and symmetrically retrieve
CN109614480A (en) * 2018-11-26 2019-04-12 武汉大学 A kind of generation method and device of the autoabstract based on production confrontation network
CN109635273A (en) * 2018-10-25 2019-04-16 平安科技(深圳)有限公司 Text key word extracting method, device, equipment and storage medium
CN109685116A (en) * 2018-11-30 2019-04-26 腾讯科技(深圳)有限公司 Description information of image generation method and device and electronic device
CN109697694A (en) * 2018-12-07 2019-04-30 山东科技大学 The generation method of high-resolution picture based on bull attention mechanism
CN109887494A (en) * 2017-12-01 2019-06-14 腾讯科技(深圳)有限公司 The method and apparatus of reconstructed speech signal
CN109918509A (en) * 2019-03-12 2019-06-21 黑龙江世纪精彩科技有限公司 Scene generating method and scene based on information extraction generate the storage medium of system
CN109933677A (en) * 2019-02-14 2019-06-25 厦门一品威客网络科技股份有限公司 Image generating method and image generation system
CN109978550A (en) * 2019-03-12 2019-07-05 同济大学 A kind of credible electronic transaction clearance mechanism based on generation confrontation network
CN110085215A (en) * 2018-01-23 2019-08-02 中国科学院声学研究所 A kind of language model data Enhancement Method based on generation confrontation network
WO2019179100A1 (en) * 2018-03-20 2019-09-26 苏州大学张家港工业技术研究院 Medical text generation method based on generative adversarial network technology
CN110533074A (en) * 2019-07-30 2019-12-03 华南理工大学 A kind of picture classification automatic marking method and system based on dual-depth neural network
CN110533588A (en) * 2019-07-16 2019-12-03 中国农业大学 Based on the root system image repair method for generating confrontation network
WO2019237860A1 (en) * 2018-06-15 2019-12-19 腾讯科技(深圳)有限公司 Image annotation method and device
CN110889469A (en) * 2019-09-19 2020-03-17 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN111143617A (en) * 2019-12-12 2020-05-12 浙江大学 Automatic generation method and system for picture or video text description
CN111488473A (en) * 2019-01-28 2020-08-04 北京京东尚科信息技术有限公司 Picture description generation method and device and computer readable storage medium
RU2735148C1 (en) * 2019-12-09 2020-10-28 Самсунг Электроникс Ко., Лтд. Training gan (generative adversarial networks) to create pixel-by-pixel annotation
CN112292695A (en) * 2018-06-20 2021-01-29 西门子工业软件公司 Method for generating a test data set, method for testing, method for operating a system, device, control system, computer program product, computer-readable medium, generation and application
CN112347742A (en) * 2020-10-29 2021-02-09 青岛科技大学 Method for generating document image set based on deep learning
CN112818159A (en) * 2021-02-24 2021-05-18 上海交通大学 Image description text generation method based on generation countermeasure network
CN113077013A (en) * 2021-04-28 2021-07-06 上海联麓半导体技术有限公司 High-dimensional data fault anomaly detection method and system based on generation countermeasure network
CN114241263A (en) * 2021-12-17 2022-03-25 电子科技大学 Radar interference semi-supervised open set identification system based on generation countermeasure network
US11514694B2 (en) 2019-09-20 2022-11-29 Samsung Electronics Co., Ltd. Teaching GAN (generative adversarial networks) to generate per-pixel annotation
CN116795972A (en) * 2023-08-11 2023-09-22 之江实验室 Model training method and device, storage medium and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170150235A1 (en) * 2015-11-20 2017-05-25 Microsoft Technology Licensing, Llc Jointly Modeling Embedding and Translation to Bridge Video and Language

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BO DAI等: "Towards Diverse and Natural Image Descriptions via a Conditional GAN", 《ARXIV》 *
ORIOL VINYALS: "Show and Tell: A Neural Image Caption Generator", 《CVPR 2015》 *

Cited By (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107944358A (en) * 2017-11-14 2018-04-20 华南理工大学 A kind of human face generating method based on depth convolution confrontation network model
US11482237B2 (en) 2017-12-01 2022-10-25 Tencent Technology (Shenzhen) Company Limited Method and terminal for reconstructing speech signal, and computer storage medium
CN109887494A (en) * 2017-12-01 2019-06-14 腾讯科技(深圳)有限公司 The method and apparatus of reconstructed speech signal
CN107968962A (en) * 2017-12-12 2018-04-27 华中科技大学 A kind of video generation method of the non-conterminous image of two frames based on deep learning
KR101894278B1 (en) * 2018-01-18 2018-09-04 주식회사 뷰노 Method for reconstructing a series of slice images and apparatus using the same
US11816833B2 (en) 2018-01-18 2023-11-14 Vuno Inc. Method for reconstructing series of slice images and apparatus using same
CN110085215A (en) * 2018-01-23 2019-08-02 中国科学院声学研究所 A kind of language model data Enhancement Method based on generation confrontation network
CN110085215B (en) * 2018-01-23 2021-06-08 中国科学院声学研究所 Language model data enhancement method based on generation countermeasure network
WO2019179100A1 (en) * 2018-03-20 2019-09-26 苏州大学张家港工业技术研究院 Medical text generation method based on generative adversarial network technology
CN108710892B (en) * 2018-04-04 2020-09-01 浙江工业大学 Cooperative immune defense method for multiple adversarial image attacks
CN108710892A (en) * 2018-04-04 2018-10-26 浙江工业大学 Cooperative immune defense method for multiple adversarial image attacks
CN108520282B (en) * 2018-04-13 2020-04-03 湘潭大学 Triple-GAN-based classification method
CN108520282A (en) * 2018-04-13 2018-09-11 湘潭大学 Triple-GAN-based classification method
CN108664924B (en) * 2018-05-10 2022-07-08 东南大学 Multi-label object identification method based on convolutional neural network
CN108664924A (en) * 2018-05-10 2018-10-16 东南大学 Multi-label object identification method based on convolutional neural network
US11494595B2 (en) 2018-06-15 2022-11-08 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and storage medium for annotating image
WO2019237860A1 (en) * 2018-06-15 2019-12-19 腾讯科技(深圳)有限公司 Image annotation method and device
CN112292695A (en) * 2018-06-20 2021-01-29 西门子工业软件公司 Method for generating a test data set, method for testing, method for operating a system, device, control system, computer program product, computer-readable medium, generation and application
CN109255047A (en) * 2018-07-18 2019-01-22 西安电子科技大学 Image-text mutual retrieval method based on complementary semantic alignment and symmetric retrieval
CN109242090A (en) * 2018-08-28 2019-01-18 电子科技大学 GAN-based video description and description consistency discrimination method
CN109635273A (en) * 2018-10-25 2019-04-16 平安科技(深圳)有限公司 Text keyword extraction method, device, equipment and storage medium
CN109614480A (en) * 2018-11-26 2019-04-12 武汉大学 Method and device for generating automatic summaries based on a generative adversarial network
CN109685116B (en) * 2018-11-30 2022-12-30 腾讯科技(深圳)有限公司 Image description information generation method and device and electronic device
US11783199B2 (en) * 2018-11-30 2023-10-10 Tencent Technology (Shenzhen) Company Limited Image description information generation method and apparatus, and electronic device
CN109685116A (en) * 2018-11-30 2019-04-26 腾讯科技(深圳)有限公司 Image description information generation method and device and electronic device
WO2020108165A1 (en) * 2018-11-30 2020-06-04 腾讯科技(深圳)有限公司 Image description information generation method and device, and electronic device
US20210042579A1 (en) * 2018-11-30 2021-02-11 Tencent Technology (Shenzhen) Company Limited Image description information generation method and apparatus, and electronic device
CN109697694B (en) * 2018-12-07 2023-04-07 山东科技大学 Method for generating high-resolution picture based on multi-head attention mechanism
CN109697694A (en) * 2018-12-07 2019-04-30 山东科技大学 Method for generating high-resolution picture based on multi-head attention mechanism
CN111488473A (en) * 2019-01-28 2020-08-04 北京京东尚科信息技术有限公司 Picture description generation method and device and computer readable storage medium
CN111488473B (en) * 2019-01-28 2023-11-07 北京京东尚科信息技术有限公司 Picture description generation method, device and computer readable storage medium
CN109933677A (en) * 2019-02-14 2019-06-25 厦门一品威客网络科技股份有限公司 Image generating method and image generation system
CN109918509A (en) * 2019-03-12 2019-06-21 黑龙江世纪精彩科技有限公司 Scene generation method based on information extraction and storage medium of scene generation system
CN109978550A (en) * 2019-03-12 2019-07-05 同济大学 Trusted electronic transaction clearing mechanism based on a generative adversarial network
CN110533588A (en) * 2019-07-16 2019-12-03 中国农业大学 Root system image restoration method based on generative adversarial network
CN110533588B (en) * 2019-07-16 2021-09-21 中国农业大学 Root system image restoration method based on generative adversarial network
CN110533074B (en) * 2019-07-30 2022-03-29 华南理工大学 Automatic image category labeling method and system based on dual deep neural networks
CN110533074A (en) * 2019-07-30 2019-12-03 华南理工大学 Automatic image category labeling method and system based on dual deep neural networks
CN110889469B (en) * 2019-09-19 2023-07-21 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
CN110889469A (en) * 2019-09-19 2020-03-17 北京市商汤科技开发有限公司 Image processing method and device, electronic equipment and storage medium
US11514694B2 (en) 2019-09-20 2022-11-29 Samsung Electronics Co., Ltd. Teaching GAN (generative adversarial networks) to generate per-pixel annotation
RU2735148C1 (en) * 2019-12-09 2020-10-28 Самсунг Электроникс Ко., Лтд. Training gan (generative adversarial networks) to create pixel-by-pixel annotation
CN111143617A (en) * 2019-12-12 2020-05-12 浙江大学 Automatic generation method and system for picture or video text description
CN112347742B (en) * 2020-10-29 2022-05-31 青岛科技大学 Method for generating document image set based on deep learning
CN112347742A (en) * 2020-10-29 2021-02-09 青岛科技大学 Method for generating document image set based on deep learning
CN112818159A (en) * 2021-02-24 2021-05-18 上海交通大学 Image description text generation method based on generative adversarial network
CN113077013A (en) * 2021-04-28 2021-07-06 上海联麓半导体技术有限公司 High-dimensional data fault anomaly detection method and system based on generative adversarial network
CN114241263A (en) * 2021-12-17 2022-03-25 电子科技大学 Radar interference semi-supervised open set recognition system based on generative adversarial network
CN114241263B (en) * 2021-12-17 2023-05-02 电子科技大学 Radar interference semi-supervised open set recognition system based on generative adversarial network
CN116795972A (en) * 2023-08-11 2023-09-22 之江实验室 Model training method and device, storage medium and electronic equipment
CN116795972B (en) * 2023-08-11 2024-01-09 之江实验室 Model training method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN107330444A (en) Automatic image text annotation method based on a generative adversarial network
CN106778506A (en) Expression recognition method fusing depth images and multi-channel features
CN110443231A (en) Artificial-intelligence-based single-hand finger point-reading character recognition method and system
CN110175251A (en) Zero-shot sketch retrieval method based on a semantic adversarial network
CN107506722A (en) Facial emotion recognition method based on a deep sparse convolutional neural network
CN106202044A (en) Entity relation extraction method based on deep neural network
CN108416065A (en) System and method for generating image-sentence descriptions based on a hierarchical neural network
CN108984530A (en) Detection method and detection system for sensitive network content
CN110516539A (en) Remote sensing image building extraction method, system, storage medium and equipment based on adversarial network
CN110458003B (en) Facial expression action unit adversarial synthesis method based on local attention model
CN107392147A (en) Image-to-sentence conversion method based on an improved generative adversarial network
CN108182409A (en) Liveness detection method, device, equipment and storage medium
CN110009057A (en) Graphical verification code recognition method based on deep learning
CN106875007A (en) End-to-end deep neural network based on convolutional long short-term memory for voice fraud detection
CN107145514B (en) Chinese sentence pattern classification method based on decision tree and SVM mixed model
CN110532912A (en) Sign language translation implementation method and device
CN108765383A (en) Video description method based on deep transfer learning
CN112541529A (en) Expression and posture fusion bimodal teaching evaluation method, device and storage medium
CN109934204A (en) Facial expression recognition method based on convolutional neural networks
CN107066979A (en) Human action recognition method based on depth information and multi-dimensional convolutional neural networks
CN113642621A (en) Zero-shot image classification method based on generative adversarial network
CN109711356A (en) Expression recognition method and system
CN112069993B (en) Dense face detection method and system based on facial-feature mask constraint and storage medium
CN109670559A (en) Handwritten Chinese character recognition method, device, equipment and storage medium
CN109871898A (en) Method for generating deposit training samples using a generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171107