CN107330444A - An automatic image text annotation method based on a generative adversarial network - Google Patents
An automatic image text annotation method based on a generative adversarial network
- Publication number
- CN107330444A (application CN201710396148.5A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- generation
- image
- discriminator
- generator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an automatic image text annotation method based on a generative adversarial network, comprising the following steps: a generator produces a fake sentence while a discriminator is built alongside it; the generated sentence and a real sentence are fed in for training until the discriminator can no longer distinguish real sentences from generated ones. The invention remedies the stiff, rigid sentences produced by CNN-RNN automatic image captioning, yielding sentences that are more accurate, natural, and diverse, that can handle the more complex scenes found in reality, and that better match how humans describe images in language, giving the method broader practical applicability.
Description
Technical field
The present invention relates to the field of image sentence annotation, and in particular to an automatic image text annotation method based on a generative adversarial network.
Background art

In recent years, the problem of automatic sentence annotation of images (image captioning) has been widely studied. Because it involves not only recognizing the objects in the image itself but also natural language processing, the main related techniques can be summarized as the following three:

Semantic template filling: this method detects the individual targets in an image and places the class text representing each target into a fixed natural-language generation template, automatically generating a sentence. One variant uses the target-recognition results to form a simple sentence containing three fixed semantic units; some methods also place the relations between the recognized targets into the same template, composing sentences with richer semantics.
Feature-space matching: this method constructs a large number of sentences in advance, projects both the image and the constructed sentences into a high-dimensional feature space, and retrieves the sentence whose features match most closely. Some methods construct multiple kernels and compare the data in each data space by ranking to discover the relations between them; others propose analyzing the noisy titles, tags, or statements that may accompany a picture to provide additional useful information for this feature-space mapping.
CNN-RNN methods: this approach extracts image features with a CNN (convolutional neural network) and feeds them into an RNN (recurrent neural network), training a sentence-generation module with NLP (natural language processing) techniques; the whole pipeline can be trained end to end. Some methods feed the extracted image features directly into the recurrent module, passing them to an LSTM recurrent network to obtain the annotation, with notably good results.
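As a rough, purely illustrative sketch of the CNN-RNN pipeline described above (not the patented method itself): the "CNN" is stubbed as a fixed feature vector, the recurrence is a single tanh layer rather than an LSTM, and the vocabulary, weights, and function names are all hypothetical stand-ins for a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["<start>", "a", "dog", "runs", "<end>"]
H, V = 8, len(VOCAB)

W_xh = rng.normal(scale=0.1, size=(H, V))   # input-to-hidden weights
W_ih = rng.normal(scale=0.1, size=(H, H))   # image-feature projection
W_hh = rng.normal(scale=0.1, size=(H, H))   # hidden-to-hidden weights
W_hy = rng.normal(scale=0.1, size=(V, H))   # hidden-to-vocab weights

def caption(image_feature, max_len=5):
    """Greedy decoding: initialise the state from the CNN feature, then unroll."""
    h = np.tanh(W_ih @ image_feature)        # state seeded by the image feature
    word = VOCAB.index("<start>")
    out = []
    for _ in range(max_len):
        x = np.eye(V)[word]                  # one-hot of the previous word
        h = np.tanh(W_xh @ x + W_hh @ h)     # recurrent update
        word = int(np.argmax(W_hy @ h))      # greedy argmax over the vocabulary
        if VOCAB[word] == "<end>":
            break
        out.append(VOCAB[word])
    return out

sentence = caption(rng.normal(size=H))
```

With random weights the output is of course meaningless; the point is only the shape of the pipeline: image feature in once, then one word per recurrent step.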
Although these conventional methods solve the annotation problem to some extent, each still has shortcomings:

Semantic template filling: this template-based automatic image text annotation algorithm can construct sentences that fit the template, but in practical applications its expressive power is very weak and the scenes it can handle are limited.

Feature-space matching: this method requires the support of a large corpus of sentences, and in essence it does not generate sentences but matches existing ones, so in practice it cannot cope with the complex scenes found in reality.

CNN-RNN methods: although this approach overcomes the defects of the previous two to some extent, because it is trained by maximum-likelihood estimation the generated annotations stay very close to the sample sentences yet still fall short of authentic language. Compared with human language, the generated sentences lack vivid, natural expression and read as stiff and rigid.
In recent years, generative adversarial networks (GAN, Generative Adversarial Networks) have received great attention from academia and industry, becoming one of the most popular research areas of the past two years. Unlike traditional machine-learning methods, the defining feature of a GAN is its adversarial mechanism, which can be used to model and generate realistic data distributions. GAN models have attracted a large number of researchers and have been extended in many directions, though most existing GAN methods target a single data domain. A GAN is therefore a promising way to solve the stiffness of the sentences generated by CNN-RNN methods.
The content of the invention
It is an object of the invention to overcome the problem above that prior art is present there is provided a kind of based on generation confrontation network
Image autotext mask method, the present invention is summarized based on deep neural network, optical imagery, natural language processing etc.
The automatic sentence mark solution of traditional image, is probed into based on the automatic sentence mark side of generation confrontation network research designed image
Method and its application.
To achieve the above technical purpose and technical effect, the present invention is realized through the following technical solution:

An automatic image text annotation method based on a generative adversarial network, comprising the following steps:

S101: designate the CNN multi-label classification module and the LSTM sentence-generation module as the generator, and the LSTM sentence feature-extraction module together with a classifier as the discriminator;

S102: the CNN multi-label classification module extracts information from the picture, and the LSTM sentence-generation module then generates a sentence; this generated sentence is the fake sentence produced by the generator;

S103: the generated sentence and a real sentence are fed in for training; the LSTM sentence feature-extraction module is trained on the generated and real sentences until the discriminator can no longer distinguish real sentences from generated ones.
Further, S103 also includes a procedure in which the discriminator judges whether the sentence generated by the generator describes the picture, comprising the following steps:

S201: denote the sentence produced by the generator by S_fake, the real sentence by S_real, a training picture by I_match, and an introduced non-matching picture by I_mismatch;

S202: extract features from the generated sentence S_fake and the real sentence S_real with the LSTM sentence feature-extraction module, and combine the extracted features with the Match image features and the Mismatch image features to obtain a set of sentence-feature combinations;

S203: the classifier performs real/fake discrimination on each feature combination in the set, judging whether the generated sentence belongs to the training image.
Further, in S203, when judging whether the generated sentence belongs to the training image, the classifier considers the following combinations:

(S_fake, I_mismatch) must not pass the discriminator;

(S_fake, I_match) half passes the discriminator, yielding score s_f;

(S_real, I_mismatch) half passes the discriminator, yielding score s_w;

(S_real, I_match) passes the discriminator, yielding score s_r.
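A minimal sketch of how these four (sentence, image) combinations could be scored, assuming a logistic classifier over concatenated sentence and image features. Every name, stub, and target value here is an illustrative assumption in the style of matching-aware discriminators, not the patent's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16

def lstm_sentence_features(sentence_vec):
    # Stub for the LSTM sentence feature-extraction module.
    return np.tanh(sentence_vec)

def classifier(sent_feat, img_feat, w):
    # Logistic score on the concatenated (sentence, image) features.
    z = np.concatenate([sent_feat, img_feat]) @ w
    return 1.0 / (1.0 + np.exp(-z))

w = rng.normal(scale=0.1, size=2 * D)
s_fake, s_real = rng.normal(size=D), rng.normal(size=D)
i_match, i_mismatch = rng.normal(size=D), rng.normal(size=D)

# Target scores implied by the list above: (S_real, I_match) should pass (-> 1),
# (S_fake, I_mismatch) should fail (-> 0), and the two mixed pairs sit in between.
combos = {
    ("S_fake", "I_mismatch"): (s_fake, i_mismatch, 0.0),
    ("S_fake", "I_match"):    (s_fake, i_match,    0.5),   # score s_f
    ("S_real", "I_mismatch"): (s_real, i_mismatch, 0.5),   # score s_w
    ("S_real", "I_match"):    (s_real, i_match,    1.0),   # score s_r
}
scores = {k: classifier(lstm_sentence_features(s), i, w)
          for k, (s, i, _) in combos.items()}
```

The mixed pairs are what force the discriminator to check sentence-image matching rather than sentence realism alone.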
Further, the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture; the loss function of the discriminator is expressed as:
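The discriminator's loss formula appeared as an image in the original filing and is missing from this text. A plausible reconstruction (an assumption, not the filed formula), modeled on the matching-aware discriminator losses of conditional text/image GANs and consistent with the score combinations s_r, s_w, s_f above, is:

```latex
L_D = -\,\mathbb{E}\!\left[\log D(S_{real}, I_{match})\right]
      \;-\; \tfrac{1}{2}\,\mathbb{E}\!\left[\log\!\left(1 - D(S_{real}, I_{mismatch})\right)
      + \log\!\left(1 - D(S_{fake}, I_{match})\right)\right]
```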
Further, the generator uses the multi-label automatic image captioning model to generate sentences approaching real sentences; the loss function of the generator is expressed as:
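The generator's loss formula is likewise missing (it was an image in the original filing). Under the same assumption, a standard reconstruction that rewards the generator for producing sentences the discriminator accepts as matching the picture is:

```latex
L_G = -\,\mathbb{E}\!\left[\log D(S_{fake}, I_{match})\right]
```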
The beneficial effects of the invention are as follows:

1. The method of the present invention overcomes the insufficient expressive power of traditional automatic image captioning methods by constructing an automatic image text annotation model based on a generative adversarial network. The model can be applied in many areas of deep learning: helping people with disabilities understand their surroundings, describing web pictures effectively for convenient search, rapidly generating captions for news pictures, and so on.

2. By incorporating the GAN structure, the invention remedies the stiff, rigid sentences produced by CNN-RNN automatic image captioning, yielding sentences that are more accurate, natural, and diverse, that can handle the more complex scenes found in reality, and that better match how humans describe images in language, giving the method broader practical applicability.
The above is only an overview of the technical solution of the present invention. To make the technical means of the invention easier to understand, so that it can be practiced according to the content of the specification, preferred embodiments of the invention are described in detail below together with the accompanying drawings.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is the structure of a conventional generative adversarial network;

Fig. 2 is the structure of an LSTM cell;

Fig. 3 is the structure of the automatic image captioning model of the present invention based on a generative adversarial network;

Fig. 4 is the structure of the improved discriminator.
Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.
The present embodiment introduces a generative adversarial network on top of the traditional CNN-RNN approach, proposing an automatic image captioning algorithm based on a generative adversarial network that overcomes the problems of traditional automatic image captioning.
As shown in Fig. 1, a traditional generative adversarial network consists of a generator G and a discriminator D. The generator G receives a noise vector z as input and produces synthetic data G(z). The discriminator D takes either real data x or generated data G(z) as input and judges whether its input comes from the real data distribution p_data(x). Training the adversarial model maximizes the accuracy with which the discriminator D distinguishes real data from generated data, while simultaneously training the generator G to minimize that accuracy. This objective is reached by solving the following saddle-point problem.
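The saddle-point formula itself is not reproduced in this text (it was an image in the original filing); the standard GAN objective of Goodfellow et al., which the surrounding description matches term for term, reads:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{data}(x)}\!\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\!\left[\log\!\left(1 - D(G(z))\right)\right]
\tag{1}
```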
The model can be viewed as a zero-sum game. In actual training one usually wants the discriminator to be the stronger player, since it then supervises the generator effectively; if the discriminator is weak and judges generated fake data to be real, the overall result degrades. During training, the discriminator is therefore typically updated several times before each generator update.
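The alternating schedule just described ("update the discriminator several times, then the generator once") can be sketched as follows. The update bodies are stubs that only count calls; in a real system each would take a gradient step on the discriminator or generator loss, and all names here are illustrative assumptions rather than the patent's implementation.

```python
def train_gan(n_iterations=100, k=5):
    """Alternate k discriminator updates with one generator update per iteration."""
    d_steps = g_steps = 0
    for _ in range(n_iterations):
        for _ in range(k):          # k discriminator updates...
            d_steps += 1            # (stub for a gradient step on the D loss)
        g_steps += 1                # ...then one generator update (stub for the G loss)
    return d_steps, g_steps

d_steps, g_steps = train_gan()
```

The ratio k is a tuning knob: a larger k keeps the discriminator ahead of the generator, which is the regime the paragraph above argues for.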
LSTM is a recurrent neural network with a special structure, shown in Fig. 2. Its structure contains three kinds of gates: a forget gate, an input gate, and an output gate. The computation of a whole LSTM unit is expressed by formulas (2)-(8):

i_t = σ(W_xi x_t + W_hi h_{t-1} + b_i)   (2)

f_t = σ(W_xf x_t + W_hf h_{t-1} + b_f)   (3)

o_t = σ(W_xo x_t + W_ho h_{t-1} + b_o)   (4)

g_t = tanh(W_xc x_t + W_hc h_{t-1} + b_c)   (5)

c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t   (6)

h_t = o_t ⊙ tanh(c_t)   (7)

p_{t+1} = Softmax(h_t)   (8)
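Formulas (2)-(8) can be transcribed directly into a runnable sketch. The weight shapes, initialisation, and function names below are illustrative assumptions; only the gate equations themselves follow the text (note that in a full captioner the softmax of (8) is usually applied to a learned projection of h_t rather than to h_t itself).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + b["i"])   # (2) input gate
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + b["f"])   # (3) forget gate
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + b["o"])   # (4) output gate
    g_t = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])   # (5) candidate cell
    c_t = f_t * c_prev + i_t * g_t                             # (6) cell state
    h_t = o_t * np.tanh(c_t)                                   # (7) hidden state
    p_next = np.exp(h_t) / np.exp(h_t).sum()                   # (8) softmax
    return h_t, c_t, p_next

rng = np.random.default_rng(0)
X, H = 4, 3
W = {k: rng.normal(scale=0.1, size=(H, X if k.startswith("x") else H))
     for k in ["xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc"]}
b = {k: np.zeros(H) for k in ["i", "f", "o", "c"]}
h, c, p = lstm_step(rng.normal(size=X), np.zeros(H), np.zeros(H), W, b)
```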
In the present embodiment, as shown in Fig. 3, the automatic image text annotation method based on a generative adversarial network comprises the following steps:

S101: designate the CNN multi-label classification module and the LSTM sentence-generation module as the generator, and the LSTM sentence feature-extraction module together with a classifier as the discriminator;

S102: the CNN multi-label classification module extracts information from the picture, and the LSTM sentence-generation module then generates a sentence; this generated sentence is the fake sentence produced by the generator;

S103: the generated sentence and a real sentence are fed in for training; the LSTM sentence feature-extraction module is trained on the generated and real sentences until the discriminator can no longer distinguish real sentences from generated ones.
Specifically, as shown in Fig. 4, S103 also includes a procedure in which the discriminator judges whether the sentence generated by the generator describes the picture, comprising the following steps:

S201: denote the sentence produced by the generator by S_fake, the real sentence by S_real, a training picture by I_match, and an introduced non-matching picture by I_mismatch;

S202: extract features from the generated sentence S_fake and the real sentence S_real with the LSTM sentence feature-extraction module, and combine the extracted features with the Match image features and the Mismatch image features to obtain a set of sentence-feature combinations;

S203: the classifier performs real/fake discrimination on each feature combination in the set, judging whether the generated sentence belongs to the training image.
Further, in S203, when judging whether the generated sentence belongs to the training image, the classifier considers the following combinations:

(S_fake, I_mismatch) must not pass the discriminator;

(S_fake, I_match) half passes the discriminator, yielding score s_f;

(S_real, I_mismatch) half passes the discriminator, yielding score s_w;

(S_real, I_match) passes the discriminator, yielding score s_r.
Further, the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture; the loss function of the discriminator is expressed as:

Further, the generator uses the multi-label automatic image captioning model to generate sentences approaching real sentences; the loss function of the generator is expressed as:
In the present embodiment, training with the GAN procedure yields a strong generator and discriminator, thereby improving the quality of automatic image captioning.
The principle of the automatic image text annotation method based on a generative adversarial network in this embodiment is as follows. A traditional generative adversarial network is characterized by its ability to generate high-quality data. Taking the original multi-label automatic image captioning model as the generator, the generator produces fake sentences while a discriminator is built alongside it; the generated sentences and real sentences are fed in for training until the discriminator can no longer distinguish real sentences from generated ones. The discriminator judges whether a sentence produced by the generator belongs to the original data distribution, but cannot by itself judge whether the sentence describes the picture. Therefore the sentence produced by the generator is denoted S_fake, the real sentence S_real, a training picture I_match, and an introduced non-matching picture I_mismatch. Features are extracted from the generated sentence S_fake and the real sentence S_real by the LSTM sentence feature-extraction module and combined with the Match and Mismatch image features to obtain a set of sentence-feature combinations; the classifier then performs real/fake discrimination on these combinations, judging whether the generated sentence belongs to the training image.
The foregoing description of the disclosed embodiments enables those skilled in the art to realize or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (5)
1. An automatic image text annotation method based on a generative adversarial network, characterized in that it comprises the following steps:
S101: designate the CNN multi-label classification module and the LSTM sentence-generation module as the generator, and the LSTM sentence feature-extraction module together with a classifier as the discriminator;
S102: the CNN multi-label classification module extracts information from the picture, and the LSTM sentence-generation module then generates a sentence; this generated sentence is the fake sentence produced by the generator;
S103: the generated sentence and a real sentence are fed in for training; the LSTM sentence feature-extraction module is trained on the generated and real sentences until the discriminator can no longer distinguish real sentences from generated ones.
2. The automatic image text annotation method based on a generative adversarial network according to claim 1, characterized in that S103 also includes a procedure in which the discriminator judges whether the sentence generated by the generator describes the picture, comprising the following steps:
S201: denote the sentence produced by the generator by S_fake, the real sentence by S_real, a training picture by I_match, and an introduced non-matching picture by I_mismatch;
S202: extract features from the generated sentence S_fake and the real sentence S_real with the LSTM sentence feature-extraction module, and combine the extracted features with the Match image features and the Mismatch image features to obtain a set of sentence-feature combinations;
S203: the classifier performs real/fake discrimination on each feature combination in the set, judging whether the generated sentence belongs to the training image.
3. The automatic image text annotation method based on a generative adversarial network according to claim 2, characterized in that in S203, when judging whether the generated sentence belongs to the training image, the classifier considers the following combinations:
(S_fake, I_mismatch) must not pass the discriminator;
(S_fake, I_match) half passes the discriminator, yielding score s_f;
(S_real, I_mismatch) half passes the discriminator, yielding score s_w;
(S_real, I_match) passes the discriminator, yielding score s_r.
4. The automatic image text annotation method based on a generative adversarial network according to any one of claims 1-3, characterized in that the discriminator learns through training to recognize real sentences and to recognize whether a real sentence matches the picture, the loss function of the discriminator being expressed as:
5. The automatic image text annotation method based on a generative adversarial network according to any one of claims 1-3, characterized in that the generator uses the multi-label automatic image captioning model to generate sentences approaching real sentences, the loss function of the generator being expressed as:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710396148.5A CN107330444A (en) | 2017-05-27 | 2017-05-27 | A kind of image autotext mask method based on generation confrontation network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107330444A true CN107330444A (en) | 2017-11-07 |
Family
ID=60193180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710396148.5A Pending CN107330444A (en) | 2017-05-27 | 2017-05-27 | A kind of image autotext mask method based on generation confrontation network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107330444A (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107944358A (en) * | 2017-11-14 | 2018-04-20 | 华南理工大学 | A kind of human face generating method based on depth convolution confrontation network model |
CN107968962A (en) * | 2017-12-12 | 2018-04-27 | 华中科技大学 | A kind of video generation method of the non-conterminous image of two frames based on deep learning |
KR101894278B1 (en) * | 2018-01-18 | 2018-09-04 | 주식회사 뷰노 | Method for reconstructing a series of slice images and apparatus using the same |
CN108520282A (en) * | 2018-04-13 | 2018-09-11 | 湘潭大学 | A kind of sorting technique based on Triple-GAN |
CN108664924A (en) * | 2018-05-10 | 2018-10-16 | 东南大学 | A kind of multi-tag object identification method based on convolutional neural networks |
CN108710892A (en) * | 2018-04-04 | 2018-10-26 | 浙江工业大学 | Synergetic immunity defence method towards a variety of confrontation picture attacks |
CN109242090A (en) * | 2018-08-28 | 2019-01-18 | 电子科技大学 | A kind of video presentation and description consistency discrimination method based on GAN network |
CN109255047A (en) * | 2018-07-18 | 2019-01-22 | 西安电子科技大学 | Based on the complementary semantic mutual search method of image-text being aligned and symmetrically retrieve |
CN109614480A (en) * | 2018-11-26 | 2019-04-12 | 武汉大学 | A kind of generation method and device of the autoabstract based on production confrontation network |
CN109635273A (en) * | 2018-10-25 | 2019-04-16 | 平安科技(深圳)有限公司 | Text key word extracting method, device, equipment and storage medium |
CN109685116A (en) * | 2018-11-30 | 2019-04-26 | 腾讯科技(深圳)有限公司 | Description information of image generation method and device and electronic device |
CN109697694A (en) * | 2018-12-07 | 2019-04-30 | 山东科技大学 | The generation method of high-resolution picture based on bull attention mechanism |
CN109887494A (en) * | 2017-12-01 | 2019-06-14 | 腾讯科技(深圳)有限公司 | The method and apparatus of reconstructed speech signal |
CN109918509A (en) * | 2019-03-12 | 2019-06-21 | 黑龙江世纪精彩科技有限公司 | Scene generating method and scene based on information extraction generate the storage medium of system |
CN109933677A (en) * | 2019-02-14 | 2019-06-25 | 厦门一品威客网络科技股份有限公司 | Image generating method and image generation system |
CN109978550A (en) * | 2019-03-12 | 2019-07-05 | 同济大学 | A kind of credible electronic transaction clearance mechanism based on generation confrontation network |
CN110085215A (en) * | 2018-01-23 | 2019-08-02 | 中国科学院声学研究所 | A kind of language model data Enhancement Method based on generation confrontation network |
WO2019179100A1 (en) * | 2018-03-20 | 2019-09-26 | 苏州大学张家港工业技术研究院 | Medical text generation method based on generative adversarial network technology |
CN110533074A (en) * | 2019-07-30 | 2019-12-03 | 华南理工大学 | A kind of picture classification automatic marking method and system based on dual-depth neural network |
CN110533588A (en) * | 2019-07-16 | 2019-12-03 | 中国农业大学 | Based on the root system image repair method for generating confrontation network |
WO2019237860A1 (en) * | 2018-06-15 | 2019-12-19 | 腾讯科技(深圳)有限公司 | Image annotation method and device |
CN110889469A (en) * | 2019-09-19 | 2020-03-17 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
CN111143617A (en) * | 2019-12-12 | 2020-05-12 | 浙江大学 | Automatic generation method and system for picture or video text description |
CN111488473A (en) * | 2019-01-28 | 2020-08-04 | 北京京东尚科信息技术有限公司 | Picture description generation method and device and computer readable storage medium |
RU2735148C1 (en) * | 2019-12-09 | 2020-10-28 | Самсунг Электроникс Ко., Лтд. | Training gan (generative adversarial networks) to create pixel-by-pixel annotation |
CN112292695A (en) * | 2018-06-20 | 2021-01-29 | 西门子工业软件公司 | Method for generating a test data set, method for testing, method for operating a system, device, control system, computer program product, computer-readable medium, generation and application |
CN112347742A (en) * | 2020-10-29 | 2021-02-09 | 青岛科技大学 | Method for generating document image set based on deep learning |
CN112818159A (en) * | 2021-02-24 | 2021-05-18 | 上海交通大学 | Image description text generation method based on generation countermeasure network |
CN113077013A (en) * | 2021-04-28 | 2021-07-06 | 上海联麓半导体技术有限公司 | High-dimensional data fault anomaly detection method and system based on generation countermeasure network |
CN114241263A (en) * | 2021-12-17 | 2022-03-25 | 电子科技大学 | Radar interference semi-supervised open set identification system based on generation countermeasure network |
US11514694B2 (en) | 2019-09-20 | 2022-11-29 | Samsung Electronics Co., Ltd. | Teaching GAN (generative adversarial networks) to generate per-pixel annotation |
CN116795972A (en) * | 2023-08-11 | 2023-09-22 | 之江实验室 | Model training method and device, storage medium and electronic equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170150235A1 (en) * | 2015-11-20 | 2017-05-25 | Microsoft Technology Licensing, Llc | Jointly Modeling Embedding and Translation to Bridge Video and Language |
Non-Patent Citations (2)
Title |
---|
BO DAI等: "Towards Diverse and Natural Image Descriptions via a Conditional GAN", 《ARXIV》 * |
ORIOL VINYALS: "Show and Tell: A Neural Image Caption Generator", 《CVPR 2015》 * |
Cited By (51)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107944358A (en) * | 2017-11-14 | 2018-04-20 | 华南理工大学 | A kind of human face generating method based on depth convolution confrontation network model |
US11482237B2 (en) | 2017-12-01 | 2022-10-25 | Tencent Technology (Shenzhen) Company Limited | Method and terminal for reconstructing speech signal, and computer storage medium |
CN109887494A (en) * | 2017-12-01 | 2019-06-14 | 腾讯科技(深圳)有限公司 | The method and apparatus of reconstructed speech signal |
CN107968962A (en) * | 2017-12-12 | 2018-04-27 | 华中科技大学 | A kind of video generation method of the non-conterminous image of two frames based on deep learning |
KR101894278B1 (en) * | 2018-01-18 | 2018-09-04 | 주식회사 뷰노 | Method for reconstructing a series of slice images and apparatus using the same |
US11816833B2 (en) | 2018-01-18 | 2023-11-14 | Vuno Inc. | Method for reconstructing series of slice images and apparatus using same |
CN110085215A (en) * | 2018-01-23 | 2019-08-02 | 中国科学院声学研究所 | A kind of language model data Enhancement Method based on generation confrontation network |
CN110085215B (en) * | 2018-01-23 | 2021-06-08 | 中国科学院声学研究所 | Language model data enhancement method based on generation countermeasure network |
WO2019179100A1 (en) * | 2018-03-20 | 2019-09-26 | 苏州大学张家港工业技术研究院 | Medical text generation method based on generative adversarial network technology |
CN108710892B (en) * | 2018-04-04 | 2020-09-01 | 浙江工业大学 | Cooperative immune defense method for multiple anti-picture attacks |
CN108710892A (en) * | 2018-04-04 | 2018-10-26 | 浙江工业大学 | Synergetic immunity defence method towards a variety of confrontation picture attacks |
CN108520282B (en) * | 2018-04-13 | 2020-04-03 | 湘潭大学 | Triple-GAN-based classification method |
CN108520282A (en) * | 2018-04-13 | 2018-09-11 | 湘潭大学 | A kind of sorting technique based on Triple-GAN |
CN108664924B (en) * | 2018-05-10 | 2022-07-08 | 东南大学 | Multi-label object identification method based on convolutional neural network |
CN108664924A (en) * | 2018-05-10 | 2018-10-16 | 东南大学 | A kind of multi-tag object identification method based on convolutional neural networks |
US11494595B2 (en) | 2018-06-15 | 2022-11-08 | Tencent Technology (Shenzhen) Company Limited | Method , apparatus, and storage medium for annotating image |
WO2019237860A1 (en) * | 2018-06-15 | 2019-12-19 | 腾讯科技(深圳)有限公司 | Image annotation method and device |
CN112292695A (en) * | 2018-06-20 | 2021-01-29 | 西门子工业软件公司 | Method for generating a test data set, method for testing, method for operating a system, device, control system, computer program product, computer-readable medium, generation and application |
CN109255047A (en) * | 2018-07-18 | 2019-01-22 | 西安电子科技大学 | Image-text mutual retrieval method based on complementary semantic alignment and symmetric retrieval |
CN109242090A (en) * | 2018-08-28 | 2019-01-18 | 电子科技大学 | Video description and description-consistency discrimination method based on GAN network |
CN109635273A (en) * | 2018-10-25 | 2019-04-16 | 平安科技(深圳)有限公司 | Text keyword extraction method, device, equipment and storage medium |
CN109614480A (en) * | 2018-11-26 | 2019-04-12 | 武汉大学 | Automatic summarization method and device based on generative adversarial network |
CN109685116B (en) * | 2018-11-30 | 2022-12-30 | 腾讯科技(深圳)有限公司 | Image description information generation method and device and electronic device |
US11783199B2 (en) * | 2018-11-30 | 2023-10-10 | Tencent Technology (Shenzhen) Company Limited | Image description information generation method and apparatus, and electronic device |
CN109685116A (en) * | 2018-11-30 | 2019-04-26 | 腾讯科技(深圳)有限公司 | Image description information generation method and device, and electronic device |
WO2020108165A1 (en) * | 2018-11-30 | 2020-06-04 | 腾讯科技(深圳)有限公司 | Image description information generation method and device, and electronic device |
US20210042579A1 (en) * | 2018-11-30 | 2021-02-11 | Tencent Technology (Shenzhen) Company Limited | Image description information generation method and apparatus, and electronic device |
CN109697694B (en) * | 2018-12-07 | 2023-04-07 | 山东科技大学 | Method for generating high-resolution picture based on multi-head attention mechanism |
CN109697694A (en) * | 2018-12-07 | 2019-04-30 | 山东科技大学 | High-resolution picture generation method based on multi-head attention mechanism |
CN111488473A (en) * | 2019-01-28 | 2020-08-04 | 北京京东尚科信息技术有限公司 | Picture description generation method and device and computer readable storage medium |
CN111488473B (en) * | 2019-01-28 | 2023-11-07 | 北京京东尚科信息技术有限公司 | Picture description generation method, device and computer readable storage medium |
CN109933677A (en) * | 2019-02-14 | 2019-06-25 | 厦门一品威客网络科技股份有限公司 | Image generation method and image generation system |
CN109918509A (en) * | 2019-03-12 | 2019-06-21 | 黑龙江世纪精彩科技有限公司 | Scene generation method based on information extraction, and storage medium of scene generation system |
CN109978550A (en) * | 2019-03-12 | 2019-07-05 | 同济大学 | Trusted electronic transaction clearing mechanism based on generative adversarial network |
CN110533588A (en) * | 2019-07-16 | 2019-12-03 | 中国农业大学 | Root system image restoration method based on generative adversarial network |
CN110533588B (en) * | 2019-07-16 | 2021-09-21 | 中国农业大学 | Root system image restoration method based on generative adversarial network |
CN110533074B (en) * | 2019-07-30 | 2022-03-29 | 华南理工大学 | Automatic image category labeling method and system based on double-depth neural network |
CN110533074A (en) * | 2019-07-30 | 2019-12-03 | 华南理工大学 | Automatic image category labeling method and system based on dual deep neural networks |
CN110889469B (en) * | 2019-09-19 | 2023-07-21 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
CN110889469A (en) * | 2019-09-19 | 2020-03-17 | 北京市商汤科技开发有限公司 | Image processing method and device, electronic equipment and storage medium |
US11514694B2 (en) | 2019-09-20 | 2022-11-29 | Samsung Electronics Co., Ltd. | Teaching GAN (generative adversarial networks) to generate per-pixel annotation |
RU2735148C1 (en) * | 2019-12-09 | 2020-10-28 | Самсунг Электроникс Ко., Лтд. | Training gan (generative adversarial networks) to create pixel-by-pixel annotation |
CN111143617A (en) * | 2019-12-12 | 2020-05-12 | 浙江大学 | Automatic generation method and system for picture or video text description |
CN112347742B (en) * | 2020-10-29 | 2022-05-31 | 青岛科技大学 | Method for generating document image set based on deep learning |
CN112347742A (en) * | 2020-10-29 | 2021-02-09 | 青岛科技大学 | Method for generating document image set based on deep learning |
CN112818159A (en) * | 2021-02-24 | 2021-05-18 | 上海交通大学 | Image description text generation method based on generative adversarial network |
CN113077013A (en) * | 2021-04-28 | 2021-07-06 | 上海联麓半导体技术有限公司 | High-dimensional data fault anomaly detection method and system based on generative adversarial network |
CN114241263A (en) * | 2021-12-17 | 2022-03-25 | 电子科技大学 | Radar interference semi-supervised open-set recognition system based on generative adversarial network |
CN114241263B (en) * | 2021-12-17 | 2023-05-02 | 电子科技大学 | Radar interference semi-supervised open-set recognition system based on generative adversarial network |
CN116795972A (en) * | 2023-08-11 | 2023-09-22 | 之江实验室 | Model training method and device, storage medium and electronic equipment |
CN116795972B (en) * | 2023-08-11 | 2024-01-09 | 之江实验室 | Model training method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107330444A (en) | Automatic image text annotation method based on generative adversarial network | |
CN106778506A (en) | Facial expression recognition method fusing depth images and multi-channel features | |
CN110443231A (en) | Single-hand fingertip point-reading character recognition method and system based on artificial intelligence | |
CN110175251A (en) | Zero-shot sketch retrieval method based on semantic adversarial network | |
CN107506722A (en) | Facial emotion recognition method based on deep sparse convolutional neural network | |
CN106202044A (en) | Entity relation extraction method based on deep neural network | |
CN108416065A (en) | Image-to-sentence description generation system and method based on hierarchical neural network | |
CN108984530A (en) | Detection method and detection system for sensitive network content | |
CN110516539A (en) | Remote sensing image building extraction method, system, storage medium and device based on adversarial network | |
CN110458003B (en) | Facial expression action unit adversarial synthesis method based on local attention model | |
CN107392147A (en) | Image-to-sentence conversion method based on improved generative adversarial network | |
CN108182409A (en) | Liveness detection method, device, equipment and storage medium | |
CN110009057A (en) | Graphical verification code recognition method based on deep learning | |
CN106875007A (en) | End-to-end convolutional long short-term memory deep neural network for voice fraud detection | |
CN107145514B (en) | Chinese sentence pattern classification method based on decision tree and SVM mixed model | |
CN110532912A (en) | Sign language translation implementation method and device | |
CN108765383A (en) | Video description method based on deep transfer learning | |
CN112541529A (en) | Expression and posture fusion bimodal teaching evaluation method, device and storage medium | |
CN109934204A (en) | Facial expression recognition method based on convolutional neural networks | |
CN107066979A (en) | Human motion recognition method based on depth information and multi-dimensional convolutional neural networks | |
CN113642621A (en) | Zero-shot image classification method based on generative adversarial network | |
CN109711356A (en) | Facial expression recognition method and system | |
CN112069993B (en) | Dense face detection method and system based on facial-feature mask constraints, and storage medium | |
CN109670559A (en) | Recognition method, device, equipment and storage medium for handwritten Chinese characters | |
CN109871898A (en) | Method for generating deposit training samples using generative adversarial network | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20171107 |
|