CN107807971A - A kind of automated graphics semantic description method - Google Patents

A kind of automated graphics semantic description method

Publication number
CN107807971A
Authority
CN
China
Prior art keywords
semantic description
automated graphics
gru
description method
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710969647.9A
Other languages
Chinese (zh)
Inventor
吕学强
董志安
李卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University filed Critical Beijing Information Science and Technology University
Priority to CN201710969647.9A priority Critical patent/CN107807971A/en
Publication of CN107807971A publication Critical patent/CN107807971A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an automatic image semantic description method, which comprises building and training an automatic image semantic description model based on a CNN and a GRU, specifically: step 1) defining an objective function; step 2) translating the image into a semantic description; step 3) backpropagating the error. The method uses the features of a fully connected layer extracted by the CNN as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. It achieves high semantic description accuracy with relatively few parameters and can meet the needs of practical applications well.

Description

An automatic image semantic description method
Technical field
The invention belongs to the technical field of image semantic description, and in particular relates to an automatic image semantic description method.
Background
In recent years, researchers have worked continuously on enabling computers to understand image semantics. With the development of computer hardware, automatic image semantic description has become a research hotspot. Automatic image semantic description aims to understand not only the entities in an image but also the events and scenes it depicts, which is a deeper understanding of image semantics. The technology is still at an early stage: because of the "semantic gap" between images and natural language, and because the syntactic structure of natural language is itself complex and variable, computers have so far been unable to describe the semantic information of an image accurately. With the rapid development of computer hardware and of deep learning in the image domain, more and more researchers have taken up automatic image semantic description. Deep learning models such as convolutional neural networks extract image features better than hand-designed features, but they depend on substantial computing power. The development of hardware such as GPUs has provided that computing support, which has made the difficult task of automatic image semantic description a current research hotspot in machine vision.
The world is moving toward an era of intelligence. More and more future-oriented technologies, such as autonomous driving and intelligent robots, are entering public view. Autonomous driving requires computers to understand road conditions automatically, and intelligent robots need artificial eyes that simulate the functions of the human eye and brain to recognize surrounding objects. All of these rely on a computer's in-depth understanding of images. Automatic image semantic description automatically describes image content in natural language, which a computer can then use to understand the image. It is therefore a supporting technology for the coming era of intelligence and has significant research and commercial value. Research on automatic image semantic description is at an early stage, and few results have been obtained so far. On the one hand, image content is itself complex, the image "semantic gap" problem has not yet been solved effectively, and the accuracy of object recognition in images is not high. On the other hand, automatic image semantic description expresses image content as natural language, but natural language has no fixed form and many sentence patterns, so expressing image content as content-rich, varied natural language is difficult and challenging. The conventional approach first labels the entities in the image with words and then combines the words into sentences using a language model. Because the content of an image is rich, and some objects may be occluded or incomplete, the objects obtained after image segmentation cannot be recognized and labeled accurately, which directly lowers the accuracy of the semantic description. Moreover, the descriptions produced by this approach are uniform in form and simple in structure, and their understanding of image semantics is neither accurate nor comprehensive. In recent years convolutional neural networks (CNN) have been applied to image feature extraction. In the prior art, image features extracted by a CNN are used as the input of a recurrent neural network (RNN) and the image semantic description is the output of the RNN; the image semantic description problem is treated as a translation from image to semantic description, and an automatic image semantic description model based on CNN and RNN is constructed. However, this approach does not understand image semantics accurately enough: the sentences produced by the model are not fluent, and the accuracy of the described content is not high.
Summary of the invention
In view of the above problems in the prior art, the object of the invention is to provide an automatic image semantic description method that avoids the above technical deficiencies.
To achieve the above object, the technical solution provided by the invention is as follows:
An automatic image semantic description method, comprising building and training an automatic image semantic description model based on a CNN and a GRU, specifically:
Step 1) defining an objective function;
Step 2) translating the image into a semantic description;
Step 3) backpropagating the error.
Further, the objective function in step 1) is
where θ denotes all the parameters of the model, I denotes an image, and S = (S_0, …, S_N) denotes the finally predicted word sequence, i.e. the final semantic description.
Further, step 2) is given by the following formulas:
x_{-1} = CNN(I);
x_t = W_e s_t, t ∈ {0, …, N-1};
h_t = GRU(x_t), t ∈ {0, …, N-1};
p_{t+1} = g(W_p h_t);
where I denotes an image and S = (s_0, s_1, s_2, …, s_N) denotes the complete semantic description of the image, composed of N words. Each s_t is one-hot encoded; s_0 is a special word "start" marking the beginning of a sentence, and s_N is a special word "end" marking the end of a sentence.
Further, step 3) comprises:
Defining a loss function: L(I, S) = -\sum_{t=1}^{N} \log p_t(s_t), i.e. the negative of the sum, over all time steps, of the log probability of predicting the correct word, which is the cross-entropy loss function;
Continuously updating the parameters of the model by training so that the loss value is as small as possible;
Updating the parameters using stochastic gradient descent and the chain rule.
Further, the parameters include the internal parameters of the GRU model, the word-vector encoding parameters, the image-feature encoding parameters, and the output decoding parameters.
Further, during model training, the weight parameters of the GRU network are shared across all time steps, and the output of the GRU network at the previous time step serves as part of the input of the GRU network at the current time step.
Further, the CNN includes two kinds of hidden layer structure: convolutional layers and pooling layers.
Further, the neurons of one CNN layer are not fully connected to the neurons of the previous layer, i.e. the connections between neurons are local (local receptive fields); in addition, the connections share the same weights, i.e. the neuron connections are weight-shared.
Further, the GRU structure contains a reset gate, expressed as r_t = σ(W_r x_t + U_r h_{t-1}), where σ is the activation function, x_t is the input at time t, h_{t-1} is the hidden layer output at time t-1, W_r is the input-layer weight of the reset gate at time t, and U_r is the hidden-layer weight of the reset gate at time t.
Further, the GRU structure contains an update gate, which can be expressed as: z_t = σ(W_z x_t + U_z h_{t-1}),
where σ is the activation function, x_t is the input at time t, h_{t-1} is the hidden layer output at time t-1, W_z is the input-layer weight of the update gate at time t, and U_z is the hidden-layer weight of the update gate at time t.
The automatic image semantic description method provided by the invention uses the features of a fully connected layer extracted by the CNN as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. It achieves high semantic description accuracy with relatively few parameters and can meet the needs of practical applications well.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of building and training the automatic image semantic description model based on CNN and GRU according to the invention;
Fig. 2 is a schematic diagram of the structure of the automatic image semantic description model based on CNN and GRU;
Fig. 3 is a schematic diagram of the basic structure of a traditional neural network model;
Fig. 4 is a schematic diagram of the conventional structure of an RNN model;
Fig. 5 is a schematic diagram of the structure of the GRU model.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the invention.
An automatic image semantic description method comprises the step of building and training an automatic image semantic description model based on CNN and GRU. Referring to Fig. 1, this step specifically comprises the following steps:
Define the objective function:
where θ denotes all the parameters of the model, I denotes an image, and S = (S_0, …, S_N) denotes the finally predicted word sequence, i.e. the final semantic description.
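The formula for the objective function is not reproduced in this text. A form consistent with the symbol definitions above, and with the maximum-likelihood objective of the cited "Show and Tell" paper, is sketched below. It is an assumption offered for orientation, not the patent's verbatim formula; the second line is the usual chain-rule factorization of the sentence probability over its words.

```latex
\theta^{*} = \arg\max_{\theta} \sum_{(I,S)} \log p(S \mid I; \theta),
\qquad
\log p(S \mid I; \theta) = \sum_{t=0}^{N} \log p(S_t \mid I, S_0, \ldots, S_{t-1}; \theta)
```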
Translate the image into a semantic description, as given by the following formulas:
x_{-1} = CNN(I)  (2)
x_t = W_e s_t, t ∈ {0, …, N-1}  (3)
h_t = GRU(x_t), t ∈ {0, …, N-1}  (4)
p_{t+1} = g(W_p h_t)  (5)
where I denotes an image and S = (s_0, s_1, s_2, …, s_N) denotes the complete semantic description of the image, composed of N words. Each s_t is one-hot encoded: a vector in which a single position is 1 and all other positions are 0, and whose dimension equals the number of words in the vocabulary. s_0 is a special word "start" marking the beginning of a sentence, and s_N is a special word "end" marking the end of a sentence. The image features are fed into the GRU network only at time t = -1; at subsequent time steps the words of the semantic description S corresponding to the input image are fed into the GRU network in order. To ensure that the information fed into the GRU network has the same dimension at every time step, both the image features input at t = -1 and the words s_t input afterwards must be encoded: s_t is encoded by the word weight parameter W_e, and the image features are encoded by the image encoding parameter W_L. From time t = 0 onward, the output h_t of the GRU model at each time step is decoded by the output decoding parameter W_p and then passed through a softmax classifier to obtain a prediction p_{t+1} (formula (5)); that is, at each time step the model predicts the word that follows the current input word in the semantic description. There is a gap between this prediction and the correct next word, and during training this error must be backpropagated.
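The word-encoding and output-decoding steps can be illustrated with a minimal standard-library Python sketch. The vocabulary, the weight values, and the stand-in hidden vector `h_t` are hypothetical and chosen only for illustration; the function g is assumed to be softmax, as the text describes a softmax classifier.

```python
import math

def one_hot(index, vocab_size):
    """Return a one-hot list with 1.0 at `index` and 0.0 elsewhere."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary and weights (illustrative values only).
vocab = ["start", "a", "dog", "end"]
W_e = [[0.1, 0.2, 0.3, 0.4],   # word encoding: 2 x vocabulary size
       [0.5, 0.6, 0.7, 0.8]]
W_p = [[0.2, -0.1],            # output decoding: vocabulary size x 2
       [0.0, 0.3],
       [0.6, 0.1],
       [-0.2, 0.2]]

s_t = one_hot(vocab.index("dog"), len(vocab))
x_t = matvec(W_e, s_t)               # formula (3): picks one column of W_e
h_t = [0.5, -0.5]                    # stand-in for the GRU output of formula (4)
p_next = softmax(matvec(W_p, h_t))   # formula (5) with g assumed to be softmax
predicted = vocab[p_next.index(max(p_next))]
```

Because s_t is one-hot, the product W_e s_t simply selects the column of W_e that belongs to the word, which is why the encoding acts as a lookup of a word vector.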
Define the following loss function:
L(I, S) = -\sum_{t=1}^{N} \log p_t(s_t)  (6)
Model training is the process of backpropagating the error and updating the model parameters. The loss function in formula (6) is the negative of the sum, over all time steps, of the log probability of predicting the correct word, i.e. the cross-entropy loss function. Training continuously updates the parameters of the model so that the loss value is as small as possible. These parameters include the internal parameters of the GRU model, the word-vector encoding parameters, the image-feature encoding parameters, the output decoding parameters, and so on. They are updated using stochastic gradient descent (SGD) and the chain rule. During training, the weight parameters of the GRU network are shared across all time steps, and the output of the GRU network at the previous time step serves as part of the input of the GRU network at the current time step (see formulas (1)-(4)).
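A minimal standard-library Python sketch of the loss and the parameter update described above follows. The function names, the toy probabilities, and the learning rate are illustrative assumptions. The gradient helper uses the standard result that, for a softmax output with cross-entropy loss, the gradient with respect to the pre-softmax scores is the predicted distribution minus the one-hot target; the remaining gradients follow by the chain rule.

```python
import math

def cross_entropy_loss(predictions, target_indices):
    """Negative sum of the log-probabilities assigned to the correct words."""
    return -sum(math.log(p[t]) for p, t in zip(predictions, target_indices))

def softmax_xent_grad(probs, target):
    """Gradient of -log(probs[target]) w.r.t. the pre-softmax scores."""
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

def sgd_update(params, grads, learning_rate):
    """One stochastic gradient descent step on a flat list of parameters."""
    return [w - learning_rate * g for w, g in zip(params, grads)]

# Toy example: two time steps over a three-word vocabulary.
predictions = [[0.5, 0.25, 0.25],   # distribution predicted at step 1
               [0.1, 0.8, 0.1]]     # distribution predicted at step 2
targets = [0, 1]                    # indices of the correct words
loss = cross_entropy_loss(predictions, targets)
grad = softmax_xent_grad(predictions[1], targets[1])
new_params = sgd_update([1.0, -2.0], [0.5, -1.0], learning_rate=0.1)
```

The loss is zero only when every correct word receives probability 1, so minimizing it pushes probability mass toward the reference description.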
A convolutional neural network (CNN) contains two distinctive kinds of hidden layer: convolutional layers and pooling layers. With the recent development of computer hardware (CPU, GPU), computing performance has improved greatly, and more complex neural network models such as CNNs, which rely on that performance, have become a research hotspot in many fields. CNNs have good feature extraction ability and are now widely used in fields such as image, video and speech processing.
The CNN network structure is distinctive in two respects. First, the neurons of one layer are not fully connected to the neurons of the previous layer; the connections are local (local receptive fields). Second, the connections share the same weights (weight sharing). This structure of local receptive fields and shared weights resembles biological neural networks, and it effectively reduces both the number of parameters and the complexity of the network. A convolutional layer in a CNN consists of multiple convolution kernels; each kernel is a filter of size M*M that extracts a particular local feature at every local position of the receptive field in the previous layer. A pooling layer reduces the dimensionality of the convolutional features of the previous layer: it divides them into multiple N*N regions and takes the average (or maximum) value of each region as the reduced feature. After a series of convolutional, pooling and fully connected layers, a CNN usually ends with a softmax classifier for handling multi-class classification problems.
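The convolution and pooling operations described above can be sketched in standard-library Python as follows. This is a single-channel illustration under simplifying assumptions (stride 1, no padding, non-overlapping pooling regions); the function names and the toy image are hypothetical. As in most deep learning libraries, the "convolution" is computed without flipping the kernel.

```python
def conv2d_valid(image, kernel):
    """Slide an M x M kernel over the image (stride 1, no padding)."""
    m = len(kernel)
    out_h = len(image) - m + 1
    out_w = len(image[0]) - m + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(m) for b in range(m))
             for j in range(out_w)]
            for i in range(out_h)]

def pool2d(feature, n, mode="max"):
    """Split the feature map into non-overlapping n x n regions and reduce each."""
    reducer = max if mode == "max" else (lambda vals: sum(vals) / len(vals))
    return [[reducer([feature[i + a][j + b]
                      for a in range(n) for b in range(n)])
             for j in range(0, len(feature[0]) - n + 1, n)]
            for i in range(0, len(feature) - n + 1, n)]

# Toy 5 x 5 image with pixel values 0..24 and a 2 x 2 kernel.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0],
          [0, 1]]
features = conv2d_valid(image, kernel)       # 4 x 4 feature map
max_pooled = pool2d(features, 2, "max")      # 2 x 2 after max pooling
avg_pooled = pool2d(features, 2, "average")  # 2 x 2 after average pooling
```

The same four kernel weights are reused at every position of the image, which is the weight sharing the text refers to, and each output value depends only on a 2 x 2 patch, which is the local receptive field.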
A recurrent neural network (Recurrent Neural Network, hereinafter RNN) is a kind of neural network model. Because its structure provides a memory function, it is used in machine translation models. A neural network model comprises three kinds of layer: an input layer, hidden layers and an output layer. In a traditional neural network model, such as the convolutional neural network described above, the layers are connected from the input layer through the hidden layers to the output layer, but the nodes within a layer are not connected to each other; the structure is shown in Fig. 3. Such a model has no memory, so it cannot handle problems whose computation depends on information produced earlier. For example, predicting the next word of a sentence usually requires the words that came before. In the sentence "I am a basketball player, and I like to play basketball", "play basketball" in the second clause is inferred from "basketball player" in the first. An RNN can remember information produced at earlier time steps and use it in the current computation. This comes from a structural change relative to the traditional model: the input of the RNN hidden layer includes not only the output of the input layer at the current time step but also the output of the hidden layer at the previous time step, i.e. the nodes within the hidden layer are connected. The structure is shown in Fig. 4.
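The recurrence described above can be sketched in standard-library Python. The text does not give the RNN formula, so the common form h_t = tanh(W x_t + U h_{t-1}) is assumed here; the weights and input sequences are hypothetical toy values.

```python
import math

def rnn_step(x, h_prev, W, U):
    """One recurrent step: h_t = tanh(W x_t + U h_{t-1}) (assumed form)."""
    return [math.tanh(sum(w * a for w, a in zip(w_row, x)) +
                      sum(u * b for u, b in zip(u_row, h_prev)))
            for w_row, u_row in zip(W, U)]

def run_rnn(inputs, W, U, hidden_size):
    """Feed a sequence through the RNN, starting from a zero hidden state."""
    h = [0.0] * hidden_size
    for x in inputs:
        h = rnn_step(x, h, W, U)
    return h

W = [[0.5], [-0.3]]             # input-to-hidden weights (hidden 2, input 1)
U = [[0.1, 0.2], [0.0, 0.4]]    # hidden-to-hidden weights

# Two sequences with the same final input but a different earlier input.
h_a = run_rnn([[1.0], [0.5]], W, U, hidden_size=2)
h_b = run_rnn([[-1.0], [0.5]], W, U, hidden_size=2)
```

The two final hidden states differ even though the last input is identical, because the hidden state carries information from the earlier time step. A feed-forward model applied to the last input alone could not tell the two sequences apart.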
GRU is an improvement on the RNN model. Like the LSTM model, it can solve the long-term dependency problem of RNN models; compared with LSTM it has fewer internal parameters, is less prone to overfitting during training, and trains faster. The specific structure of the GRU is shown in Fig. 5.
Referring to Fig. 5, there is a gate r, called the reset gate, which can be expressed as:
r_t = σ(W_r x_t + U_r h_{t-1})
where σ is the activation function, x_t is the input at time t, h_{t-1} is the hidden layer output at time t-1, W_r is the input-layer weight of the reset gate at time t, and U_r is the hidden-layer weight of the reset gate at time t.
Fig. 5 contains another gate z, called the update gate, which can be expressed as:
z_t = σ(W_z x_t + U_z h_{t-1})
where σ is the activation function, x_t is the input at time t, h_{t-1} is the hidden layer output at time t-1, W_z is the input-layer weight of the update gate at time t, and U_z is the hidden-layer weight of the update gate at time t.
As can be seen from Fig. 5, the hidden layer output at time t can be expressed as:
h_t = z_t ⊙ h_{t-1} + (1 - z_t) ⊙ \tilde{h}_t  (3.3)
where z_t is the update gate state at time t, h_{t-1} is the hidden layer output at time t-1, and \tilde{h}_t is the hidden layer state at time t, given by formula (3.4):
\tilde{h}_t = φ(W_t x_t + U_t (r_t ⊙ h_{t-1}))  (3.4)
where φ is the activation function, ⊙ denotes element-wise multiplication, W_t is the input-layer weight at time t, U_t is the hidden-layer weight at time t, r_t is the reset gate state at time t, and h_{t-1} is the hidden layer output at time t-1.
Formula (3.4) shows that when the reset gate r is close to 0, the second term of \tilde{h}_t is close to 0; the hidden state at time t then contains only the input at the current time step, and the hidden layer output of the previous time step is ignored. This allows the hidden layer to discard information that is irrelevant to later time steps, so that the information it keeps is more meaningful. The update gate, in turn, controls the trade-off between the hidden layer information of the previous time step and the hidden state of the current time step. Formula (3.3) shows that when z_t is 1 the hidden layer at time t outputs only the hidden layer information of the previous time step, and when z_t is 0 it outputs only the hidden layer state of time t. The reset gate and the update gate at each time step are independent of each other: when recent information is needed, the reset gate is active; when long-range information is needed, the update gate is active.
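The gate behaviour described above can be checked with a minimal standard-library Python sketch of one GRU step. It assumes the standard GRU form without bias terms, the logistic sigmoid for σ and tanh for φ; the weight names in the dictionary and all numeric values are hypothetical. The update convention follows the text: an update gate near 1 keeps the previous hidden state, and an update gate near 0 takes the new candidate state.

```python
import math

def sigmoid(v):
    """Element-wise logistic sigmoid."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def hadamard(u, v):
    """Element-wise product of two vectors."""
    return [a * b for a, b in zip(u, v)]

def gru_step(x, h_prev, p):
    """One GRU step; p maps hypothetical weight names to matrices."""
    r = sigmoid(add(matvec(p["W_r"], x), matvec(p["U_r"], h_prev)))  # reset gate
    z = sigmoid(add(matvec(p["W_z"], x), matvec(p["U_z"], h_prev)))  # update gate
    candidate = [math.tanh(a) for a in
                 add(matvec(p["W"], x), matvec(p["U"], hadamard(r, h_prev)))]
    # z near 1 keeps the previous state; z near 0 takes the candidate state.
    return [zi * hp + (1.0 - zi) * c for zi, hp, c in zip(z, h_prev, candidate)]

zeros = [[0.0, 0.0], [0.0, 0.0]]
base = {"W": [[0.5], [-0.5]], "U": [[1.0, 0.0], [0.0, 1.0]],
        "U_r": zeros, "U_z": zeros}
# Large positive W_z drives z toward 1; large negative values drive gates toward 0.
keep = dict(base, W_r=[[0.0], [0.0]], W_z=[[50.0], [50.0]])
fresh = dict(base, W_r=[[-50.0], [-50.0]], W_z=[[-50.0], [-50.0]])

x, h_prev = [1.0], [0.3, -0.7]
h_keep = gru_step(x, h_prev, keep)    # approximately h_prev
h_fresh = gru_step(x, h_prev, fresh)  # approximately tanh(W x), history ignored
```

With the `keep` weights the step returns essentially the previous hidden state, and with the `fresh` weights it returns essentially tanh(W x), independent of `h_prev`, matching the two limiting cases discussed in the text.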
The automatic image semantic description method provided by the invention uses the features of a fully connected layer extracted by the CNN as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. It achieves high semantic description accuracy with relatively few parameters and can meet the needs of practical applications well.
The embodiments described above express only some implementations of the invention. Although they are described specifically and in detail, they should not be construed as limiting the patent scope of the invention. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection of this patent is therefore determined by the appended claims.

Claims (10)

1. An automatic image semantic description method, characterized by comprising building and training an automatic image semantic description model based on a CNN and a GRU, specifically:
Step 1) defining an objective function;
Step 2) translating the image into a semantic description;
Step 3) backpropagating the error.
2. The automatic image semantic description method according to claim 1, characterized in that the objective function in step 1) is
3. The automatic image semantic description method according to claims 1-2, characterized in that step 2) is given by the following formulas:
x_{-1} = CNN(I);
x_t = W_e s_t, t ∈ {0, …, N-1};
h_t = GRU(x_t), t ∈ {0, …, N-1};
p_{t+1} = g(W_p h_t);
where I denotes an image and S = (s_0, s_1, s_2, …, s_N) denotes the complete semantic description of the image, composed of N words. Each s_t is one-hot encoded; s_0 is a special word "start" marking the beginning of a sentence, and s_N is a special word "end" marking the end of a sentence.
4. The automatic image semantic description method according to claim 1, characterized in that step 3) comprises:
Defining a loss function: L(I, S) = -\sum_{t=1}^{N} \log p_t(s_t), i.e. the negative of the sum, over all time steps, of the log probability of predicting the correct word, which is the cross-entropy loss function;
Continuously updating the parameters of the model by training so that the loss value is as small as possible;
Updating the parameters using stochastic gradient descent and the chain rule.
5. The automatic image semantic description method according to claim 1, characterized in that the parameters include the internal parameters of the GRU model, the word-vector encoding parameters, the image-feature encoding parameters, and the output decoding parameters.
6. The automatic image semantic description method according to claims 1-5, characterized in that, during model training, the weight parameters of the GRU network are shared across all time steps, and the output of the GRU network at the previous time step serves as part of the input of the GRU network at the current time step.
7. The automatic image semantic description method according to claims 1-6, characterized in that the CNN includes two kinds of hidden layer structure: convolutional layers and pooling layers.
8. The automatic image semantic description method according to claims 1-7, characterized in that the neurons of one CNN layer are not fully connected to the neurons of the previous layer, i.e. the connections between neurons are local (local receptive fields); in addition, the connections share the same weights, i.e. the neuron connections are weight-shared.
9. The automatic image semantic description method according to claims 1-8, characterized in that the GRU structure contains a reset gate, expressed as r_t = σ(W_r x_t + U_r h_{t-1}), where σ is the activation function, x_t is the input at time t, h_{t-1} is the hidden layer output at time t-1, W_r is the input-layer weight of the reset gate at time t, and U_r is the hidden-layer weight of the reset gate at time t.
10. The automatic image semantic description method according to claims 1-9, characterized in that the GRU structure contains an update gate, which can be expressed as: z_t = σ(W_z x_t + U_z h_{t-1}).
CN201710969647.9A 2017-10-18 2017-10-18 A kind of automated graphics semantic description method Pending CN107807971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710969647.9A CN107807971A (en) 2017-10-18 2017-10-18 A kind of automated graphics semantic description method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710969647.9A CN107807971A (en) 2017-10-18 2017-10-18 A kind of automated graphics semantic description method

Publications (1)

Publication Number Publication Date
CN107807971A true CN107807971A (en) 2018-03-16

Family

ID=61585067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710969647.9A Pending CN107807971A (en) 2017-10-18 2017-10-18 A kind of automated graphics semantic description method

Country Status (1)

Country Link
CN (1) CN107807971A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170255832A1 (en) * 2016-03-02 2017-09-07 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting Actions in Videos
CN107229967A (en) * 2016-08-22 2017-10-03 北京深鉴智能科技有限公司 A kind of hardware accelerator and method that rarefaction GRU neutral nets are realized based on FPGA
CN106901723A (en) * 2017-04-20 2017-06-30 济南浪潮高新科技投资发展有限公司 A kind of electrocardiographic abnormality automatic diagnosis method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ORIOL VINYALS 等: "Show and Tell: A Neural Image Caption Generator", 《2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) 》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416059A (en) * 2018-03-22 2018-08-17 北京市商汤科技开发有限公司 Training method and device, equipment, medium, the program of image description model
CN108830287A (en) * 2018-04-18 2018-11-16 哈尔滨理工大学 Chinese image semantic description method using a residual-connection Inception network fused with multilayer GRU
CN108764299B (en) * 2018-05-04 2020-10-23 北京物灵智能科技有限公司 Story model training and generating method and system, robot and storage device
CN108764299A (en) * 2018-05-04 2018-11-06 北京物灵智能科技有限公司 Story model training and generation method, system, robot and storage device
CN108710902A (en) * 2018-05-08 2018-10-26 江苏云立物联科技有限公司 A kind of sorting technique towards high-resolution remote sensing image based on artificial intelligence
CN109271926A (en) * 2018-09-14 2019-01-25 西安电子科技大学 Intelligent Radiation source discrimination based on GRU depth convolutional network
CN109447244A (en) * 2018-10-11 2019-03-08 中山大学 An advertisement recommendation method combining a gated recurrent unit neural network
CN109523993A (en) * 2018-11-02 2019-03-26 成都三零凯天通信实业有限公司 A kind of voice languages classification method merging deep neural network with GRU based on CNN
CN109523993B (en) * 2018-11-02 2022-02-08 深圳市网联安瑞网络科技有限公司 Voice language classification method based on CNN and GRU fusion deep neural network
CN109558838A (en) * 2018-11-29 2019-04-02 北京经纬恒润科技有限公司 A kind of object identification method and system
CN109710787A (en) * 2018-12-30 2019-05-03 陕西师范大学 Image Description Methods based on deep learning
CN109710787B (en) * 2018-12-30 2023-03-28 陕西师范大学 Image description method based on deep learning
CN110232413A (en) * 2019-05-31 2019-09-13 华北电力大学(保定) Insulator image semantic description method, system, and device based on GRU network
CN110889430A (en) * 2019-10-24 2020-03-17 中国科学院计算技术研究所 News image detection method, system and device based on multi-domain visual features

Similar Documents

Publication Publication Date Title
CN107807971A (en) A kind of automated graphics semantic description method
CN111985245B (en) Relationship extraction method and system based on attention cycle gating graph convolution network
CN110609891B (en) Visual dialog generation method based on context awareness graph neural network
Cheng et al. Facial expression recognition method based on improved VGG convolutional neural network
CN108875807B (en) Image description method based on multiple attention and multiple scales
CN110288665B (en) Image description method based on convolutional neural network, computer-readable storage medium and electronic device
CN108830287A (en) Chinese image semantic description method using a residual-connection Inception network fused with multilayer GRU
CN110390397B (en) Text inclusion recognition method and device
CN110134946B (en) Machine reading understanding method for complex data
CN112990296B (en) Image-text matching model compression and acceleration method and system based on orthogonal similarity distillation
CN112163425B (en) Text entity relation extraction method based on multi-feature information enhancement
CN110222163A (en) An intelligent question-answering method and system merging CNN and bidirectional LSTM
CN111008293A (en) Visual question-answering method based on structured semantic representation
CN108197294A (en) A kind of text automatic generation method based on deep learning
CN110826338B (en) Fine-grained semantic similarity recognition method for single-selection gate and inter-class measurement
CN111177376A (en) Chinese text classification method based on BERT and CNN hierarchical connection
CN107766320A (en) A kind of Chinese pronoun resolution method for establishing model and device
CN109597876A (en) A multi-turn dialogue answer selection model and method based on reinforcement learning
CN109710744A (en) A kind of data matching method, device, equipment and storage medium
CN112036276A (en) Artificial intelligent video question-answering method
CN111368142A (en) Video dense event description method based on generative adversarial network
CN113641819A (en) Multi-task sparse sharing learning-based argument mining system and method
CN111428481A (en) Entity relation extraction method based on deep learning
CN108805260A (en) An image caption generation method and device
CN115511069A (en) Neural network training method, data processing method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180316