CN107807971A - Automatic image semantic description method - Google Patents
Automatic image semantic description method
- Publication number
- CN107807971A CN107807971A CN201710969647.9A CN201710969647A CN107807971A CN 107807971 A CN107807971 A CN 107807971A CN 201710969647 A CN201710969647 A CN 201710969647A CN 107807971 A CN107807971 A CN 107807971A
- Authority
- CN
- China
- Prior art keywords
- semantic description
- automatic image
- gru
- description method
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The present invention relates to an automatic image semantic description method, comprising building and training an automatic image semantic description model based on a CNN and a GRU, specifically: step 1) defining the objective function; step 2) performing the translation process from image to semantic description; step 3) back-propagating the error. In the method provided by the invention, a fully connected layer feature extracted by the CNN serves as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. The method achieves high precision and accuracy, reaches a high semantic description precision with relatively few parameters, and can well meet the needs of practical applications.
Description
Technical field
The invention belongs to the technical field of image semantic description, and in particular relates to an automatic image semantic description method.
Background art
In recent years, researchers have worked continuously on enabling computers to understand image semantics. With the development of computer hardware, automatic image semantic description has become a research hotspot. Automatic image semantic description not only recognizes the entities in an image but also understands the events and scenes the image depicts, and thus represents a deeper level of image understanding. The field is still in its initial stage: because of the "semantic gap" between images and natural language, and because of the complex and variable syntactic structure of natural language itself, computers have so far been unable to describe image semantic information accurately. With the rapid development in recent years of computer hardware and of deep learning in the image domain, more and more researchers have devoted themselves to the study of automatic image semantic description. Deep learning models such as convolutional neural networks extract image features better than other hand-designed features, but deep learning depends on substantial computing power; the development of hardware such as GPUs has provided strong computational support, turning this difficult task into a current research hotspot of the machine vision field.
The world is moving toward an age of intelligence. More and more emerging technologies, such as autonomous driving and intelligent robots, are gradually entering the public view. Autonomous driving requires computers to understand traffic information automatically; intelligent robots need simulated eyes that reproduce the functions of the human eye and brain in order to recognize surrounding objects. All of this relies on deep image understanding by computers. Automatic image semantic description converts image content into natural language, which computers can then process further. It is therefore a supporting technology of the coming age of intelligence, with important research significance and commercial application value.
Research on automatic image semantic description is at an early stage, and few results have been obtained so far. On the one hand, image content is complex, the image "semantic gap" problem has not been solved effectively at this stage, and the accuracy of object recognition in images is not high. On the other hand, automatic image semantic description expresses image content as natural language, which itself has no fixed form or sentence pattern; expressing image content as rich and varied natural language is difficult and highly challenging. The conventional approach first labels the entities in the image with words and then combines the words into complete sentences with a language model. Because images are rich in content, and because some objects in an image may be occluded or incomplete, the objects obtained by segmenting the image cannot always be recognized and labeled accurately, which directly lowers the precision of the semantic description. Moreover, the descriptions produced by this method are monotonous in form and simple in structure, so the resulting understanding of image semantics is neither accurate nor comprehensive enough. In recent years convolutional neural networks (CNNs) have been applied to image feature extraction. In the prior art, the image features extracted by a CNN serve as the input of a recurrent neural network (RNN) and the image semantic description serves as the RNN output, treating image semantic description as a process of translation from image to description and constructing an automatic image semantic description model based on a CNN and an RNN. However, the accuracy with which this method understands image semantics is not high: the sentences it produces are not fluent enough, and the accuracy of their content is low.
Summary of the invention
In view of the above problems in the prior art, the object of the present invention is to provide an automatic image semantic description method that avoids the technical defects described above.
To achieve this object, the technical solution provided by the invention is as follows:
An automatic image semantic description method, comprising building and training an automatic image semantic description model based on a CNN and a GRU, specifically:
Step 1) defining the objective function;
Step 2) performing the translation process from image to semantic description;
Step 3) back-propagating the error.
Further, the objective function in step 1) is
θ* = argmax_θ Σ_(I,S) log p(S | I; θ)
where θ denotes all parameters of the model, I denotes an image, and S = (S_0, ..., S_N) denotes the finally predicted word combination, i.e. the final semantic description.
Further, step 2) is given by the following formulas:
x_(-1) = CNN(I);
x_t = W_e s_t, t ∈ {0, ..., N-1};
h_t = GRU(x_t), t ∈ {0, ..., N-1};
p_(t+1) = g(W_p h_t);
where I denotes an image and S = (s_0, s_1, s_2, ..., s_n) denotes the complete semantic description of the image, composed of n words. Each s_t uses one-hot encoding; s_0 is the special word "start", marking the beginning of the sentence, and s_n is the special word "end", marking its end.
Further, step 3) comprises:
defining the loss function L(I, S) = -Σ_t log p_t(s_t), i.e. the negative of the sum over all time steps of the log-probability of the correct predicted word, the cross-entropy loss;
continually updating the parameters of the model through training so that the loss value becomes as small as possible;
updating the parameters with the stochastic gradient descent method and the chain rule of differentiation.
Further, the parameters include the internal parameters of the GRU model, the word-embedding parameters, the image-feature encoding parameters, and the output decoding parameters.
Further, in the training process of the model, the weight parameters of the GRU network at every time step are shared, and the output of the GRU network at the previous time step forms part of the input of the GRU network at the current time step.
Further, the CNN contains two kinds of hidden layer: convolutional layers and pooling layers.
Further, the neurons of one CNN layer are not fully connected to the neurons of the previous layer, i.e. the connections between neurons are local; furthermore, connected neurons share identical weights, i.e. the connections of the neurons are weight-shared.
Further, the GRU structure contains a reset gate, expressed as r_t = σ(W_r x_t + U_r h_(t-1)), where σ is the activation function, x_t is the input information at time t, h_(t-1) is the output information of the hidden layer at time t-1, W_r is the reset-gate input-layer weight, and U_r is the reset-gate hidden-layer weight.
Further, the GRU structure contains an update gate, which can be expressed by the formula z_t = σ(W_z x_t + U_z h_(t-1)), where σ is the activation function, x_t is the input information at time t, h_(t-1) is the output information of the hidden layer at time t-1, W_z is the update-gate input-layer weight, and U_z is the update-gate hidden-layer weight.
In the automatic image semantic description method provided by the invention, a fully connected layer feature extracted by the CNN serves as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. The method achieves high precision and accuracy, reaches a high semantic description precision with relatively few parameters, and can well meet the needs of practical applications.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of building and training the automatic image semantic description model based on a CNN and a GRU according to the present invention;
Fig. 2 is a schematic diagram of the structure of the automatic image semantic description model based on a CNN and a GRU;
Fig. 3 is a schematic diagram of the basic structure of a traditional neural network model;
Fig. 4 is a schematic diagram of the conventional structure of an RNN;
Fig. 5 is a schematic diagram of the structure of a GRU model.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present invention clearer, the invention is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the invention, not to limit it. All other embodiments obtained from the embodiments of this invention by a person of ordinary skill in the art without creative effort fall within the protection scope of the invention.
An automatic image semantic description method comprises the steps of building and training an automatic image semantic description model based on a CNN and a GRU. Referring to Fig. 1, building and training the model specifically comprises the following steps.
The objective function is defined as:
θ* = argmax_θ Σ_(I,S) log p(S | I; θ) (1);
where θ denotes all parameters of the model, I denotes an image, and S = (S_0, ..., S_N) denotes the finally predicted word combination, i.e. the final semantic description.
The process of translation from image to semantic description is carried out as shown by the following formulas:
x_(-1) = CNN(I) (2);
x_t = W_e s_t, t ∈ {0, ..., N-1} (3);
h_t = GRU(x_t), t ∈ {0, ..., N-1} (4);
p_(t+1) = g(W_p h_t) (5);
where I denotes an image and S = (s_0, s_1, s_2, ..., s_n) denotes the complete semantic description of the image, composed of n words. Each s_t uses one-hot encoding (one-hot: an N-dimensional vector in which a single position is 1 and all other positions are 0, N being the number of words in the vocabulary). s_0 is the special word "start", marking the beginning of the sentence, and s_n is the special word "end", marking its end. The image feature is fed into the GRU network only at time t = -1; at every later time step the words of the semantic description S of the input image are fed into the GRU network in order. To keep the dimensionality of the information fed into the GRU network consistent across time steps, the image feature input at time t = -1 must be encoded by the image encoding parameter W_L, and each word s_t input at the later time steps must be encoded by the word weight parameter W_e. From time t = 0 onward, the output h_t of the GRU model at each time step is decoded by the output decoding parameter W_p and then passed through a softmax classifier to obtain a prediction result p_(t+1) (as shown in formula (5)); that is, at each time step the model predicts, in order, the word of the semantic description that follows the word currently input to the GRU model. A gap exists between this prediction result and the correct word following the current input word, so the error must be back-propagated during the training process.
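As an illustration of the one-hot word encoding described above, the following Python sketch encodes a sentence framed by the special "start" and "end" words. The vocabulary and sentence are hypothetical examples for illustration only, not data from the patent:

```python
def one_hot(index, size):
    """An N-dimensional vector with a 1 at `index` and 0 elsewhere."""
    vec = [0] * size
    vec[index] = 1
    return vec

# Hypothetical five-word vocabulary including the special "start"/"end" words.
vocab = ["start", "a", "dog", "runs", "end"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

# A complete description S = (s0, ..., sn) framed by "start" and "end".
sentence = ["start", "a", "dog", "runs", "end"]
encoded = [one_hot(word_to_idx[w], len(vocab)) for w in sentence]
print(encoded[2])  # "dog" -> [0, 0, 1, 0, 0]
```

In the model, each such vector s_t would then be projected by the word weight parameter W_e before entering the GRU.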
The loss function is defined as follows:
L(I, S) = -Σ_t log p_t(s_t) (6);
The training process of the model is the process of back-propagating the error and updating the model parameters. The loss function shown in formula (6) is the negative of the sum over all time steps of the log-probability of the correct word, i.e. the cross-entropy loss. Training continually updates the parameters of the model so that the loss value becomes as small as possible. These parameters include the internal parameters of the GRU model, the word-embedding parameters, the image-feature encoding parameters, the output decoding parameters, and so on. The method applied to update these parameters is the stochastic gradient descent method (SGD) together with the chain rule of differentiation. In the training process of the model, the weight parameters of the GRU network at every time step are shared, and the output of the GRU network at the previous time step serves as part of the input of the GRU network at the current time step (see formulas (1)-(4)).
A convolutional neural network (CNN) contains two distinctive kinds of hidden layer: convolutional layers and pooling layers. With the recent development of computer hardware (CPUs, GPUs), the computing performance of computers has improved greatly, and complex neural network models such as CNNs, which rely on this powerful computing performance, have gradually become a research hotspot in many fields. CNNs have strong feature extraction ability and are now widely applied in fields such as image, video and speech processing.
A CNN has a distinctive network structure, mainly in two respects. First, the neurons of one layer are not fully connected to the neurons of the previous layer, i.e. the connections between neurons are local. Second, connected neurons share identical weights, i.e. the connections of the neurons are weight-shared. This distinctive locally connected, weight-shared network structure is close to that of a biological neural network; such a model effectively reduces the number of parameters in the network and its complexity. A convolutional layer in a CNN is composed of a variety of convolution kernels; each kernel is a filter of size M*M that extracts a particular local feature at every position of the receptive field in the previous layer. A pooling layer reduces the dimensionality of the convolutional features of the previous layer; concretely, those features are divided into multiple regions of size N*N, and the average (or maximum) value of each region is extracted as the reduced feature. After a series of convolutional layers, pooling layers and fully connected layers, a CNN usually ends with a softmax classifier for handling multi-class problems.
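The pooling operation described above can be sketched in a few lines of Python. The 4x4 feature map below is a made-up example; the function splits it into non-overlapping N*N regions and keeps the maximum (or average) of each region, exactly as a pooling layer reduces dimensionality:

```python
def pool(feature_map, n, mode="max"):
    """Downsample a 2-D feature map by splitting it into non-overlapping
    n x n regions and keeping the max (or mean) of each region."""
    reducer = max if mode == "max" else (lambda r: sum(r) / len(r))
    rows, cols = len(feature_map), len(feature_map[0])
    out = []
    for i in range(0, rows, n):
        out_row = []
        for j in range(0, cols, n):
            region = [feature_map[i + di][j + dj]
                      for di in range(n) for dj in range(n)]
            out_row.append(reducer(region))
        out.append(out_row)
    return out

# Hypothetical 4x4 convolutional feature map, pooled with 2x2 regions.
fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
print(pool(fmap, 2, "max"))   # [[4, 2], [2, 8]]
print(pool(fmap, 2, "mean"))  # [[2.5, 1.0], [1.25, 6.5]]
```

Each 2x2 region collapses to a single value, so the map shrinks from 4x4 to 2x2, which is the dimensionality reduction the pooling layer provides.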
A recurrent neural network (Recurrent Neural Network, hereinafter RNN) is a kind of neural network model which, thanks to its distinctive memory structure, is widely applied in machine translation models. A neural network model has a three-layer structure: input layer, hidden layer and output layer. In a traditional neural network model, such as the convolutional neural network model described above, the layers are connected from the input layer through the hidden layer to the output layer, but the nodes within each layer are unconnected; the concrete structure is shown in Fig. 3. Such a traditional neural network model contains no memory function and is helpless with problems that require information already produced in the computation. For example, to predict the next word of a sentence, the preceding words are needed in most cases: in the sentence "I am a basketball player, I like to play basketball", the "play basketball" of the second clause must be inferred from the "basketball player" of the first. An RNN model can remember the information produced at earlier time steps and apply it to the computation at the current time step. This benefit comes from the structural change of the RNN relative to the traditional neural network model: the input of the RNN hidden layer contains not only the output of the input layer at the current time step but also the output information of the hidden layer at the previous time step, i.e. the nodes inside the hidden layer are connected; the concrete structure is shown in Fig. 4.
The GRU is an improvement of the RNN model. Like the LSTM model, it can solve the long-term dependency problem of RNN models, but it contains fewer parameters inside the model than the LSTM, is less prone to overfitting in the training process, and trains faster. The concrete structure of the GRU is shown in Fig. 5.
Referring to Fig. 5, there is a gate r, called the reset gate, which can be expressed by the formula:
r_t = σ(W_r x_t + U_r h_(t-1)) (3.1);
where σ is the activation function, x_t is the input information at time t, h_(t-1) is the output information of the hidden layer at time t-1, W_r is the reset-gate input-layer weight, and U_r is the reset-gate hidden-layer weight.
There is another gate z in Fig. 5, called the update gate, which can be expressed by the formula:
z_t = σ(W_z x_t + U_z h_(t-1)) (3.2);
where σ is the activation function, x_t is the input information at time t, h_(t-1) is the output information of the hidden layer at time t-1, W_z is the update-gate input-layer weight, and U_z is the update-gate hidden-layer weight.
As can be seen from Fig. 5, the output information of the hidden layer at time t can be expressed as:
h_t = z_t ⊙ h_(t-1) + (1 - z_t) ⊙ h̃_t (3.3);
where z_t is the update-gate state information at time t, h_(t-1) is the output information of the hidden layer at time t-1, and h̃_t is the candidate hidden-layer state at time t, given by formula (3.4):
h̃_t = φ(W x_t + U (r_t ⊙ h_(t-1))) (3.4);
where φ is the activation function, ⊙ is the element-wise (Hadamard) product, W is the input-layer weight, U is the hidden-layer weight, r_t is the reset-gate state information at time t, and h_(t-1) is the output information of the hidden layer at time t-1.
From formula (3.4) it can be seen that when the reset gate r approaches 0, the term r_t ⊙ h_(t-1) approaches 0, i.e. the candidate hidden state at time t contains only the input-layer information of the current time step, and the hidden-layer output of the previous time step is ignored. This setting allows the hidden layer to discard information irrelevant to later time steps, so that the information retained is more meaningful. The update gate, on the other hand, controls the choice between the hidden-layer information of the previous time step and the candidate hidden-state information of the current time step. From formula (3.3) it can be seen that when z_t takes the value 1, the hidden layer at time t outputs only the hidden-layer information of the previous time step, and when z_t takes the value 0, it outputs only the candidate hidden state of time t. The reset gate and the update gate at each time step are mutually independent; therefore, when recent information is needed the reset gate is in an active state, and when long-range information is needed the update gate is in an active state.
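The reset-gate, update-gate and hidden-state equations described above can be sketched as a single GRU time step in NumPy. The input and hidden dimensions and the random weights below are hypothetical illustration values; the sketch omits bias terms, but the gating logic follows the formulas in the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x_t, h_prev, Wr, Ur, Wz, Uz, W, U):
    """One GRU time step: reset gate, update gate, candidate state, output."""
    r_t = sigmoid(Wr @ x_t + Ur @ h_prev)           # reset gate r_t
    z_t = sigmoid(Wz @ x_t + Uz @ h_prev)           # update gate z_t
    h_cand = np.tanh(W @ x_t + U @ (r_t * h_prev))  # candidate hidden state
    return z_t * h_prev + (1.0 - z_t) * h_cand      # new hidden state h_t

# Hypothetical sizes: input dimension 3, hidden dimension 4.
rng = np.random.default_rng(0)
din, dh = 3, 4
Wr, Wz, W = (rng.standard_normal((dh, din)) for _ in range(3))
Ur, Uz, U = (rng.standard_normal((dh, dh)) for _ in range(3))

# Run five time steps with the same (shared) weights, starting from h = 0.
h = np.zeros(dh)
for x_t in rng.standard_normal((5, din)):
    h = gru_cell(x_t, h, Wr, Ur, Wz, Uz, W, U)
print(h.shape)  # (4,)
```

Note that when z_t is driven to 1 the new hidden state reduces to h_prev (long-range information is kept), and when z_t is 0 it reduces to the candidate state, matching the behaviour of the update gate described above.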
In the automatic image semantic description method provided by the invention, a fully connected layer feature extracted by the CNN serves as the input of the GRU model, effectively fusing the low-level features of the image with the high-level semantic information of the image description. The method achieves high precision and accuracy, reaches a high semantic description precision with relatively few parameters, and can well meet the needs of practical applications.
The embodiments described above merely express embodiments of the present invention; although their description is specific and detailed, it must not be construed as limiting the scope of the patent. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and these all fall within its protection scope. The protection scope of this patent is therefore determined by the appended claims.
Claims (10)
1. An automatic image semantic description method, characterized by comprising building and training an automatic image semantic description model based on a CNN and a GRU, specifically:
Step 1) defining the objective function;
Step 2) performing the translation process from image to semantic description;
Step 3) back-propagating the error.
2. The automatic image semantic description method according to claim 1, characterized in that the objective function in step 1) is θ* = argmax_θ Σ_(I,S) log p(S | I; θ), where θ denotes all parameters of the model, I denotes an image, and S denotes the finally predicted word combination.
3. The automatic image semantic description method according to any one of claims 1-2, characterized in that step 2) is given by the following formulas:
x_(-1) = CNN(I);
x_t = W_e s_t, t ∈ {0, ..., N-1};
h_t = GRU(x_t), t ∈ {0, ..., N-1};
p_(t+1) = g(W_p h_t);
where I denotes an image and S = (s_0, s_1, s_2, ..., s_n) denotes the complete semantic description of the image, composed of n words; each s_t uses one-hot encoding; s_0 is the special word "start", marking the beginning of the sentence, and s_n is the special word "end", marking its end.
4. The automatic image semantic description method according to claim 1, characterized in that step 3) comprises:
defining the loss function L(I, S) = -Σ_t log p_t(s_t), i.e. the negative of the sum over all time steps of the log-probability of the correct predicted word, the cross-entropy loss;
continually updating the parameters of the model through training so that the loss value becomes as small as possible;
updating the parameters with the stochastic gradient descent method and the chain rule of differentiation.
5. The automatic image semantic description method according to claim 1, characterized in that the parameters include the internal parameters of the GRU model, the word-embedding parameters, the image-feature encoding parameters, and the output decoding parameters.
6. The automatic image semantic description method according to any one of claims 1-5, characterized in that, in the training process of the model, the weight parameters of the GRU network at every time step are shared, and the output of the GRU network at the previous time step serves as part of the input of the GRU network at the current time step.
7. The automatic image semantic description method according to any one of claims 1-6, characterized in that the CNN contains two kinds of hidden layer: convolutional layers and pooling layers.
8. The automatic image semantic description method according to any one of claims 1-7, characterized in that the neurons of one CNN layer are not fully connected to the neurons of the previous layer, i.e. the connections between neurons are local; furthermore, connected neurons share identical weights, i.e. the connections of the neurons are weight-shared.
9. The automatic image semantic description method according to any one of claims 1-8, characterized in that the GRU structure contains a reset gate, expressed as r_t = σ(W_r x_t + U_r h_(t-1)), where σ is the activation function, x_t is the input information at time t, h_(t-1) is the output information of the hidden layer at time t-1, W_r is the reset-gate input-layer weight, and U_r is the reset-gate hidden-layer weight.
10. The automatic image semantic description method according to any one of claims 1-9, characterized in that the GRU structure contains an update gate, which can be expressed by the formula z_t = σ(W_z x_t + U_z h_(t-1)).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710969647.9A CN107807971A (en) | 2017-10-18 | 2017-10-18 | Automatic image semantic description method
Publications (1)
Publication Number | Publication Date |
---|---|
CN107807971A true CN107807971A (en) | 2018-03-16 |
Family
ID=61585067
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710969647.9A Pending CN107807971A (en) | 2017-10-18 | 2017-10-18 | Automatic image semantic description method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107807971A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416059A (en) * | 2018-03-22 | 2018-08-17 | 北京市商汤科技开发有限公司 | Training method and device, equipment, medium, the program of image description model |
CN108710902A (en) * | 2018-05-08 | 2018-10-26 | 江苏云立物联科技有限公司 | A kind of sorting technique towards high-resolution remote sensing image based on artificial intelligence |
CN108764299A (en) * | 2018-05-04 | 2018-11-06 | 北京物灵智能科技有限公司 | Story model training and generation method, system, robot and storage device |
CN108830287A (en) * | 2018-04-18 | 2018-11-16 | 哈尔滨理工大学 | The Chinese image, semantic of Inception network integration multilayer GRU based on residual error connection describes method |
CN109271926A (en) * | 2018-09-14 | 2019-01-25 | 西安电子科技大学 | Intelligent Radiation source discrimination based on GRU depth convolutional network |
CN109447244A (en) * | 2018-10-11 | 2019-03-08 | 中山大学 | A kind of advertisement recommended method of combination gating cycle unit neural network |
CN109523993A (en) * | 2018-11-02 | 2019-03-26 | 成都三零凯天通信实业有限公司 | A kind of voice languages classification method merging deep neural network with GRU based on CNN |
CN109558838A (en) * | 2018-11-29 | 2019-04-02 | 北京经纬恒润科技有限公司 | A kind of object identification method and system |
CN109710787A (en) * | 2018-12-30 | 2019-05-03 | 陕西师范大学 | Image Description Methods based on deep learning |
CN110232413A (en) * | 2019-05-31 | 2019-09-13 | 华北电力大学(保定) | Insulator image, semantic based on GRU network describes method, system, device |
CN110889430A (en) * | 2019-10-24 | 2020-03-17 | 中国科学院计算技术研究所 | News image detection method, system and device based on multi-domain visual features |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106901723A (en) * | 2017-04-20 | 2017-06-30 | 济南浪潮高新科技投资发展有限公司 | A kind of electrocardiographic abnormality automatic diagnosis method |
US20170255832A1 (en) * | 2016-03-02 | 2017-09-07 | Mitsubishi Electric Research Laboratories, Inc. | Method and System for Detecting Actions in Videos |
CN107229967A (en) * | 2016-08-22 | 2017-10-03 | 北京深鉴智能科技有限公司 | A kind of hardware accelerator and method that rarefaction GRU neutral nets are realized based on FPGA |
Non-Patent Citations (1)
Title |
---|
ORIOL VINYALS et al.: "Show and Tell: A Neural Image Caption Generator", 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108416059A (en) * | 2018-03-22 | 2018-08-17 | 北京市商汤科技开发有限公司 | Training method and device, equipment, medium, the program of image description model |
CN108830287A (en) * | 2018-04-18 | 2018-11-16 | 哈尔滨理工大学 | The Chinese image, semantic of Inception network integration multilayer GRU based on residual error connection describes method |
CN108764299B (en) * | 2018-05-04 | 2020-10-23 | 北京物灵智能科技有限公司 | Story model training and generating method and system, robot and storage device |
CN108764299A (en) * | 2018-05-04 | 2018-11-06 | 北京物灵智能科技有限公司 | Story model training and generation method, system, robot and storage device |
CN108710902A (en) * | 2018-05-08 | 2018-10-26 | 江苏云立物联科技有限公司 | A kind of sorting technique towards high-resolution remote sensing image based on artificial intelligence |
CN109271926A (en) * | 2018-09-14 | 2019-01-25 | 西安电子科技大学 | Intelligent radiation source identification method based on GRU deep convolutional network |
CN109447244A (en) * | 2018-10-11 | 2019-03-08 | 中山大学 | An advertisement recommendation method combining a gated recurrent unit neural network |
CN109523993A (en) * | 2018-11-02 | 2019-03-26 | 成都三零凯天通信实业有限公司 | A voice language classification method based on a CNN and GRU fused deep neural network |
CN109523993B (en) * | 2018-11-02 | 2022-02-08 | 深圳市网联安瑞网络科技有限公司 | Voice language classification method based on CNN and GRU fusion deep neural network |
CN109558838A (en) * | 2018-11-29 | 2019-04-02 | 北京经纬恒润科技有限公司 | An object recognition method and system |
CN109710787A (en) * | 2018-12-30 | 2019-05-03 | 陕西师范大学 | Image description method based on deep learning |
CN109710787B (en) * | 2018-12-30 | 2023-03-28 | 陕西师范大学 | Image description method based on deep learning |
CN110232413A (en) * | 2019-05-31 | 2019-09-13 | 华北电力大学(保定) | Insulator image semantic description method, system and device based on GRU network |
CN110889430A (en) * | 2019-10-24 | 2020-03-17 | 中国科学院计算技术研究所 | News image detection method, system and device based on multi-domain visual features |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107807971A (en) | An automatic image semantic description method | |
CN111985245B (en) | Relationship extraction method and system based on attention cycle gating graph convolution network | |
CN110609891B (en) | Visual dialog generation method based on context awareness graph neural network | |
Cheng et al. | Facial expression recognition method based on improved VGG convolutional neural network | |
CN108875807B (en) | Image description method based on multiple attention and multiple scales | |
CN110288665B (en) | Image description method based on convolutional neural network, computer-readable storage medium and electronic device | |
CN108830287A (en) | Chinese image semantic description method based on a residual-connected Inception network fused with multilayer GRU | |
CN110390397B (en) | Textual entailment recognition method and device | |
CN110134946B (en) | Machine reading comprehension method for complex data | |
CN112990296B (en) | Image-text matching model compression and acceleration method and system based on orthogonal similarity distillation | |
CN112163425B (en) | Text entity relation extraction method based on multi-feature information enhancement | |
CN110222163A (en) | An intelligent question-answering method and system fusing CNN and bidirectional LSTM | |
CN111008293A (en) | Visual question-answering method based on structured semantic representation | |
CN108197294A (en) | An automatic text generation method based on deep learning | |
CN110826338B (en) | Fine-grained semantic similarity recognition method for single-selection gate and inter-class measurement | |
CN111177376A (en) | Chinese text classification method based on BERT and CNN hierarchical connection | |
CN107766320A (en) | A Chinese pronoun resolution model building method and device | |
CN109597876A (en) | A multi-turn dialogue answer selection model and method based on reinforcement learning | |
CN109710744A (en) | A data matching method, apparatus, device and storage medium | |
CN112036276A (en) | An artificial-intelligence video question-answering method | |
CN111368142A (en) | Dense video event description method based on generative adversarial networks | |
CN113641819A (en) | Multi-task sparse sharing learning-based argument mining system and method | |
CN111428481A (en) | Entity relation extraction method based on deep learning | |
CN108805260A (en) | An image caption generation method and device | |
CN115511069A (en) | Neural network training method, data processing method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2018-03-16 |