CN110110847A - Target positioning method for deep accelerated reinforcement learning based on attention - Google Patents
Target positioning method for deep accelerated reinforcement learning based on attention
- Publication number
- CN110110847A CN110110847A CN201910362771.8A CN201910362771A CN110110847A CN 110110847 A CN110110847 A CN 110110847A CN 201910362771 A CN201910362771 A CN 201910362771A CN 110110847 A CN110110847 A CN 110110847A
- Authority
- CN
- China
- Prior art keywords
- attention
- network
- learning
- region
- reinforcement learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a target positioning method for deep accelerated reinforcement learning based on attention, comprising the following steps. Step 1: an image is input into the model; the model is divided into two sub-networks, a deep reinforcement learning network and an attention network. Step 2: the model processes the image in four stages. The first stage is the training stage of deep reinforcement learning; under the reinforcement learning framework, the target localization task is mapped onto three elements. The method adds an attention network under the original deep reinforcement learning framework and trains it with the data generated during reinforcement learning training, thereby obtaining attention vectors. This converts the black-box learning problem of the deep reinforcement learning network (DQN) into a white-box problem of attention vectors, while the attention mechanism also optimizes the DQN's control over the localization procedure.
Description
Technical field
The present invention relates to the technical field of target localization tasks, and specifically to a target positioning method for deep accelerated reinforcement learning based on attention.
Background art
Target localization tasks are generally decomposed into two sub-problems, localization and classification. Current mainstream models work in a supervised learning setting; with the application of deep learning networks, the description of target features has achieved important breakthroughs in performance, but determining the target's position is still treated as a regression problem. Deep reinforcement learning instead treats locating the target as a behavior control problem: the observed region is manipulated until it overlaps with the target area, thereby determining the target position. Compared with other methods that follow fixed rules for position localization, target localization methods based on deep reinforcement learning are more flexible and efficient, and their human-like principle is more interpretable. When the sample distribution is complex, target localization models based on deep reinforcement learning also generalize better.
However, deep reinforcement learning itself has defects in the stability of target localization, and the required training time is long. It is therefore highly necessary to design a target positioning method for deep accelerated reinforcement learning based on attention.
Summary of the invention
The purpose of the present invention is to provide a target positioning method for deep accelerated reinforcement learning based on attention, so as to solve the problems raised in the background above.
In order to solve the above technical problem, the present invention provides the following technical solution: a target positioning method for deep accelerated reinforcement learning based on attention, comprising the following steps:
Step 1: an image is input into the model; the model is divided into two sub-networks, a deep reinforcement learning network and an attention network.
Step 2: the model processes the image in four stages:
1) The first stage is the training stage of deep reinforcement learning. Under the reinforcement learning framework, the target localization task is mapped onto three elements, namely state (State), action (Action) and reward (Reward); what deep reinforcement learning trains is the parameter π of the behavior control policy.
The state State is a vector o generated by encoding the observed region with deep convolutional neural networks (CNNs);
the action Action includes horizontal movement, vertical movement, scale change, aspect-ratio change, and position determination;
the reward Reward measures the relative relationship between the observed region b and the actual target area g:
IoU(b, g) = area(b ∩ g) / area(b ∪ g),
and the reward is expressed as Ra(s, s1) = sign(IoU(b1, g) - IoU(b, g)).
2) In the second stage, observed regions are back-propagated into the attention network to train the parameters of the attention vector layer.
3) In the third stage, the attention network, trained on samples whose Reward meets a threshold, crops the region of interest from the test image.
4) In the fourth stage, the region of interest is passed to the deep reinforcement learning network to quickly lock onto the target region and improve efficiency.
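The IoU-based reward above can be sketched directly in code. Representing a region by its corner coordinates (x1, y1, x2, y2) is an assumption made here for illustration, since the patent does not fix a box representation:

```python
def iou(b, g):
    """IoU(b, g) = area(b ∩ g) / area(b ∪ g) for boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(b[2], g[2]) - max(b[0], g[0]))  # intersection width
    ih = max(0.0, min(b[3], g[3]) - max(b[1], g[1]))  # intersection height
    inter = iw * ih
    union = ((b[2] - b[0]) * (b[3] - b[1])
             + (g[2] - g[0]) * (g[3] - g[1]) - inter)
    return inter / union

def reward(b, b1, g):
    """Ra(s, s1) = sign(IoU(b1, g) - IoU(b, g)): +1 if moving the observed
    region from b to b1 increased overlap with the target g, -1 if it
    decreased it, 0 if unchanged."""
    d = iou(b1, g) - iou(b, g)
    return (d > 0) - (d < 0)
```

Because only the sign of the IoU change is kept, every control step yields a bounded reward signal, which is what lets the agent learn a search policy instead of regressing coordinates.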
According to the above technical solution, in Step 1, the deep reinforcement learning network DQN refers to using deep convolutional neural networks, under the reinforcement learning framework, to encode and reduce the dimensionality of the high-dimensional image data and extract image features.
According to the above technical solution, in Step 2 1), in the target localization task the State represents the image features of the observed region, the Action represents the various control actions that deform the observed region, and the Reward represents the correlation between the observed region and the actual target position.
According to the above technical solution, in Step 2 1), the control policy π that governs the search behavior is a neural network with two fully connected layers.
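A minimal sketch of such a policy head with two fully connected layers. The 512-dimensional state vector o, the 128-unit ReLU hidden layer, and the random (untrained) weights are all illustrative assumptions, not values fixed by the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_policy(state_dim, hidden_dim, n_actions):
    """Two fully connected layers mapping the CNN state vector o to action scores."""
    return {
        "W1": rng.normal(0.0, 0.1, (state_dim, hidden_dim)), "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, (hidden_dim, n_actions)), "b2": np.zeros(n_actions),
    }

def policy_forward(params, o):
    h = np.maximum(0.0, o @ params["W1"] + params["b1"])  # ReLU hidden layer
    return h @ params["W2"] + params["b2"]                # one score per action

# five actions as listed: horizontal move, vertical move, scale change,
# aspect-ratio change, and the terminal "position determined" action
params = make_policy(state_dim=512, hidden_dim=128, n_actions=5)
scores = policy_forward(params, rng.normal(size=512))
action = int(np.argmax(scores))  # greedy choice over the five actions
```

Keeping the controller this small is consistent with the division of labor in the method: the CNN encoder does the heavy feature extraction, and the policy head only maps the resulting vector o to a choice among a handful of deformation actions.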
According to the above technical solution, in Step 2 1), the first stage uses a reinforcement mechanism.
According to the above technical solution, in Step 2 2):
1) The attention network first uses deep convolutional neural network techniques to convert the image into a feature map of size H × W × C;
2) a channel descriptor p is then used to encode the spatial information in the feature map;
3) next, these descriptors are used to construct the channel weights of the attention network, a_i = σ(W2 f(W1 p));
4) the attention weights of the different channels are then assembled into an attention map M_i, and [tx; ty; ts] = fCNet(M_i), where fCNet(·) is a cropping function that cuts the high-attention region of the attention map out of the input image. To keep the operation end-to-end, it is implemented as a two-dimensional mask V(x, y) = Vx · Vy, with Vx = f(x - tx + 0.5ts) - f(x - tx - 0.5ts) and Vy = f(y - ty + 0.5ts) - f(y - ty - 0.5ts), where f(x) = 1/(1 + exp(-kx)); the region of interest is expressed as x ⊙ V_i, where x is the input image and i indexes the local region.
According to the above technical solution, b_c denotes the feature of the c-th channel, where C is the number of channels and c indexes a channel; f(·) is the activation function; a_i is the weight of a certain part of the associated channel; tx and ty are the horizontal and vertical coordinates of the centre of the region of interest, and ts is its side length.
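The channel-attention and soft-crop machinery above can be sketched as follows. The patent text does not reproduce the formula for the descriptor p, so global average pooling is assumed here as one common choice; the layer shapes and the logistic slope k are likewise illustrative:

```python
import numpy as np

def sigmoid(x, k=1.0):
    return 1.0 / (1.0 + np.exp(-np.clip(k * x, -60.0, 60.0)))

def channel_descriptor(feat):
    """One descriptor per channel of an H x W x C feature map.
    Global average pooling is an assumed choice; the patent elides the formula."""
    return feat.mean(axis=(0, 1))  # shape (C,)

def channel_weights(p, W1, W2):
    """a_i = sigma(W2 f(W1 p)), with the activation f taken as ReLU."""
    return sigmoid(W2 @ np.maximum(0.0, W1 @ p))

def soft_mask(size, tx, ty, ts, k=10.0):
    """V(x, y) = Vx * Vy: a differentiable box of side ts centred at (tx, ty)."""
    xs = np.arange(size, dtype=float)
    vx = sigmoid(xs - tx + 0.5 * ts, k) - sigmoid(xs - tx - 0.5 * ts, k)
    vy = sigmoid(xs - ty + 0.5 * ts, k) - sigmoid(xs - ty - 0.5 * ts, k)
    return np.outer(vy, vx)  # near 1 inside the box, near 0 outside

feat = np.ones((8, 8, 4))                                   # toy feature map
p = channel_descriptor(feat)
a = channel_weights(p, np.zeros((4, 4)), np.zeros((4, 4)))  # untrained weights

img = np.ones((32, 32))
V = soft_mask(32, tx=16.0, ty=16.0, ts=10.0)
roi = img * V  # x ⊙ V: the soft-cropped region of interest
```

Because V is built from sigmoids rather than a hard crop, gradients flow through tx, ty and ts, which is what makes the cropping step end-to-end trainable.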
Compared with the prior art, the beneficial effect obtained by the present invention is as follows: this target positioning method for deep accelerated reinforcement learning based on attention adds an attention network under the original deep reinforcement learning framework. The method trains the attention network with data generated during reinforcement learning training, thereby obtaining attention vectors; the black-box learning problem of the deep reinforcement learning network DQN is converted into a white-box problem of attention vectors, while the attention mechanism optimizes the DQN's control over the localization procedure.
Brief description of the drawings
The accompanying drawings are provided to give a further understanding of the present invention and constitute a part of the specification; together with the embodiments of the present invention they serve to explain the present invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is a schematic diagram of the overall flow of the present invention;
In the figure: 1, fully connected layer; 2, pooling layer; 3, attention vector layer.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the drawings. Evidently, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.
Referring to Fig. 1, the present invention provides a technical solution: a target positioning method for deep accelerated reinforcement learning based on attention, comprising the following steps:
Step 1: an image is input into the model; the model is divided into two sub-networks, a deep reinforcement learning network and an attention network.
Step 2: the model processes the image in four stages:
1) The first stage is the training stage of deep reinforcement learning. Under the reinforcement learning framework, the target localization task is mapped onto three elements, namely state (State), action (Action) and reward (Reward); what deep reinforcement learning trains is the parameter π of the behavior control policy.
The state State is a vector o generated by encoding the observed region with deep convolutional neural networks (CNNs);
the action Action includes horizontal movement, vertical movement, scale change, aspect-ratio change, and position determination;
the reward Reward measures the relative relationship between the observed region b and the actual target area g:
IoU(b, g) = area(b ∩ g) / area(b ∪ g),
and the reward is expressed as Ra(s, s1) = sign(IoU(b1, g) - IoU(b, g)).
2) In the second stage, observed regions are back-propagated into the attention network to train the parameters of the attention vector layer.
3) In the third stage, the attention network, trained on samples whose Reward meets a threshold, crops the region of interest from the test image.
4) In the fourth stage, the region of interest is passed to the deep reinforcement learning network to quickly lock onto the target region and improve efficiency.
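The four stages can be summarized in a hypothetical skeleton. Here `env`, `dqn`, `attention_net`, their methods and the reward threshold are placeholders for illustration, not interfaces defined by the patent:

```python
def locate(env, dqn, attention_net, reward_threshold, test_image):
    # Stage 1: train the DQN controller on (state, action, reward) transitions.
    for _ in range(dqn.n_episodes):
        transitions = env.rollout(dqn)
        dqn.update(transitions)
        # Stage 2: feed observed regions whose reward meets the threshold back
        # into the attention network to train the attention-vector layer.
        good = [t for t in transitions if t.reward >= reward_threshold]
        attention_net.update(good)
    # Stage 3: the trained attention network crops a region of interest
    # from the test image.
    roi = attention_net.crop(test_image)
    # Stage 4: the DQN searches inside the ROI only, locking onto the
    # target region faster than searching the whole image.
    return dqn.localize(roi)
```

The key coupling is that the attention network is a free by-product of Stage 1: it learns from the same transitions the DQN generates, then repays the DQN at test time by shrinking its search space.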
According to the above technical solution, in Step 1, the deep reinforcement learning network DQN refers to using deep convolutional neural networks, under the reinforcement learning framework, to encode and reduce the dimensionality of the high-dimensional image data and extract image features.
According to the above technical solution, in Step 2 1), in the target localization task the State represents the image features of the observed region, the Action represents the various control actions that deform the observed region, and the Reward represents the correlation between the observed region and the actual target position.
According to the above technical solution, in Step 2 1), the control policy π that governs the search behavior is a neural network with two fully connected layers.
According to the above technical solution, in Step 2 1), the first stage uses a reinforcement mechanism.
According to the above technical solution, in Step 2 2):
1) The attention network first uses deep convolutional neural network techniques to convert the image into a feature map of size H × W × C;
2) a channel descriptor p is then used to encode the spatial information in the feature map;
3) next, these descriptors are used to construct the channel weights of the attention network, a_i = σ(W2 f(W1 p));
4) the attention weights of the different channels are then assembled into an attention map M_i, and [tx; ty; ts] = fCNet(M_i), where fCNet(·) is a cropping function that cuts the high-attention region of the attention map out of the input image. To keep the operation end-to-end, it is implemented as a two-dimensional mask V(x, y) = Vx · Vy, with Vx = f(x - tx + 0.5ts) - f(x - tx - 0.5ts) and Vy = f(y - ty + 0.5ts) - f(y - ty - 0.5ts), where f(x) = 1/(1 + exp(-kx)); the region of interest is expressed as x ⊙ V_i, where x is the input image and i indexes the local region.
According to the above technical solution, b_c denotes the feature of the c-th channel, where C is the number of channels and c indexes a channel; f(·) is the activation function; a_i is the weight of a certain part of the associated channel; tx and ty are the horizontal and vertical coordinates of the centre of the region of interest, and ts is its side length.
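The one-dimensional factor Vx above is a difference of two shifted logistic functions, i.e. a differentiable approximation of a box window of width ts centred at tx; this design choice is what keeps the crop trainable by back-propagation. A small numeric check (the slope k = 10 is an illustrative value, not fixed by the patent):

```python
import math

def f(x, k=10.0):
    """Logistic function f(x) = 1 / (1 + exp(-k x)), guarded against overflow."""
    z = k * x
    if z > 60.0:
        return 1.0
    if z < -60.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-z))

def vx(x, tx, ts, k=10.0):
    """Soft window along one axis: near 1 for |x - tx| < ts/2, near 0 outside."""
    return f(x - tx + 0.5 * ts, k) - f(x - tx - 0.5 * ts, k)

inside = vx(10.0, tx=10.0, ts=4.0)   # centre of the window -> close to 1
outside = vx(20.0, tx=10.0, ts=4.0)  # far from the window -> close to 0
edge = vx(8.0, tx=10.0, ts=4.0)      # on the window boundary -> about 0.5
```

Larger k sharpens the window edges toward a hard crop, while smaller k spreads gradient signal further outside the box.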
It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device.
Finally, it should be noted that the foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or replace some of the technical features with equivalents. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.
Claims (7)
1. A target positioning method for deep accelerated reinforcement learning based on attention, comprising the following steps, characterized in that:
Step 1: an image is input into the model; the model is divided into two sub-networks, a deep reinforcement learning network and an attention network;
Step 2: the model processes the image in four stages:
1) the first stage is the training stage of deep reinforcement learning; under the reinforcement learning framework, the target localization task is mapped onto three elements, namely state (State), action (Action) and reward (Reward), and what deep reinforcement learning trains is the parameter π of the behavior control policy;
the state State is a vector o generated by encoding the observed region with deep convolutional neural networks (CNNs);
the action Action includes horizontal movement, vertical movement, scale change, aspect-ratio change, and position determination;
the reward Reward measures the relative relationship between the observed region b and the actual target area g:
IoU(b, g) = area(b ∩ g) / area(b ∪ g),
and the reward is expressed as Ra(s, s1) = sign(IoU(b1, g) - IoU(b, g));
2) in the second stage, observed regions are back-propagated into the attention network to train the parameters of the attention vector layer;
3) in the third stage, the attention network, trained on samples whose Reward meets a threshold, crops the region of interest from the test image;
4) in the fourth stage, the region of interest is passed to the deep reinforcement learning network to quickly lock onto the target region and improve efficiency.
2. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 1, characterized in that: in Step 1, the deep reinforcement learning network DQN refers to using deep convolutional neural networks, under the reinforcement learning framework, to encode and reduce the dimensionality of the high-dimensional image data and extract image features.
3. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 1, characterized in that: in Step 2 1), in the target localization task the State represents the image features of the observed region, the Action represents the various control actions that deform the observed region, and the Reward represents the correlation between the observed region and the actual target position.
4. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 1, characterized in that: in Step 2 1), the control policy π that governs the search behavior is a neural network with two fully connected layers.
5. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 1, characterized in that: in Step 2 1), the first stage uses a reinforcement mechanism.
6. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 1, characterized in that, in Step 2 2):
1) the attention network first uses deep convolutional neural network techniques to convert the image into a feature map of size H × W × C;
2) a channel descriptor p is then used to encode the spatial information in the feature map;
3) next, these descriptors are used to construct the channel weights of the attention network, a_i = σ(W2 f(W1 p));
4) the attention weights of the different channels are then assembled into an attention map M_i(x), and [tx; ty; ts] = fCNet(M_i), where fCNet(·) is a cropping function that cuts the high-attention region of the attention map out of the input image; to keep the operation end-to-end, it is implemented as a two-dimensional mask V(x, y) = Vx · Vy, with Vx = f(x - tx + 0.5ts) - f(x - tx - 0.5ts) and Vy = f(y - ty + 0.5ts) - f(y - ty - 0.5ts), where f(x) = 1/(1 + exp(-kx)); the region of interest is expressed as x ⊙ V_i, where x is the input image and i indexes the local region.
7. The target positioning method for deep accelerated reinforcement learning based on attention according to claim 6, characterized in that: b_c denotes the feature of the c-th channel, where C is the number of channels and c indexes a channel; f(·) is the activation function; a_i is the weight of a certain part of the associated channel; tx and ty are the horizontal and vertical coordinates of the centre of the region of interest, and ts is its side length.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910362771.8A CN110110847B (en) | 2019-04-30 | 2019-04-30 | Target positioning method for deep accelerated reinforcement learning based on attention |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910362771.8A CN110110847B (en) | 2019-04-30 | 2019-04-30 | Target positioning method for deep accelerated reinforcement learning based on attention |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110110847A true CN110110847A (en) | 2019-08-09 |
CN110110847B CN110110847B (en) | 2020-02-07 |
Family
ID=67487894
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910362771.8A Expired - Fee Related CN110110847B (en) | 2019-04-30 | 2019-04-30 | Target positioning method for deep accelerated reinforcement learning based on attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110110847B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106373160A (en) * | 2016-08-31 | 2017-02-01 | 清华大学 | Active camera target localization method based on deep reinforcement learning
CN107403426A (en) * | 2017-06-20 | 2017-11-28 | 北京工业大学 | Target object detection method and device
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free deep reinforcement learning heuristic method and device
CN108304795A (en) * | 2018-01-29 | 2018-07-20 | 清华大学 | Human skeleton action recognition method and device based on deep reinforcement learning
WO2018184204A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems for budgeted and simplified training of deep neural networks |
US10241520B2 (en) * | 2016-12-22 | 2019-03-26 | TCL Research America Inc. | System and method for vision-based flight self-stabilization by deep gated recurrent Q-networks |
-
2019
- 2019-04-30 CN CN201910362771.8A patent/CN110110847B/en not_active Expired - Fee Related
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106373160A (en) * | 2016-08-31 | 2017-02-01 | 清华大学 | Active camera target localization method based on deep reinforcement learning
US10241520B2 (en) * | 2016-12-22 | 2019-03-26 | TCL Research America Inc. | System and method for vision-based flight self-stabilization by deep gated recurrent Q-networks |
WO2018184204A1 (en) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems for budgeted and simplified training of deep neural networks |
CN107403426A (en) * | 2017-06-20 | 2017-11-28 | 北京工业大学 | Target object detection method and device
CN107832836A (en) * | 2017-11-27 | 2018-03-23 | 清华大学 | Model-free deep reinforcement learning heuristic method and device
CN108304795A (en) * | 2018-01-29 | 2018-07-20 | 清华大学 | Human skeleton action recognition method and device based on deep reinforcement learning
Non-Patent Citations (1)
Title |
---|
YE HUANG et al.: "Parallel Search by Reinforcement Learning for Object Detection", PRCV 2018: Pattern Recognition and Computer Vision *
Also Published As
Publication number | Publication date |
---|---|
CN110110847B (en) | 2020-02-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Park et al. | Saliency map model with adaptive masking based on independent component analysis | |
CN111611749B (en) | Indoor crowd evacuation automatic guiding simulation method and system based on RNN | |
Xiang et al. | Task-oriented deep reinforcement learning for robotic skill acquisition and control | |
Edelstein-Keshet | Mathematical models of swarming and social aggregation | |
CN106227043A (en) | adaptive optimal control method | |
CN103679718A (en) | Fast scenario analysis method based on saliency | |
CN109635763A (en) | A kind of crowd density estimation method | |
Gralka et al. | Convection shapes the trade-off between antibiotic efficacy and the selection for resistance in spatial gradients | |
Kumar et al. | Role of Allee effect on prey–predator model with component Allee effect for predator reproduction | |
CN110110847A (en) | Target positioning method for deep accelerated reinforcement learning based on attention | |
CN108961270A (en) | A kind of Bridge Crack Image Segmentation Model based on semantic segmentation | |
CN112131693A (en) | Lur' e network clustering synchronization method based on pulse-controlled adaptive control | |
CN103383743B (en) | A kind of chrominance space transformation method | |
CN106021991A (en) | Method for stimulating intervention of tumor cell states based on Boolean network | |
CN115426149A (en) | Single intersection signal lamp control traffic state anti-disturbance generation method based on Jacobian saliency map | |
CN101162482A (en) | Gauss cooperated based on node and semi-particle filtering method | |
Morihiro et al. | Learning grouping and anti-predator behaviors for multi-agent systems | |
CN106355250A (en) | Optimization method and device for judging convert channels based on neural network | |
Calvo-Monge et al. | A nonlinear relapse model with disaggregated contact rates: Analysis of a forward-backward bifurcation | |
Spirov | The change of initial symmetry in the pattern-form interaction model of sea urchin gastrulation | |
Dunn | Hierarchical cellular automata methods | |
Dey et al. | Spatio-temporal dynamics in a diffusive Bazykin model: effects of group defense and prey-taxis | |
Althagafi | Mathematical models of population dynamics in discrete heterogeneous space | |
Kappen | Stimulus-dependent correlations in stochastic networks | |
Zavertanyy et al. | Genotype dynamic for agent neuroevolution in artificial life model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20200207 Termination date: 20210430 |