CN107577983A - Method for recognizing multi-label images by recurrently discovering regions of interest - Google Patents
- Publication number
- CN107577983A (application CN201710562354.9A)
- Authority
- CN
- China
- Prior art keywords
- region
- interest
- matrix
- tag image
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Abstract
The present invention provides a method for recognizing multi-label images by recurrently discovering regions of interest. The proposed multi-label image recognition framework does not depend on candidate regions; it automatically finds semantically relevant regions of different scales in the image and, at the same time, captures the contextual dependencies between these regions. Three constraints are also proposed for the spatial transformer network. They help to locate regions that carry more semantic information and further improve the accuracy of multi-label image recognition. The invention effectively improves the recognition accuracy for multi-label images and greatly improves the efficiency of recognition.
Description
Technical field
The present invention relates to the fields of computer vision and pattern recognition, and more particularly to a method for recognizing multi-label images by recurrently discovering regions of interest.
Background
Recognizing multi-label images is a common and practical task in computer vision, because real-world images usually contain rich and diverse semantics. The main difficulty of the task is how to associate semantic labels with image content (such as regions or sub-regions) effectively, particularly in complex scenes, for example when foreground objects are scattered and vary in size.
Current methods for multi-label image classification generally rely on single-label classification and object localization techniques. In recent years it has been shown that, for this approach, considering both the spatial information of the different objects and their global information in the image brings large performance gains. The typical pipeline of existing methods has two steps: 1) extract a large number of candidate regions, assuming that they cover all foreground objects; 2) predict the labels of these candidate regions and aggregate them into the multiple labels of the image. However, the dependence of these methods on candidate-region generation usually leads to redundant computation, and it can ignore or oversimplify the contextual relations between foreground objects. In addition, the training of such two-step methods is imperfect: end-to-end joint optimization is hard to achieve in both the training stage and the test stage.
In short, current research on multi-label image recognition has the following main problems:
1) Most current work depends on generating candidate regions, and most candidate-region generation algorithms, especially bottom-up ones, are extremely time-consuming. Using candidate regions can also ignore or oversimplify the contextual relations between foreground objects;
2) Current work has difficulty achieving end-to-end joint optimization.
Summary of the invention
The present invention provides a method for recognizing multi-label images by recurrently discovering regions of interest. The method effectively improves the accuracy of multi-label image recognition and greatly reduces the time cost.
To achieve the above technical effect, the technical solution of the present invention is as follows:
A method for recognizing multi-label images by recurrently discovering regions of interest, comprising the following steps:
S1: Extract the feature representation of the sample with a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the region of interest from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: Predict the classification score vector of the region of interest from the hidden state of the current time step, and predict the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result for the image.
Further, the transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation, which is constantly zero, and $(t_x, t_y)$ denotes the translation transformation, whose value range is [-1, 1]. According to the scaling and translation parameters in the transformation matrix, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
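Further, the spatial transformer network is implemented as follows:
S21: For a known target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, find the corresponding coordinate $(x_s, y_s)$ in the source matrix, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$. With $M$ the transformation matrix of step S2, the formula is $\begin{pmatrix} x_s \\ y_s \end{pmatrix} = M \begin{pmatrix} x_t \\ y_t \\ 1 \end{pmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in [-1, 1], back to the coordinate $(X_s, Y_s)$ of the original matrix; the formula is [formula not reproduced in the source text]. $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.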
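Steps S21-S23 can be illustrated with a small single-channel sketch in plain Python. It is only an illustration under stated assumptions, not the patented implementation: the function names `affine_crop` and `bilinear` are hypothetical, the mapping from [-1, 1] to pixel indices uses a `(size - 1)` scaling because the S22 formula is not preserved in this text, and samples falling outside the source are clamped to the border.

```python
import math

def bilinear(src, X, Y):
    """Bilinearly interpolate the 2-D list `src` at real-valued pixel position (X, Y)."""
    h, w = len(src), len(src[0])
    # Assumption: positions outside the source are clamped to the border.
    X = min(max(X, 0.0), w - 1.0)
    Y = min(max(Y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(X)), int(math.floor(Y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = X - x0, Y - y0
    top = (1 - ax) * src[y0][x0] + ax * src[y0][x1]
    bottom = (1 - ax) * src[y1][x0] + ax * src[y1][x1]
    return (1 - ay) * top + ay * bottom

def affine_crop(src, matrix, out_h, out_w):
    """Crop an out_h x out_w patch from `src` using a 2x3 matrix ((sx, rx, tx), (ry, sy, ty))."""
    src_h, src_w = len(src), len(src[0])
    (sx, rx, tx), (ry, sy, ty) = matrix
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # S21: normalised target coordinate -> normalised source coordinate.
            xt = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            yt = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
            xs = sx * xt + rx * yt + tx
            ys = ry * xt + sy * yt + ty
            # S22 (assumed form): map [-1, 1] to pixel indices [0, size - 1].
            X = (xs + 1.0) * 0.5 * (src_w - 1)
            Y = (ys + 1.0) * 0.5 * (src_h - 1)
            # S23: interpolate the source value for this target cell.
            out[i][j] = bilinear(src, X, Y)
    return out
```

With the identity matrix `((1, 0, 0), (0, 1, 0))` and an output of the same size as the input, the sketch returns the input unchanged; a scale of 0.5 with zero translation crops the central part of the source.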
The convolutional neural network used to extract features is taken from VGG (Simonyan, ICLR 2015), with pool5 and all layers after it removed. Its parameters are initialized from a model trained for single-label classification on the large-scale, 1000-class ImageNet dataset.
In step S1 the original image is resized to N*N (this method uses two scales, N = 512 and N = 640), and a patch of size (N-64)*(N-64) is cropped from it as the input. In addition, the cropped sample is flipped with probability 0.5. During training the crop and the flip are random; during testing, fixed patches of size (N-64)*(N-64) are cropped at the four corners and the center of the N*N image, and each patch is also flipped.
Further, the classification network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length C, where C is the number of classes in the dataset; the vector gives the score of the current region of interest for each class.
Further, the localization network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length 4 representing $s_x$, $t_x$, $s_y$, $t_y$ respectively, that is, only the scale and translation transformations; it is the prediction of the transformation matrix for the next region of interest.
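As a small illustration of how the length-4 output of the localization network relates to the 2x3 transformation matrix, the following sketch places the four values in a matrix with zero rotation terms and derives the window it covers in normalized coordinates. The function names are hypothetical; the window formula follows from applying a scale-and-translation matrix to target coordinates in [-1, 1].

```python
def build_transform(loc_output):
    """Arrange the localization output (sx, tx, sy, ty) as a 2x3 matrix with zero rotation."""
    sx, tx, sy, ty = loc_output
    return ((sx, 0.0, tx), (0.0, sy, ty))

def region_bounds(matrix):
    """Return (x_min, y_min, x_max, y_max) of the cropped window in normalised coordinates.

    For target coordinates in [-1, 1], x_s = sx * x_t + tx spans [tx - |sx|, tx + |sx|].
    """
    (sx, _, tx), (_, sy, ty) = matrix
    return (tx - abs(sx), ty - abs(sy), tx + abs(sx), ty + abs(sy))
```

For example, `sx = 0.5, tx = 0.25` selects the horizontal span from -0.25 to 0.75, which is half the width of the feature map shifted to the right.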
Further, the fusion operation in step S5 works per class: the final score of each class is taken from the region of interest that scores highest for that class among all regions of interest. Specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
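The class-wise maximum described above can be written in a few lines. This is a minimal sketch with a hypothetical function name; `step_scores` stands for the list of score vectors predicted at time steps 2 to K.

```python
def fuse_scores(step_scores):
    """Class-wise max over the per-step score vectors (one list of C scores per time step)."""
    num_classes = len(step_scores[0])
    return [max(scores[c] for scores in step_scores) for c in range(num_classes)]
```

Taking the maximum rather than the mean lets a class be recognized as soon as one region of interest responds strongly to it, even if every other region shows a different object.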
Compared with the prior art, the technical solution of the present invention has the following beneficial effects:
The multi-label image recognition framework proposed by the present invention does not depend on candidate regions; it automatically finds semantically relevant regions of different scales in the image and, at the same time, captures the contextual dependencies between these regions. Three constraints are also proposed for the spatial transformer network. They help to locate regions that carry more semantic information and further improve the accuracy of multi-label image recognition. The invention effectively improves the recognition accuracy for multi-label images and greatly improves the efficiency of recognition.
Brief description of the drawings
Fig. 1 is a diagram of the basic training and testing framework of the model of the present invention.
Fig. 2 is a schematic diagram of constraint 1 (the anchor constraint) on the spatial transformer network of the present invention.
Embodiment
The drawings are for illustration only and shall not be construed as limiting this patent;
To better illustrate the embodiment, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the drawings and the embodiment.
Embodiment 1
As shown in Fig. 1, a method for recognizing multi-label images by recurrently discovering regions of interest comprises the following steps:
S1: Extract the feature representation of the sample with a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the region of interest from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: Predict the classification score vector of the region of interest from the hidden state of the current time step, and predict the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result for the image.
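The control flow of steps S2-S5 can be sketched as a loop over hypothetical callables. Everything passed in (`crop_region`, `lstm_step`, `classify`, `localize`) is a placeholder for the corresponding network, and the initial transformation matrix covering the whole feature map is an assumption, since the text does not state how the first region is chosen.

```python
def recognize(feature_map, num_steps, crop_region, lstm_step, classify, localize, init_state):
    """Run the recurrent attention loop and return the fused class scores.

    crop_region(feature_map, matrix) -> region           (step S2)
    lstm_step(region, hidden, memory) -> (hidden, memory) (step S3)
    classify(hidden) -> list of C scores                  (step S4)
    localize(hidden) -> next 2x3 transformation matrix    (step S4)
    """
    if num_steps < 2:
        raise ValueError("need at least two time steps, since scores are fused from step 2 on")
    # Assumption: the first crop uses the identity transform (the whole feature map).
    matrix = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    hidden, memory = init_state
    step_scores = []
    for t in range(1, num_steps + 1):
        region = crop_region(feature_map, matrix)
        hidden, memory = lstm_step(region, hidden, memory)
        if t >= 2:  # only the scores of time steps 2..K are fused (step S5)
            step_scores.append(classify(hidden))
        matrix = localize(hidden)
    num_classes = len(step_scores[0])
    return [max(s[c] for s in step_scores) for c in range(num_classes)]
```

Passing the networks in as callables keeps the sketch independent of any deep learning library; in a real model they would share parameters and be trained jointly.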
The transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation, which is constantly zero, and $(t_x, t_y)$ denotes the translation transformation, whose value range is [-1, 1]. According to the scaling and translation parameters in the transformation matrix, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
The spatial transformer network is implemented as follows:
S21: For a known target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, find the corresponding coordinate $(x_s, y_s)$ in the source matrix, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$. With $M$ the transformation matrix of step S2, the formula is $\begin{pmatrix} x_s \\ y_s \end{pmatrix} = M \begin{pmatrix} x_t \\ y_t \\ 1 \end{pmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in [-1, 1], back to the coordinate $(X_s, Y_s)$ of the original matrix; the formula is [formula not reproduced in the source text]. $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.
The convolutional neural network used to extract features is taken from VGG (Simonyan, ICLR 2015), with pool5 and all layers after it removed. Its parameters are initialized from a model trained for single-label classification on the large-scale, 1000-class ImageNet dataset.
In step S1 the original image is resized to N*N (this method uses two scales, N = 512 and N = 640), and a patch of size (N-64)*(N-64) is cropped from it as the input. In addition, the cropped sample is flipped with probability 0.5. During training the crop and the flip are random; during testing, fixed patches of size (N-64)*(N-64) are cropped at the four corners and the center of the N*N image, and each patch is also flipped.
The classification network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length C, where C is the number of classes in the dataset; the vector gives the score of the current region of interest for each class.
The localization network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length 4 representing $s_x$, $t_x$, $s_y$, $t_y$ respectively, that is, only the scale and translation transformations; it is the prediction of the transformation matrix for the next region of interest.
The fusion operation in step S5 works per class: the final score of each class is taken from the region of interest that scores highest for that class among all regions of interest. Specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
The technical solution is further elaborated below with reference to a specific implementation.
1. Data processing:
A) Training stage: all training samples are resized to N × N (N takes the values 512 and 640 in the invention), a block of size (N-64) × (N-64) is cropped at random, and the block is flipped horizontally with probability 0.5 to form the final sample input.
B) Test stage: all test samples are resized to N × N (N takes the values 512 and 640 in the invention). Blocks of size (N-64) × (N-64) are cropped at the four corners and at the center, and these blocks together with their horizontal flips are used as sample inputs. Each test sample therefore yields 10 cropped blocks and thus 10 classification results. The 10 results are averaged to give the final result for the test sample.
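The test-time cropping and averaging can be sketched as follows. The function names are hypothetical, boxes are returned as `(top, left, size, flipped)` tuples rather than image data, and the center crop is assumed to start at an offset of half the margin.

```python
def ten_crop_boxes(n, margin=64):
    """Return 10 (top, left, size, flipped) crops: four corners and center, each with its flip."""
    size = n - margin
    offsets = [
        (0, 0),                      # top-left corner
        (0, margin),                 # top-right corner
        (margin, 0),                 # bottom-left corner
        (margin, margin),            # bottom-right corner
        (margin // 2, margin // 2),  # center (assumed offset)
    ]
    return [(top, left, size, flipped) for top, left in offsets for flipped in (False, True)]

def average_scores(score_vectors):
    """Average a list of equally long score vectors element by element."""
    count = len(score_vectors)
    return [sum(column) / count for column in zip(*score_vectors)]
```

For N = 512 every crop is 448 × 448, and the ten score vectors produced for the crops would be passed to `average_scores` to obtain the result for the test sample.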
2. Convolutional neural network: used to extract the feature representation of the sample. It consists of 13 convolutional layers, interleaved with max-pooling layers and rectified linear unit (ReLU) nonlinearity layers. Its parameters are initialized from a model trained on the large-scale ImageNet dataset.
3. Spatial transformer network: its algorithm is described in detail in steps S21-S23. Note that in the embodiment of the invention the output size of the spatial transformer network is 7x7.
4. Long short-term memory network: the network structure includes an input gate $i_t$, an output gate $o_t$ and a forget gate $f_t$. The algorithm is as follows:
$i_t = \sigma(w_{ix} x_t + w_{im} m_{t-1} + b_i)$
$f_t = \sigma(w_{fx} x_t + w_{fm} m_{t-1} + b_f)$
$c_t = f_t \odot c_{t-1} + i_t \odot g(w_{cx} x_t + w_{cm} m_{t-1} + b_c)$
$o_t = \sigma(w_{ox} x_t + w_{om} m_{t-1} + b_o)$
$m_t = o_t \odot h(c_t)$
where the $w$ and $b$ terms denote weights and biases respectively, and $c_t$ and $m_t$ denote the memory state and the hidden state of the current time step. $\sigma$ denotes the sigmoid function, and $g$ and $h$ are usually the tanh function. $\odot$ denotes the element-wise product of two vectors.
In addition, the hidden state and the memory state are vectors of size 2048.
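The five LSTM equations can be traced with a short pure-Python sketch. It is illustrative only: the parameter dictionary and its key names (`w_ix`, `b_i` and so on) are hypothetical, weights are nested lists, and both g and h are taken to be tanh.

```python
import math

def _sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def _pre_activation(w_x, x, w_m, m, b):
    """Compute w_x @ x + w_m @ m + b for list-based matrices and vectors."""
    return [
        sum(wx * xv for wx, xv in zip(w_x[r], x))
        + sum(wm * mv for wm, mv in zip(w_m[r], m))
        + b[r]
        for r in range(len(b))
    ]

def lstm_step(x, m_prev, c_prev, p):
    """One LSTM update; returns the new hidden state m_t and memory state c_t."""
    i = [_sigmoid(v) for v in _pre_activation(p["w_ix"], x, p["w_im"], m_prev, p["b_i"])]
    f = [_sigmoid(v) for v in _pre_activation(p["w_fx"], x, p["w_fm"], m_prev, p["b_f"])]
    g = [math.tanh(v) for v in _pre_activation(p["w_cx"], x, p["w_cm"], m_prev, p["b_c"])]
    o = [_sigmoid(v) for v in _pre_activation(p["w_ox"], x, p["w_om"], m_prev, p["b_o"])]
    # Forget part of the old memory, then add the gated candidate values.
    c = [f[k] * c_prev[k] + i[k] * g[k] for k in range(len(c_prev))]
    # The hidden state is the output-gated, squashed memory.
    m = [o[k] * math.tanh(c[k]) for k in range(len(c))]
    return m, c
```

With all weights and biases set to zero every gate equals 0.5 and the candidate is 0, so the memory state is simply halved at each step, which makes the gating behavior easy to check by hand.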
5. Classification network: consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of size C, where C is the number of classes in the dataset. Its parameters are initialized with Gaussian random numbers.
6. Localization network: consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of size 4 containing $s_x$, $t_x$, $s_y$, $t_y$, that is, the scale and translation transformations of the transformation matrix. Its parameters are initialized with Gaussian random numbers.
7. Fusion: performed as described for step S5.
8. Classification loss function: the invention uses the Euclidean distance. Suppose there are N training samples and each training sample $x_i$ has the label vector $y_i = \{y_i^1, y_i^2, \dots, y_i^C\}$, where C is the number of classes in the dataset. If the sample is annotated with class c, then $y_i^c = 1$; otherwise $y_i^c = 0$. The label probability vector is expressed as [formula not reproduced in the source text]. Given the predicted probability vector $p_i$, then [formula not reproduced in the source text].
The final classification loss function $L_{cls}$ can then be expressed as: [formula not reproduced in the source text]
9. Three constraints on the spatial transformer network:
A) Anchor constraint: so that the model locates objects spread over different positions of the image as far as possible and reduces redundancy, the invention proposes the anchor constraint, shown in Fig. 2. Suppose the model has K+1 time steps in total; it can then locate K regions of interest. The first region of interest (blue point) is left unconstrained, while for learning to locate the following K-1 regions of interest the invention provides the anchors shown in the figure (red points), which give the localization learning a degree of guidance. The formula is: [formula not reproduced in the source text]
The formula is defined in terms of the translation parameters predicted in the transformation matrix at time step k and their corresponding anchor.
B) Scale constraint: when the scale transformation in the transformation matrix is predicted, the predicted scale should not become too large (an excessive scale means the region of interest tends toward the whole image). The invention therefore proposes the scale constraint: if the predicted scale parameters $s_x$ and $s_y$ exceed the parameter α (α = 0.5 in this embodiment), a penalty is applied. The formula is: [formula not reproduced in the source text]
C) Positive constraint: if a scale parameter in the transformation matrix is negative, the cropped feature map is mirrored left-right or flipped upside down. To avoid the drop in recognition performance that this causes, the invention proposes the positive constraint: if the predicted scale parameters $s_x$ and $s_y$ are smaller than the parameter β (β = 0.1 in this embodiment), a penalty is applied. The formula is:
$l_P = \max(0, \beta - s_x) + \max(0, \beta - s_y)$
In summary, the localization loss function can be expressed as:
$L_{loc} = l_S + \lambda_1 l_A + \lambda_2 l_P$
where $\lambda_1$ and $\lambda_2$ are hyperparameters ($\lambda_1 = 0.01$, $\lambda_2 = 0.1$ in this embodiment).
10. Total loss function of the model of the invention:
$L = L_{cls} + \gamma L_{loc}$
where γ is a hyperparameter (γ = 0.1 in this embodiment).
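The loss terms whose formulas survive in this text can be combined as in the following sketch. The function names are hypothetical; the scale term `l_s`, the anchor term `l_a` and the classification loss `l_cls` are taken as already computed numbers, because their formulas are not reproduced here. The default hyperparameters are the values stated for the embodiment.

```python
def positive_constraint(sx, sy, beta=0.1):
    """l_P = max(0, beta - sx) + max(0, beta - sy): penalise scales below beta."""
    return max(0.0, beta - sx) + max(0.0, beta - sy)

def localization_loss(l_s, l_a, l_p, lambda1=0.01, lambda2=0.1):
    """L_loc = l_S + lambda1 * l_A + lambda2 * l_P."""
    return l_s + lambda1 * l_a + lambda2 * l_p

def total_loss(l_cls, l_loc, gamma=0.1):
    """L = L_cls + gamma * L_loc."""
    return l_cls + gamma * l_loc
```

A scale pair such as (0.5, 0.5) incurs no positive-constraint penalty, whereas a negative scale, which would mirror the cropped region, is penalized in proportion to its distance below β.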
The same or similar reference numerals correspond to the same or similar parts;
The positional relationships described in the drawings are for illustration only and shall not be construed as limiting this patent;
Obviously, the above embodiment of the present invention is merely an example given to illustrate the invention clearly and does not limit the embodiments of the invention. Those of ordinary skill in the art can make other changes in different forms on the basis of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent substitution or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the claims of the present invention.
Claims (6)
1. A method for recognizing multi-label images by recurrently discovering regions of interest, characterized in that it comprises the following steps:
S1: Extract the feature representation of the sample with a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the region of interest from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: Predict the classification score vector of the region of interest from the hidden state of the current time step, and predict the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result for the image.
2. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 1, characterized in that the transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation, which is constantly zero, and $(t_x, t_y)$ denotes the translation transformation, whose value range is [-1, 1]; according to the scaling and translation parameters in the transformation matrix, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
3. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 2, characterized in that the spatial transformer network is implemented as follows:
S21: For a known target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, find the corresponding coordinate $(x_s, y_s)$ in the source matrix, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$; with $M$ the transformation matrix of step S2, the formula is $\begin{pmatrix} x_s \\ y_s \end{pmatrix} = M \begin{pmatrix} x_t \\ y_t \\ 1 \end{pmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in [-1, 1], back to the coordinate $(X_s, Y_s)$ of the original matrix; the formula is [formula not reproduced in the source text]; $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.
4. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 3, characterized in that the classification network of step S4 consists of one fully connected layer, its input is the hidden state of the current time step and its output is a vector of length C, where C is the number of classes in the dataset, the vector giving the score of the current region of interest for each class.
5. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 4, characterized in that the localization network of step S4 consists of one fully connected layer, its input is the hidden state of the current time step and its output is a vector of length 4 representing $s_x$, $t_x$, $s_y$, $t_y$ respectively, that is, only the scale and translation transformations, and it is the prediction of the transformation matrix for the next region of interest.
6. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 5, characterized in that the fusion operation in step S5 works per class, the final score of each class being taken from the region of interest that scores highest for that class among all regions of interest; specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710562354.9A CN107577983A (en) | 2017-07-11 | 2017-07-11 | Method for recognizing multi-label images by recurrently discovering regions of interest |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710562354.9A CN107577983A (en) | 2017-07-11 | 2017-07-11 | Method for recognizing multi-label images by recurrently discovering regions of interest |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107577983A true CN107577983A (en) | 2018-01-12 |
Family
ID=61049712
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710562354.9A Withdrawn CN107577983A (en) | 2017-07-11 | 2017-07-11 | Method for recognizing multi-label images by recurrently discovering regions of interest |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107577983A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108415906A (en) * | 2018-03-28 | 2018-08-17 | 中译语通科技股份有限公司 | Based on field automatic identification chapter machine translation method, machine translation system |
CN108596206A (en) * | 2018-03-21 | 2018-09-28 | 杭州电子科技大学 | Texture image classification method based on multiple dimensioned multi-direction spatial coherence modeling |
CN109086742A (en) * | 2018-08-27 | 2018-12-25 | Oppo广东移动通信有限公司 | scene recognition method, scene recognition device and mobile terminal |
CN110084356A (en) * | 2018-01-26 | 2019-08-02 | 北京深鉴智能科技有限公司 | A kind of deep neural network data processing method and device |
CN110210572A (en) * | 2019-06-10 | 2019-09-06 | 腾讯科技(深圳)有限公司 | Image classification method, device, storage medium and equipment |
CN112308115A (en) * | 2020-09-25 | 2021-02-02 | 安徽工业大学 | Multi-label image deep learning classification method and equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105740402A (en) * | 2016-01-28 | 2016-07-06 | 百度在线网络技术(北京)有限公司 | Method and device for acquiring semantic labels of digital images |
Non-Patent Citations (2)
Title |
---|
ARTSIOM ABLAVATSKI, SHIJIAN LU, JIANFEI CAI: "Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition", 《IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION》 * |
MAX JADERBERG,KAREN SIMONYAN,ANDREW ZISSERMAN, KORAY KAVUKCUOGLU: "Spatial Transformer Networks", 《IN CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110084356A (en) * | 2018-01-26 | 2019-08-02 | 北京深鉴智能科技有限公司 | A kind of deep neural network data processing method and device |
CN110084356B (en) * | 2018-01-26 | 2021-02-02 | 赛灵思电子科技(北京)有限公司 | Deep neural network data processing method and device |
CN108596206A (en) * | 2018-03-21 | 2018-09-28 | 杭州电子科技大学 | Texture image classification method based on multiple dimensioned multi-direction spatial coherence modeling |
CN108415906A (en) * | 2018-03-28 | 2018-08-17 | 中译语通科技股份有限公司 | Based on field automatic identification chapter machine translation method, machine translation system |
CN108415906B (en) * | 2018-03-28 | 2021-08-17 | 中译语通科技股份有限公司 | Automatic identification discourse machine translation method and machine translation system based on field |
CN109086742A (en) * | 2018-08-27 | 2018-12-25 | Oppo广东移动通信有限公司 | scene recognition method, scene recognition device and mobile terminal |
CN110210572A (en) * | 2019-06-10 | 2019-09-06 | 腾讯科技(深圳)有限公司 | Image classification method, device, storage medium and equipment |
CN110210572B (en) * | 2019-06-10 | 2023-02-07 | 腾讯科技(深圳)有限公司 | Image classification method, device, storage medium and equipment |
CN112308115A (en) * | 2020-09-25 | 2021-02-02 | 安徽工业大学 | Multi-label image deep learning classification method and equipment |
CN112308115B (en) * | 2020-09-25 | 2023-05-26 | 安徽工业大学 | Multi-label image deep learning classification method and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | |
Application publication date: 20180112 |