CN107577983A - Method for recognizing multi-label images by recurrently discovering regions of interest - Google Patents


Info

Publication number
CN107577983A
Authority
CN
China
Prior art keywords
region
interest
matrix
tag image
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201710562354.9A
Other languages
Chinese (zh)
Inventor
林倞
王州霞
李冠彬
陈添水
成慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The present invention provides a method for recognizing multi-label images by recurrently discovering regions of interest. The proposed multi-label image recognition framework does not depend on candidate regions; it automatically finds semantically relevant regions of different scales in the image and, at the same time, captures the contextual dependencies among these regions. Three constraints are also proposed for the spatial transformer network. They help locate regions that carry more semantic information and further improve the accuracy of multi-label image recognition. The invention thus improves the recognition accuracy for multi-label images and, to a large extent, the recognition efficiency.

Description

Method for recognizing multi-label images by recurrently discovering regions of interest
Technical field
The present invention relates to the fields of computer vision and pattern recognition, and more particularly to a method for recognizing multi-label images by recurrently discovering regions of interest.
Background art
Recognizing multi-label images is a common and practical task in computer vision, because images in the real world usually contain rich and diverse semantics. The main difficulty of the task is how to associate semantic labels effectively with image content (such as regions or sub-regions), particularly in complex scenes, for example when foreground objects are scattered and vary in size.
Current methods for multi-label image classification generally rely on single-label classification and object localization techniques. In recent years it has been shown that, for this kind of approach, jointly considering the spatial information of the different objects in an image and their global information brings large performance gains. The typical pipeline of existing methods has two steps: 1) extract a large number of candidate regions, assuming that they cover all foreground objects; 2) predict the labels of these candidate regions and aggregate them into the multiple labels of the image. However, the dependence of these methods on candidate-region generation usually leads to redundant computation, and it ignores or oversimplifies the contextual relationships among foreground objects. In addition, the two-step design makes end-to-end joint optimization difficult to achieve in both the training phase and the test phase.
In summary, current research on multi-label image recognition has the following main problems:
1) Most current work depends on generating candidate regions, and most candidate-region generation algorithms, especially bottom-up ones, are extremely time-consuming. In addition, using candidate regions ignores or oversimplifies the contextual relationships among foreground objects;
2) Current work has difficulty achieving end-to-end joint optimization.
Summary of the invention
The present invention provides a method for recognizing multi-label images by recurrently discovering regions of interest. The method effectively improves the accuracy of multi-label image recognition and greatly reduces the time cost.
To achieve the above technical effect, the technical solution of the present invention is as follows:
A method for recognizing multi-label images by recurrently discovering regions of interest, comprising the following steps:
S1: Extract the feature representation of a sample using a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the attended region from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: From the hidden state of the current time step, predict the classification score vector of the region of interest and the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result of the image.
Further, the transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation and is fixed at zero, and $(t_x, t_y)$ denotes the translation transformation, with values in the range $[-1, 1]$. Because the parameters in the transformation matrix describe only scaling and translation, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
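To make the effect of the scale and translation parameters concrete, the following minimal sketch computes which part of the feature map a given parameter set selects. It assumes the transformation has the scale-and-translate form with zero rotation terms, so that a target coordinate in [-1, 1] maps to sx * x + tx (and likewise for y); the function name `attended_box` is hypothetical and not part of the patent.

```python
def attended_box(sx, sy, tx, ty):
    """Return (x_min, x_max, y_min, y_max) of the selected source region
    in normalized [-1, 1] coordinates (rotation terms assumed to be zero)."""
    # The target grid spans [-1, 1]; its two extremes map to s * (-1) + t and s * 1 + t.
    xs = sorted((sx * -1.0 + tx, sx * 1.0 + tx))
    ys = sorted((sy * -1.0 + ty, sy * 1.0 + ty))
    return xs[0], xs[1], ys[0], ys[1]


# A half-size window shifted right and up in normalized coordinates.
print(attended_box(0.5, 0.5, 0.25, -0.25))  # (-0.25, 0.75, -0.75, 0.25)
```

A scale of 1 with zero translation selects the whole map, which is why large scales make the region of interest approach the full image.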
Further, the spatial transformer network is implemented as follows:
S21: Given a target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, compute the corresponding source-matrix coordinate $(x_s, y_s)$, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$, by $\begin{bmatrix} x_s \\ y_s \end{bmatrix} = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix} \begin{bmatrix} x_t \\ y_t \\ 1 \end{bmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in $[-1, 1]$, back to the coordinate $(X_s, Y_s)$ of the source matrix [formula not reproduced]; $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.
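Steps S21-S23 can be sketched for a single feature-map channel as follows. This is an illustrative sketch, not the patent's implementation: the mapping from [-1, 1] to pixel indices in S22 is an assumption (the endpoints -1 and 1 are taken to be the first and last pixel), sample points outside the map are assumed to contribute zero, and the name `sample_region` is hypothetical.

```python
import math


def sample_region(src, M, out_h, out_w):
    """Crop and resize a 2-D map `src` (list of rows) with a 2x3 matrix
    M = [[sx, rx, tx], [ry, sy, ty]] using bilinear interpolation."""
    Hs, Ws = len(src), len(src[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # S21: normalized target coordinate -> normalized source coordinate.
            xt = (-1.0 + 2.0 * j / (out_w - 1)) if out_w > 1 else 0.0
            yt = (-1.0 + 2.0 * i / (out_h - 1)) if out_h > 1 else 0.0
            xs = M[0][0] * xt + M[0][1] * yt + M[0][2]
            ys = M[1][0] * xt + M[1][1] * yt + M[1][2]
            # S22 (assumed mapping): [-1, 1] -> [0, size - 1] pixel coordinates.
            X = (xs + 1.0) * (Ws - 1) / 2.0
            Y = (ys + 1.0) * (Hs - 1) / 2.0
            # S23: bilinear interpolation over the four neighbouring pixels.
            x0, y0 = math.floor(X), math.floor(Y)
            val = 0.0
            for yy, wy in ((y0, 1.0 - (Y - y0)), (y0 + 1, Y - y0)):
                for xx, wx in ((x0, 1.0 - (X - x0)), (x0 + 1, X - x0)):
                    if 0 <= yy < Hs and 0 <= xx < Ws:
                        val += wy * wx * src[yy][xx]
            out[i][j] = val
    return out


# Usage: a half-scale transform picks the central 3x3 window of a 5x5 map.
grid = [[5 * r + c for c in range(5)] for r in range(5)]
center = sample_region(grid, [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]], 3, 3)
```

In a full model the same sampling would be applied to every channel of the feature map, and a framework-provided differentiable sampler would normally replace these explicit loops.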
The convolutional neural network used for feature extraction is taken from VGG (Simonyan, ICLR 2015) with pool5 and all subsequent layers removed; its parameters are initialized from a model trained for single-label classification on the large-scale 1000-class ImageNet dataset.
In step S1, the original image is resized to N*N (the method uses two scales, N = 512 and N = 640), and a patch of size (N-64)*(N-64) is cropped from it as input. In addition, the cropped sample is flipped with probability 0.5 (during training, the crop position and the flip are random; during testing, the four corners and the central region, each of size (N-64)*(N-64), are cropped from the N*N image and used together with their flipped versions).
Further, the classification network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length C, where C is the number of classes in the dataset; the vector scores how strongly the current region of interest belongs to each class.
Further, the localization network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length 4 representing $s_x$, $t_x$, $s_y$ and $t_y$, that is, only the scale and translation transformations; it is the prediction of the transformation matrix for the next region of interest.
Further, the fusion operation in step S5 works per class: the final score of each class is taken from the region of interest that scores highest for that class among all regions of interest. Specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
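The fusion described above amounts to a per-class maximum over the per-step score vectors. A minimal sketch, in which the function name `fuse_scores` and the example numbers are illustrative only:

```python
def fuse_scores(score_vectors):
    """Per-class maximum over the score vectors of time steps 2..K."""
    # zip(*...) groups the scores of one class across all time steps.
    return [max(per_class) for per_class in zip(*score_vectors)]


# Three time steps, three classes: each class keeps its best-scoring region.
steps = [[0.1, 0.9, 0.2], [0.7, 0.3, 0.1], [0.2, 0.4, 0.8]]
print(fuse_scores(steps))  # [0.7, 0.9, 0.8]
```

Taking the maximum rather than the mean lets a class that is visible in only one attended region still receive a high final score.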
Compared with the prior art, the technical solution of the present invention has the following beneficial effects:
The multi-label image recognition framework proposed by the present invention does not depend on candidate regions; it automatically finds semantically relevant regions of different scales in the image and, at the same time, captures the contextual dependencies among these regions. Three constraints are also proposed for the spatial transformer network. They help locate regions that carry more semantic information and further improve the accuracy of multi-label image recognition. The invention thus improves the recognition accuracy for multi-label images and, to a large extent, the recognition efficiency.
Brief description of the drawings
Fig. 1 is a diagram of the basic training and testing framework of the model of the present invention.
Fig. 2 is a schematic diagram of constraint 1 on the spatial transformer network of the present invention.
Detailed description of the embodiments
The drawings are for illustration only and shall not be construed as limiting this patent;
To better illustrate the embodiment, some parts in the drawings may be omitted, enlarged or reduced, and do not represent the dimensions of an actual product;
Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings.
The technical solution of the present invention is further described below with reference to the drawings and embodiments.
Embodiment 1
As shown in Fig. 1, a method for recognizing multi-label images by recurrently discovering regions of interest comprises the following steps:
S1: Extract the feature representation of a sample using a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the attended region from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: From the hidden state of the current time step, predict the classification score vector of the region of interest and the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result of the image.
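The control flow of steps S1-S5 can be sketched as a plain loop. This is a skeleton under stated assumptions, not the patented implementation: every callable passed in (`extract_features`, `crop_region`, `lstm_step`, `classify`, `localize`) is a hypothetical placeholder for the corresponding network, and starting from the identity transform (the whole feature map) at the first time step is an assumption.

```python
def recognize(image, extract_features, crop_region, lstm_step,
              classify, localize, init_state, K):
    """Run the S1-S5 loop and return the fused per-class scores."""
    features = extract_features(image)                      # S1
    M = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]                  # assumed start: whole map
    hidden, memory = init_state
    step_scores = []
    for k in range(1, K + 1):
        region = crop_region(features, M)                   # S2
        hidden, memory = lstm_step(region, hidden, memory)  # S3
        scores = classify(hidden)                           # S4: class scores
        M = localize(hidden)                                # S4: next transform
        if k >= 2:                                          # S5 fuses steps 2..K
            step_scores.append(scores)
    # S5: per-class maximum over the collected score vectors.
    return [max(per_class) for per_class in zip(*step_scores)]
```

Passing the networks in as callables keeps the sketch independent of any particular deep-learning framework; K must be at least 2 so that there is something to fuse.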
The transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation and is fixed at zero, and $(t_x, t_y)$ denotes the translation transformation, with values in the range $[-1, 1]$. Because the parameters in the transformation matrix describe only scaling and translation, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
The spatial transformer network is implemented as follows:
S21: Given a target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, compute the corresponding source-matrix coordinate $(x_s, y_s)$, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$, by $\begin{bmatrix} x_s \\ y_s \end{bmatrix} = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix} \begin{bmatrix} x_t \\ y_t \\ 1 \end{bmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in $[-1, 1]$, back to the coordinate $(X_s, Y_s)$ of the source matrix [formula not reproduced]; $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.
The convolutional neural network used for feature extraction is taken from VGG (Simonyan, ICLR 2015) with pool5 and all subsequent layers removed; its parameters are initialized from a model trained for single-label classification on the large-scale 1000-class ImageNet dataset.
In step S1, the original image is resized to N*N (the method uses two scales, N = 512 and N = 640), and a patch of size (N-64)*(N-64) is cropped from it as input. In addition, the cropped sample is flipped with probability 0.5 (during training, the crop position and the flip are random; during testing, the four corners and the central region, each of size (N-64)*(N-64), are cropped from the N*N image and used together with their flipped versions).
The classification network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length C, where C is the number of classes in the dataset; the vector scores how strongly the current region of interest belongs to each class.
The localization network of step S4 consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of length 4 representing $s_x$, $t_x$, $s_y$ and $t_y$, that is, only the scale and translation transformations; it is the prediction of the transformation matrix for the next region of interest.
The fusion operation in step S5 works per class: the final score of each class is taken from the region of interest that scores highest for that class among all regions of interest. Specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
The technical solution of the present invention is further elaborated below with reference to a specific implementation.
1. Data processing:
a) Training phase: all training samples are resized to N × N (N is 512 and 640 in this invention), a block of size (N-64) × (N-64) is cropped at a random position, and the block is flipped horizontally with probability 0.5 to form the final sample input.
b) Test phase: all test samples are resized to N × N (N is 512 and 640 in this invention). Blocks of size (N-64) × (N-64) are cropped at the four corners and at the center, and these blocks together with their horizontally flipped versions are used as sample inputs. Each test sample therefore yields 10 cropped blocks and hence 10 classification results, which are averaged to give the final result for that test sample.
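The test-time procedure (five crop positions, each with and without a horizontal flip, followed by averaging) can be sketched as below. The helper names are hypothetical, and placing the center crop at an offset of exactly half the margin is an assumption; the text only says the block is taken from the middle.

```python
def ten_crop_boxes(n, margin=64):
    """Return the crop size and ten (left, top, flipped) crop descriptions
    for an n x n image: four corners and the center, plain and flipped."""
    size = n - margin
    offsets = [(0, 0), (margin, 0), (0, margin), (margin, margin),
               (margin // 2, margin // 2)]  # center offset is an assumption
    boxes = [(left, top, flipped)
             for flipped in (False, True)
             for (left, top) in offsets]
    return size, boxes


def average_scores(score_vectors):
    """Element-wise mean of the per-crop classification results."""
    count = len(score_vectors)
    return [sum(per_class) / count for per_class in zip(*score_vectors)]


size, boxes = ten_crop_boxes(512)
print(size, len(boxes))  # 448 10
```

For N = 640 the same call gives crops of size 576; in both cases the ten score vectors produced by the model would be passed to `average_scores`.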
2. Convolutional neural network: used to extract the feature representation of a sample. It consists of 13 convolutional layers interleaved with max-pooling layers and rectified linear unit (ReLU) nonlinearity layers. Its parameters are initialized from a model trained on the large-scale ImageNet dataset.
3. Spatial transformer network: its algorithm is described in detail in steps S21-S23. Note that in this embodiment the output size of the spatial transformer network is 7x7.
4. Long short-term memory network: the network structure includes an input gate $i_t$, an output gate $o_t$ and a forget gate $f_t$. The algorithm is as follows:
$i_t = \sigma(w_{ix} x_t + w_{im} m_{t-1} + b_i)$
$f_t = \sigma(w_{fx} x_t + w_{fm} m_{t-1} + b_f)$
$c_t = f_t \odot c_{t-1} + i_t \odot g(w_{cx} x_t + w_{cm} m_{t-1} + b_c)$
$o_t = \sigma(w_{ox} x_t + w_{om} m_{t-1} + b_o)$
$m_t = o_t \odot h(c_t)$
Here the $w$ and $b$ terms denote weights and biases respectively, and $c_t$ and $m_t$ denote the memory state and the hidden state of the current time step. $\sigma$ denotes the sigmoid function, and $g$ and $h$ are usually tanh functions. $\odot$ denotes element-wise multiplication of two vectors.
In addition, the hidden state and the memory state are both vectors of size 2048.
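The gate equations above can be traced with a small pure-Python sketch that works on plain lists. It takes g and h to be tanh, as the text says is usual; the parameter layout (a dict mapping the gate names "i", "f", "c", "o" to a tuple of input weights, hidden weights and bias) is a hypothetical convention chosen for the example.

```python
import math


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def affine(wx, x, wm, m, b):
    """Compute wx @ x + wm @ m + b for list-of-rows matrices."""
    return [sum(p * q for p, q in zip(row_x, x))
            + sum(p * q for p, q in zip(row_m, m)) + bias
            for row_x, row_m, bias in zip(wx, wm, b)]


def lstm_step(x, m_prev, c_prev, params):
    """One LSTM step; returns the new hidden state m_t and memory state c_t."""
    pre = {name: affine(wx, x, wm, m_prev, b)
           for name, (wx, wm, b) in params.items()}
    i, f, o = sigmoid(pre["i"]), sigmoid(pre["f"]), sigmoid(pre["o"])
    g = [math.tanh(a) for a in pre["c"]]                       # candidate memory
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    m = [ot * math.tanh(ct) for ot, ct in zip(o, c)]           # h taken as tanh
    return m, c
```

With all weights and biases set to zero every gate equals 0.5, so a previous memory state of 2.0 is halved to 1.0; this makes a convenient sanity check before plugging in 2048-dimensional states.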
5. Classification network: consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of size C, where C is the number of classes in the dataset. Its parameters are initialized with Gaussian random numbers.
6. Localization network: consists of one fully connected layer. Its input is the hidden state of the current time step and its output is a vector of size 4 containing $s_x$, $t_x$, $s_y$ and $t_y$, that is, the scale and translation transformations in the transformation matrix. Its parameters are initialized with Gaussian random numbers.
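Both heads are single fully connected layers applied to the same hidden state, which the following sketch illustrates with plain lists. The standard deviation of 0.01, the zero biases, the fixed seed and all function names are assumptions made for the example; the text only states that the initial parameters are Gaussian random numbers.

```python
import random


def make_linear(in_dim, out_dim, std=0.01, seed=0):
    """Create (weights, biases) for a fully connected layer with
    Gaussian-initialized weights (std and zero biases are assumptions)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, std) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(layer, hidden):
    weights, biases = layer
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, biases)]


def to_matrix(params):
    """Arrange the localization output (sx, tx, sy, ty) as a 2x3 transform."""
    sx, tx, sy, ty = params
    return [[sx, 0.0, tx], [0.0, sy, ty]]


hidden = [0.5] * 2048                     # hidden state of the current time step
cls_head = make_linear(2048, 20)          # C = 20 is an illustrative class count
loc_head = make_linear(2048, 4, seed=1)
scores = linear(cls_head, hidden)         # length-C score vector
next_M = to_matrix(linear(loc_head, hidden))
```

Note the output order sx, tx, sy, ty: the scale and translation of each axis sit next to each other, so they have to be rearranged when building the matrix.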
7. Fusion: performed as described for step S5.
8. Classifier loss function: the invention uses the Euclidean distance. Suppose there are N training samples and each training sample $x_i$ has the label vector $y_i = \{y_i^1, y_i^2, \dots, y_i^C\}$, where C is the number of classes in the dataset. If the sample is annotated with class c, then $y_i^c = 1$; otherwise $y_i^c = 0$. The label probability vector is derived from $y_i$ [formula not reproduced], and the predicted probability vector is $p_i$ [formula not reproduced].
The final classifier loss function can then be expressed as: [formula not reproduced]
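One plausible reading of the Euclidean-distance loss is sketched below. Because the formulas are not reproduced in this text, three points are assumptions rather than the patent's definition: the label probability vector is taken to be the label vector divided by its number of positive labels, the predicted probability vector is taken to be a softmax over the fused scores, and the loss is averaged over the samples. All function names are hypothetical.

```python
import math


def softmax(scores):
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(score_batch, label_batch):
    """Mean squared Euclidean distance between predicted and target
    probability vectors (normalization choices are assumptions)."""
    loss = 0.0
    for scores, labels in zip(score_batch, label_batch):
        predicted = softmax(scores)                    # assumed: softmax of scores
        positives = sum(labels)                        # needs at least one label
        target = [y / positives for y in labels]       # assumed: L1-normalized labels
        loss += sum((p - t) ** 2 for p, t in zip(predicted, target))
    return loss / len(score_batch)


# Two classes, both present, equal scores: prediction matches the target exactly.
print(classification_loss([[0.0, 0.0]], [[1, 1]]))  # 0.0
```

Normalizing the labels turns a multi-label target into a distribution, so an image with two labels is matched best when the model splits its probability mass evenly between them.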
9. Three constraints on the spatial transformer network:
a) Anchor constraint: so that the model locates objects distributed over as many different image positions as possible and reduces redundancy, the invention proposes the anchor constraint, shown in Fig. 2. Suppose the model has K+1 time steps in total, so that K regions of interest can be located. The first region of interest (blue point) is not constrained; for the localization learning of the following K-1 regions of interest, the invention provides the anchors shown in the figure (red points) to guide the learning. The formula is as follows: [formula not reproduced]
Here the first term denotes the translation of the transformation matrix predicted at time step k, and the second term is its corresponding anchor.
b) Scale constraint: when predicting the scale transformation in the transformation matrix, to keep the predicted scale from becoming too large (an overly large scale means the region of interest tends toward the whole image), the invention proposes the scale constraint: if the predicted scale parameters $s_x$ and $s_y$ exceed the parameter $\alpha$ ($\alpha = 0.5$ in this embodiment), a penalty is applied. The formula is as follows: [formula not reproduced]
c) Positive constraint: if a scale parameter in the transformation matrix is negative, the cropped feature map is flipped horizontally or vertically. To avoid the resulting drop in recognition performance, the invention proposes the positive constraint: if the predicted scale parameters $s_x$ and $s_y$ are smaller than the parameter $\beta$ ($\beta = 0.1$ in this embodiment), a penalty is applied. The formula is as follows:
$l_P = \max(0, \beta - s_x) + \max(0, \beta - s_y)$
In summary, the localization loss function can be expressed as:
$L_{loc} = l_S + \lambda_1 l_A + \lambda_2 l_P$
where $\lambda_1$ and $\lambda_2$ are hyperparameters ($\lambda_1 = 0.01$, $\lambda_2 = 0.1$ in this embodiment).
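The three constraints and their weighted sum can be sketched as follows. Only the positive constraint and the weighting with the two hyperparameters come from the text; the hinge form of the scale penalty and the squared-distance form of the anchor penalty are assumptions chosen to match the verbal descriptions, and the function names are hypothetical.

```python
def positive_loss(sx, sy, beta=0.1):
    """Positive constraint as given in the text: penalize scales below beta."""
    return max(0.0, beta - sx) + max(0.0, beta - sy)


def scale_loss(sx, sy, alpha=0.5):
    """Assumed hinge form: penalize scales above alpha."""
    return max(0.0, sx - alpha) + max(0.0, sy - alpha)


def anchor_loss(tx, ty, anchor_x, anchor_y):
    """Assumed squared distance between predicted translation and anchor."""
    return 0.5 * ((tx - anchor_x) ** 2 + (ty - anchor_y) ** 2)


def localization_loss(sx, sy, tx, ty, anchor_x, anchor_y,
                      lam1=0.01, lam2=0.1):
    """Scale term plus weighted anchor and positive terms."""
    return (scale_loss(sx, sy)
            + lam1 * anchor_loss(tx, ty, anchor_x, anchor_y)
            + lam2 * positive_loss(sx, sy))


# A region with moderate scale that sits exactly on its anchor incurs no penalty.
print(localization_loss(0.3, 0.3, 0.5, 0.5, 0.5, 0.5))  # 0.0
```

Together the scale and positive terms keep each scale parameter between 0.1 and 0.5 in this embodiment, while the anchor term spreads the attended regions over the image.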
10. Total loss function of the model:
$L = L_{cls} + \gamma L_{loc}$
where $\gamma$ is a hyperparameter ($\gamma = 0.1$ in this embodiment).
Identical or similar reference numerals denote identical or similar parts;
The positional relationships described in the drawings are for illustration only and shall not be construed as limiting this patent;
Obviously, the above embodiments are merely examples given to illustrate the present invention clearly and do not limit its embodiments. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to list all embodiments exhaustively. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the claims of the present invention.

Claims (6)

1. A method for recognizing multi-label images by recurrently discovering regions of interest, characterized by comprising the following steps:
S1: Extract the feature representation of a sample using a convolutional neural network;
S2: Using the transformation matrix predicted at the previous time step, crop the attended region from the feature map obtained in step S1 by means of a spatial transformer network;
S3: Feed the region of interest into a long short-term memory (LSTM) unit, which generates the hidden state and memory state of the current time step from the input and from the hidden state and memory state of the previous time step;
S4: From the hidden state of the current time step, predict the classification score vector of the region of interest and the transformation matrix that the spatial transformer network needs at the next time step;
S5: Repeat steps S2-S4 until time step K, then fuse the score vectors predicted at time steps 2 to K to obtain the final classification result of the image.
2. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 1, characterized in that the transformation matrix in step S2 has the form $M = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix}$, where $(s_x, s_y)$ denotes the scale transformation, $(r_x, r_y)$ denotes the rotation transformation and is fixed at zero, and $(t_x, t_y)$ denotes the translation transformation, with values in the range $[-1, 1]$; because the parameters in the transformation matrix describe only scaling and translation, the spatial transformer network crops the corresponding region from every channel of the global feature map and resizes it to a fixed size for output.
3. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 2, characterized in that the spatial transformer network is implemented as follows:
S21: Given a target-matrix coordinate $(x_t, y_t)$, where $-1 \le x_t \le 1$ and $-1 \le y_t \le 1$, compute the corresponding source-matrix coordinate $(x_s, y_s)$, where $-1 \le x_s \le 1$ and $-1 \le y_s \le 1$, by $\begin{bmatrix} x_s \\ y_s \end{bmatrix} = \begin{bmatrix} s_x & r_x & t_x \\ r_y & s_y & t_y \end{bmatrix} \begin{bmatrix} x_t \\ y_t \\ 1 \end{bmatrix}$;
S22: Map the coordinate $(x_s, y_s)$, whose values lie in $[-1, 1]$, back to the coordinate $(X_s, Y_s)$ of the source matrix [formula not reproduced]; $(X_t, Y_t)$ is obtained from $(x_t, y_t)$ in the same way, where the source matrix $M_s$ and the target matrix $M_t$ have sizes $(H_s, W_s)$ and $(H_t, W_t)$ respectively;
S23: Compute the value at coordinate $(X_s, Y_s)$ by linear interpolation and use it as the value of the target matrix $M_t$ at coordinate $(X_t, Y_t)$.
4. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 3, characterized in that the classification network of step S4 consists of one fully connected layer whose input is the hidden state of the current time step and whose output is a vector of length C, where C is the number of classes in the dataset; the vector scores how strongly the current region of interest belongs to each class.
5. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 4, characterized in that the localization network of step S4 consists of one fully connected layer whose input is the hidden state of the current time step and whose output is a vector of length 4 representing $s_x$, $t_x$, $s_y$ and $t_y$, that is, only the scale and translation transformations; it is the prediction of the transformation matrix for the next region of interest.
6. The method for recognizing multi-label images by recurrently discovering regions of interest according to claim 5, characterized in that the fusion operation in step S5 works per class: the final score of each class is taken from the region of interest that scores highest for that class among all regions of interest. Specifically, denote the score vectors of time steps 2 to K by $\{s_2, s_3, \dots, s_K\}$, where $s_k = \{s_k^1, s_k^2, \dots, s_k^C\}$, and denote the fused final score vector by $s = \{s^1, s^2, \dots, s^C\}$; then $s^c = \max(s_2^c, s_3^c, \dots, s_K^c)$, $c = 1, 2, \dots, C$.
CN201710562354.9A 2017-07-11 2017-07-11 Method for recognizing multi-label images by recurrently discovering regions of interest Withdrawn CN107577983A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710562354.9A CN107577983A (en) 2017-07-11 2017-07-11 Method for recognizing multi-label images by recurrently discovering regions of interest


Publications (1)

Publication Number Publication Date
CN107577983A true CN107577983A (en) 2018-01-12

Family

ID=61049712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710562354.9A Withdrawn CN107577983A (en) 2017-07-11 2017-07-11 Method for recognizing multi-label images by recurrently discovering regions of interest

Country Status (1)

Country Link
CN (1) CN107577983A (en)



Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105740402A (en) * 2016-01-28 2016-07-06 百度在线网络技术(北京)有限公司 Method and device for acquiring semantic labels of digital images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ARTSIOM ABLAVATSKI, SHIJIAN LU, JIANFEI CAI: "Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition", 《IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION》 *
MAX JADERBERG,KAREN SIMONYAN,ANDREW ZISSERMAN, KORAY KAVUKCUOGLU: "Spatial Transformer Networks", 《IN CONFERENCE ON NEURAL INFORMATION PROCESSING SYSTEMS》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084356A (en) * 2018-01-26 2019-08-02 北京深鉴智能科技有限公司 A kind of deep neural network data processing method and device
CN110084356B (en) * 2018-01-26 2021-02-02 赛灵思电子科技(北京)有限公司 Deep neural network data processing method and device
CN108596206A (en) * 2018-03-21 2018-09-28 杭州电子科技大学 Texture image classification method based on multiple dimensioned multi-direction spatial coherence modeling
CN108415906A (en) * 2018-03-28 2018-08-17 中译语通科技股份有限公司 Based on field automatic identification chapter machine translation method, machine translation system
CN108415906B (en) * 2018-03-28 2021-08-17 中译语通科技股份有限公司 Automatic identification discourse machine translation method and machine translation system based on field
CN109086742A (en) * 2018-08-27 2018-12-25 Oppo广东移动通信有限公司 scene recognition method, scene recognition device and mobile terminal
CN110210572A (en) * 2019-06-10 2019-09-06 腾讯科技(深圳)有限公司 Image classification method, device, storage medium and equipment
CN110210572B (en) * 2019-06-10 2023-02-07 腾讯科技(深圳)有限公司 Image classification method, device, storage medium and equipment
CN112308115A (en) * 2020-09-25 2021-02-02 安徽工业大学 Multi-label image deep learning classification method and equipment
CN112308115B (en) * 2020-09-25 2023-05-26 安徽工业大学 Multi-label image deep learning classification method and equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20180112