CN109146058A - Convolutional neural network with transform-invariant capability and consistent expression - Google Patents

Convolutional neural network with transform-invariant capability and consistent expression

Info

Publication number
CN109146058A
CN109146058A (application CN201810861718.8A)
Authority
CN
China
Prior art keywords
neural networks
convolutional neural
expression
picture
consistency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810861718.8A
Other languages
Chinese (zh)
Other versions
CN109146058B (en)
Inventor
田新梅
何岸峰
沈旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN201810861718.8A priority Critical patent/CN109146058B/en
Publication of CN109146058A publication Critical patent/CN109146058A/en
Application granted granted Critical
Publication of CN109146058B publication Critical patent/CN109146058B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Abstract

The invention discloses a convolutional neural network with transform-invariant capability and consistent expression. Only an invariance loss function needs to be introduced during training to make the trained model more robust to transformed pictures. At the same time, the method enables the model to learn transform-invariant expressions, whereas traditional methods only learn a mapping from transformed pictures to fixed labels; the method can therefore be transferred to other deep-learning problems more readily. In addition, the method embeds the transform-invariant capability into the weight parameters of the network, genuinely improving the transform invariance of the convolutional neural network: no new parameters are introduced into the model, no extra processing of the pictures is required, and the existing network structure does not need to be changed at test time.

Description

Convolutional neural network with transform-invariant capability and consistent expression
Technical field
The present invention relates to technical fields such as image classification and image retrieval, and in particular to a convolutional neural network with transform-invariant capability and consistent expression.
Background technique
In recent years, with the rapid development of the Internet, we can access massive numbers of pictures and videos. For these massive pictures, accurate recognition and retrieval are the basis of all picture-related applications. In the past, limited by relatively insufficient computing power, only some relatively low-level feature-extraction algorithms could be used, and such algorithms cannot accurately express the high-level semantic information of a picture. With the improvement of computing power, deep-learning technology has made a series of breakthroughs in related fields such as image recognition and image retrieval. In applications such as picture recognition and retrieval, deep learning mainly uses convolutional neural networks. Through convolution and pooling operations, the model can extract feature expressions layer by layer, from local to global. Compared with traditional methods, the accurate expression of high-level semantics by this technology makes its recognition performance greatly surpass traditional algorithms.
However, existing convolutional neural networks are not particularly robust to pictures that have undergone various spatial transforms. By visualizing the outputs of the middle layers of the network, it can be seen that after the input picture is rotated, scaled, or translated, the feature expressions at all levels differ considerably, and the recognition accuracy therefore also drops sharply.
Existing methods mainly address this problem from three angles. The first method enhances (augments) the data set during training, so that the model learns adequately on pictures under various transforms; the resulting increase in sample diversity immediately improves the robustness of the model on transformed pictures. The second method feeds variously transformed pictures into a multi-channel structure, performs a max-pooling operation over the feature-map outputs of the channels, and takes the max-pooled feature map as the feature expression of the picture. The third method learns the transform of a picture through an additional neural network, inversely transforms the picture back to a more standard pose according to this learned transform, and then classifies the picture in this standard pose, so that the picture-recognition performance is likewise improved.
However, the three methods above either increase training time or introduce additional parameters and operations, increasing the computational complexity at recognition time. Meanwhile, if robustness to transforms is added by modifying the structure, the existing network structure also has to be modified when the network is applied, which is unfavorable for model transplantation.
Summary of the invention
The object of the present invention is to provide a convolutional neural network with transform-invariant capability and consistent expression, so that the invariance of the feature expressions inside the network is effectively improved and the network becomes more robust when recognizing pictures.
The object of the present invention is achieved through the following technical solutions:
A convolutional neural network with transform-invariant capability and consistent expression, comprising:
a training stage, in which consistency loss functions are introduced into a convolutional neural network comprising convolutional layers, fully connected layers, and a Softmax layer, so that the trained convolutional neural network learns transform-invariant expressions;
wherein a consistency loss function introduced at a convolutional layer pushes the network to learn consistent expression of characteristic information, a consistency loss function introduced at a fully connected layer pushes the network to learn consistent expression of semantic information, and a consistency loss function introduced at the Softmax layer pushes the network to learn consistent expression of classification information.
It can be seen from the above technical solutions provided by the present invention that, by successively introducing expression-consistency optimization objectives at the feature level, the semantic level, and the classification-label level, the expressions of the convolutional neural network model at these three levels are made robust to transforms.
Detailed description of the invention
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a convolutional neural network provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of pictures before and after basic transforms, provided by an embodiment of the present invention;
Fig. 3 is a framework diagram of the convolutional neural network with transform-invariant capability and consistent expression provided by an embodiment of the present invention;
Fig. 4 is a schematic comparison of the RC-CNN provided by an embodiment of the present invention with the original model and a data-augmentation scheme.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a convolutional neural network with transform-invariant capability and consistent expression (RC-CNN). Before introducing the RC-CNN, existing convolutional neural networks (CNNs) and the basic transforms of an image are introduced first.
1. Convolutional neural networks
Convolutional neural networks (CNNs) are multi-level deep neural networks. In each layer, different convolution kernels are learned as feature-extraction operators; these kernels are then convolved with the feature maps of the previous layer to obtain the feature maps of the current layer. The feature maps of the lower levels mainly learn relatively low-level characteristic information, such as edges and corner points. As the levels gradually deepen, the information expressed by the feature maps of each layer becomes gradually more abstract; the feature expressions in different layers thus represent the characteristic information of the picture at all levels. Weight sharing and spatial pooling operations give convolutional neural networks invariance to some small local spatial transforms, while also reducing the number of parameters of the model. In a convolutional neural network, the operation of a convolutional layer can be expressed by the following formula:
X_i^j = f(W_i^j * X_{i-1} + b_i^j);
wherein * denotes the convolution operator, X_{i-1} is the feature map of the (i-1)-th layer, W_i^j is the j-th convolution kernel of the i-th layer, and b_i^j is the bias of the j-th feature expression of the i-th layer; W_i^j and b_i^j are learned by a gradient-descent algorithm. f(·) is a non-linear function, such as the ReLU, Sigmoid, or Tanh function.
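As a concrete illustration, the convolutional-layer operation X_i^j = f(W_i^j * X_{i-1} + b_i^j) can be sketched as follows. This is a minimal NumPy sketch, not the patent's implementation: like most deep-learning frameworks it computes a cross-correlation rather than a kernel-flipped convolution, and the "valid" (no-padding) output size is an assumption, since the patent does not specify padding.

```python
import numpy as np

def relu(x):
    """The non-linear function f(.) from the text (ReLU variant)."""
    return np.maximum(x, 0.0)

def conv_layer(feature_map, kernels, biases):
    """One convolutional layer: X_i^j = f(W_i^j * X_{i-1} + b_i^j).

    feature_map : (H, W) previous-layer feature map X_{i-1}
    kernels     : (J, kH, kW) the J convolution kernels W_i^j of layer i
    biases      : (J,) bias b_i^j for each output feature map
    Returns the J 'valid'-mode output feature maps of layer i.
    """
    H, W = feature_map.shape
    J, kH, kW = kernels.shape
    out = np.zeros((J, H - kH + 1, W - kW + 1))
    for j in range(J):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                patch = feature_map[y:y + kH, x:x + kW]
                out[j, y, x] = np.sum(patch * kernels[j]) + biases[j]
    return relu(out)
```

Real networks vectorize this loop and add channel dimensions; the triple loop is kept here only to mirror the formula term by term.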
The operation of a fully connected layer is essentially the same as that of convolution, except that the convolution operator * is replaced by the matrix-multiplication operator ×, as in the following formula:
X_i = f(W_i × X_{i-1} + b_i).
Fig. 1 shows a schematic diagram of a convolutional neural network (CNN); it comprises convolutional layers (C1~C5), fully connected layers (FC6~FC8), and a Softmax layer.
The convolution operations extract features of the input picture from low level to high level. The fully connected layers further abstract the feature-level expression of the picture into an expression at a higher semantic level. The output of the last fully connected layer, FC8, is usually followed by a Softmax layer, whose output is the confidence predicted by the network for each class.
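The Softmax output described above (one confidence per class, positive and summing to 1) can be sketched as follows; this is a generic NumPy illustration, not code from the patent.

```python
import numpy as np

def softmax(logits):
    """Convert FC8 outputs (one score per class) into per-class
    confidences that are positive and sum to 1."""
    z = logits - np.max(logits)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)
```

The predicted class is then simply the index of the largest confidence.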
2. Basic transforms of an image
In the embodiment of the present invention, the basic transforms of the targeted image are mainly some basic spatial transforms, including rotation, translation, and scaling. Suppose the coordinates of the original image are (x, y) and the transformed picture coordinates are (x', y'). Then the transform of the picture can be realized by the following formula:
(x', y', 1) = (x, y, 1) × T;
wherein T is the transform matrix of the picture.
The rotation transform matrix T_R is as follows:
T_R = [  cos θ   sin θ   0 ]
      [ -sin θ   cos θ   0 ]
      [    0       0     1 ]
wherein θ is the angle of rotation.
The translation transform matrix T_T is as follows:
T_T = [  1    0   0 ]
      [  0    1   0 ]
      [ d_x  d_y  1 ]
wherein d_x and d_y are the numbers of pixels by which the picture is translated in the x and y directions, respectively.
The scaling transform matrix T_S is as follows:
T_S = [ s_x   0   0 ]
      [  0   s_y  0 ]
      [  0    0   1 ]
wherein s_x and s_y are the ratios by which the picture is scaled in the x and y directions, respectively.
The transform matrix T_RTS that applies all the transforms together can be obtained by multiplying the three matrices above:
T_RTS = T_R × T_T × T_S.
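The three transform matrices and their composition T_RTS = T_R × T_T × T_S can be sketched in NumPy under the row-vector convention (x', y', 1) = (x, y, 1) × T used above. The matrix entries follow the standard homogeneous-coordinate forms, since the original formula images are not reproduced in this text.

```python
import numpy as np

def t_rotate(theta):
    """Rotation matrix T_R, row-vector convention (x', y', 1) = (x, y, 1) @ T."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c,  s, 0.0],
                     [-s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def t_translate(dx, dy):
    """Translation matrix T_T: shift by d_x and d_y pixels."""
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [ dx,  dy, 1.0]])

def t_scale(sx, sy):
    """Scaling matrix T_S: scale by s_x and s_y along each axis."""
    return np.diag([sx, sy, 1.0])

def transform_point(x, y, T):
    """Apply (x', y', 1) = (x, y, 1) @ T and return (x', y')."""
    xp, yp, _ = np.array([x, y, 1.0]) @ T
    return xp, yp

# Composite transform: rotate by 90 degrees, translate by (3, 0), scale by 2.
T_rts = t_rotate(np.pi / 2) @ t_translate(3.0, 0.0) @ t_scale(2.0, 2.0)
```

With the row-vector convention, composition reads left to right: the point is first rotated, then translated, then scaled.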
Fig. 2 shows examples of pictures before and after the basic transforms: the ORI column shows original pictures; the R column, rotated pictures; the T column, translated pictures; the S column, scaled pictures; and the RTS column, pictures to which all three transforms have been applied simultaneously.
Although convolutional neural networks are invariant to some small local spatial transforms, once a picture undergoes a global and relatively large transform, a convolutional neural network is no longer robust. Therefore, the embodiment of the present invention provides a convolutional neural network with transform-invariant capability (i.e., transformed pictures can still be recognized accurately, enabling subsequent classification and retrieval operations) and consistent expression: only an invariance loss function needs to be introduced during training to make the trained model more robust to transformed pictures. At the same time, the method enables the model to learn transform-invariant expressions, whereas traditional methods only learn a mapping from transformed pictures to fixed labels, so the method can be transferred to other deep-learning problems more readily. In addition, by introducing consistency loss functions, the method embeds the transform-invariant capability into the weight parameters of the network, genuinely improving the transform invariance of the convolutional neural network: no new parameters are introduced into the model, no extra processing of the pictures is required, and the existing network structure does not need to be changed at test time.
Fig. 3 is a framework diagram of the convolutional neural network with transform-invariant capability and consistent expression. In the training stage, consistency loss functions are introduced into a convolutional neural network comprising convolutional layers, fully connected layers, and a Softmax layer, so that the trained convolutional neural network learns transform-invariant expressions;
wherein a consistency loss function introduced at a convolutional layer pushes the network to learn consistent expression of characteristic information; a consistency loss function introduced at a fully connected layer pushes the network to learn consistent expression of semantic information, so that the expressions of semantic information are kept as consistent as possible; and a consistency loss function introduced at the Softmax layer pushes the network to learn consistent expression of classification information, so that the expressions of classification information are kept as consistent as possible.
Referring again to Fig. 3: in the training stage, two random transforms T'(·) and T''(·) are applied to an input sample picture X, and the transformed pictures obtained are denoted X' and X''.
The consistency loss function of the i-th layer of the convolutional neural network is added between the feature expressions Fea_i(X') and Fea_i(X'') of pictures X' and X'' at the i-th layer, expressed as:
L_i = ||Fea_i(X') - Fea_i(X'')||²;
in the above formula, L_i denotes the consistency loss function of the i-th layer.
The loss function of the entire convolutional neural network is expressed as:
L_All = λ_Cls × (L'_Cls + L''_Cls) + Σ_i λ_i × L_i;
wherein the coefficient λ_i weighs the consistency loss function L_i of the i-th layer, L'_Cls and L''_Cls are the classification losses corresponding to pictures X' and X'' respectively, and the coefficient λ_Cls weighs the classification loss L_Cls of the sample picture X. Assuming there are N classes in total, L_Cls is the loss over the N outputs of the Softmax layer.
In the embodiment of the present invention, the i-th layer above refers to the i-th layer of the whole network, without distinguishing whether it is specifically a convolutional layer, a fully connected layer, or the Softmax layer.
In Fig. 3, T'(X) and T''(X) on the left denote the random transforms T'(·) and T''(·) applied to the sample picture X; the labels "L_Conv1, L_Conv2, …, L_FC8" on the series of upward arrows in the middle denote the loss functions added on the different layers (for example, L_Conv1 denotes the loss function on the first convolutional layer); L_Cls on the far right denotes the classification loss function; and "Ground truth of X" at the bottom denotes the true class of the sample picture X.
After training is completed in the above way, a convolutional neural network with transform-invariant capability and consistent expression is obtained. In the test stage, the transformed test picture is fed directly into the network, and the classification result can be output.
Fig. 4 is a schematic comparison of the RC-CNN with the original model and data augmentation. In the figure, (a) shows the distribution of feature maps of original pictures in the original model; (b) shows the distribution of feature maps of transformed pictures in a model trained with data augmentation, where it can be seen that, even with data augmentation, some internal expressions are aliased together and not easy to separate; (c) shows the convolutional neural network with transform-invariant capability and consistent expression provided by the present invention, in which the expressions of the feature maps are pushed to be consistent, so that even transformed pictures can be distinguished more easily.
In order to compare the RC-CNN provided by the present invention with other current state-of-the-art methods, comparative experiments were carried out on two tasks: a large-scale picture-recognition task and a picture-retrieval task. The RC-CNN was compared with a traditional convolutional neural network, a data-augmented convolutional neural network, and models such as SI-CNN, TI-CNN, and ST-CNN.
For the large-scale picture-recognition problem, we use the ILSVRC-2012 data, a subset of ImageNet in which the pictures are divided into 1000 classes according to their content. The training set has 1.3M pictures in total, the validation set has 50,000 pictures, and the test set has 100,000 pictures. Recognition accuracy is generally judged by two indices: top-1 accuracy and top-5 accuracy. Top-1 is the probability that the prediction with the highest confidence agrees with the actual class; top-5 is the probability that the actual class is among the five predictions with the highest confidence. The comparative experimental results are shown in Tables 1 and 2.
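The two accuracy indices can be sketched as follows; `topk_accuracy` is a hypothetical helper for illustration, assuming `scores` holds one confidence vector per picture and `labels` the true class indices.

```python
import numpy as np

def topk_accuracy(scores, labels, k):
    """Fraction of pictures whose true class is among the k predictions
    with the highest confidence (k=1 gives top-1, k=5 gives top-5)."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # indices of the k highest-confidence classes for each picture
    topk = np.argsort(scores, axis=1)[:, ::-1][:, :k]
    hits = [labels[i] in topk[i] for i in range(len(labels))]
    return float(np.mean(hits))
```

Top-5 is always at least as high as top-1, since the top-1 prediction is contained in the top-5 set.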
Table 1: Results (top-1/top-5) on the transformed ILSVRC-2012 data set
In the above comparative experiments, the consistency loss function was added at the label level only (RC-CNN (Cls)), at the feature-expression level plus the label level (RC-CNN (Conv+Cls)), at the semantic level plus the label level (RC-CNN (FC+Cls)), and at all levels (RC-CNN (Conv+FC+Cls)). It can be seen that adding the consistency loss at all levels achieves the best result overall.
Table 2: Results (top-1/top-5) on the original ILSVRC-2012 data set
From the above results it can be seen that, compared with the other current best results, the RC-CNN effectively improves the invariance of the convolutional neural network to transforms. Meanwhile, the results of the RC-CNN on the original-picture data set are not reduced but rather improved to a certain extent, showing that the RC-CNN is not merely over-fitting the prediction from transformed pictures to true labels.
For the picture-retrieval problem, the UK-Bench data set was used, a data set dedicated to picture retrieval. It contains 2550 groups of pictures, each group having 4 pictures that are different views of the same article or scene. The task on this data set is to use any one picture in the data set to retrieve the other three pictures of the same group from the entire data set. To verify the effect of the RC-CNN on large-scale data, an additional 1 million pictures from the MIRFlickr data set were brought in as negative samples. The models pre-trained on the picture-classification task above were used for these data, without re-training or fine-tuning. All pictures in the data set were fed into the model and their L2-normalized feature expressions extracted. Then the Euclidean distances between one picture's feature expression and the feature expressions of all pictures in the data set were computed and sorted in ascending order, and the 4 nearest pictures were used to compute the NS-Score. The NS-Score represents the average accuracy over the four closest pictures: for example, if all four pictures come from the correct group, the query obtains a score of 4.0. The experimental results are shown in Table 3.
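The retrieval protocol just described can be sketched as follows; this is a minimal NumPy illustration, with the convention (standard for UK-Bench) that the 4 nearest neighbours include the query itself, so a perfect score is 4.0.

```python
import numpy as np

def ns_score(features, group_ids):
    """NS-Score on a UK-Bench-style set: for each query picture, retrieve
    the 4 nearest pictures (query included) by Euclidean distance on
    L2-normalised features, count how many share the query's group, and
    average that count over all queries (maximum 4.0)."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    group_ids = np.asarray(group_ids)
    scores = []
    for q in range(len(feats)):
        d = np.linalg.norm(feats - feats[q], axis=1)
        top4 = np.argsort(d)[:4]          # the 4 closest, query included
        scores.append(np.sum(group_ids[top4] == group_ids[q]))
    return float(np.mean(scores))
```

At the scale of UK-Bench plus 1M distractors, the exhaustive distance loop here would be replaced by a batched or approximate nearest-neighbour search, but the score definition is unchanged.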
Table 3: Results on the UK-Bench data set
From the results on the image-retrieval data set, it can be seen that the RC-CNN obtains obvious improvements on different tasks, showing that the present invention has a certain transferable ability.
The main idea of the above solution provided by the embodiment of the present invention is to make the network robust to transforms by introducing consistency optimization objectives at three levels during training. With this optimization method, after the input picture undergoes a certain transform, it can be clearly seen that the invariance of the feature expressions inside the network is effectively improved, so that the network is more robust when recognizing pictures.
Through the description of the above embodiments, those skilled in the art can clearly understand that the above embodiments can be realized by software, or by software plus a necessary general hardware platform. Based on this understanding, the technical solutions of the above embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash disk, or mobile hard disk) and includes instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can easily be thought of by anyone skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (3)

1. one kind, which has, converts constant ability and the consistent convolutional neural networks of expression characterized by comprising
Training stage, in the damage comprising introducing consistency in convolutional layer, full articulamentum and Softmax layers of convolutional neural networks Function is lost, so that the convolutional neural networks after training learn to the expression way constant to transformation;
Wherein, introducing consistency loss function in convolutional layer pushes the network to learn the expression of consistency on characteristic information, Full articulamentum introduces consistency loss function to push network to learn the expression of consistency in semantic information, at Softmax layers Consistency loss function is introduced to push network to learn the expression of consistency in classification information.
2. one kind according to claim 1, which has, converts constant ability and the consistent convolutional neural networks of expression, feature It is,
In the training stage, the samples pictures X of the input stochastic transformation T ' () and T " () for carrying out two ways is obtained Transformed picture is denoted as X ' and X ";
I-th layer of consistency loss function in convolutional neural networks is added in the feature representation Fea of picture X ' and X " at i-th layeri (X ') and FeaiBetween (X "), indicate are as follows:
In above formula, LiIndicate i-th layer of consistency loss function.
3. one kind according to claim 2, which has, converts constant ability and the consistent convolutional neural networks of expression, feature It is, the loss function of entire convolutional neural networks indicates are as follows:
LAllCls×(L′Cls+L″Cls)+∑λi×Li
Wherein, coefficient lambdaiFor weighing i-th layer of consistency loss function Li, L 'ClsWith L "ClsCorrespond respectively to picture X ' and X " Classification Loss, coefficient lambdaClsFor weighing the Classification Loss L of samples pictures XCls
CN201810861718.8A 2018-07-27 2018-07-27 Convolutional neural network with transform invariant capability and consistent expression Active CN109146058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810861718.8A CN109146058B (en) 2018-07-27 2018-07-27 Convolutional neural network with transform invariant capability and consistent expression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810861718.8A CN109146058B (en) 2018-07-27 2018-07-27 Convolutional neural network with transform invariant capability and consistent expression

Publications (2)

Publication Number Publication Date
CN109146058A true CN109146058A (en) 2019-01-04
CN109146058B CN109146058B (en) 2022-03-01

Family

ID=64799291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810861718.8A Active CN109146058B (en) 2018-07-27 2018-07-27 Convolutional neural network with transform invariant capability and consistent expression

Country Status (1)

Country Link
CN (1) CN109146058B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633790A (en) * 2019-09-19 2019-12-31 郑州大学 Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203420A (en) * 2016-07-26 2016-12-07 浙江捷尚视觉科技股份有限公司 A kind of bayonet vehicle color identification method
CN106897714A (en) * 2017-03-23 2017-06-27 北京大学深圳研究生院 A kind of video actions detection method based on convolutional neural networks
CN107145900A (en) * 2017-04-24 2017-09-08 清华大学 Pedestrian based on consistency constraint feature learning recognition methods again
WO2017214968A1 (en) * 2016-06-17 2017-12-21 Nokia Technologies Oy Method and apparatus for convolutional neural networks
US9971940B1 (en) * 2015-08-10 2018-05-15 Google Llc Automatic learning of a video matching system
CN108257115A (en) * 2018-04-13 2018-07-06 中山大学 Image enhancement detection method and system based on orientation consistency convolutional neural networks
CN108280411A (en) * 2018-01-10 2018-07-13 上海交通大学 A kind of pedestrian's searching method with spatial alternation ability

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9971940B1 (en) * 2015-08-10 2018-05-15 Google Llc Automatic learning of a video matching system
WO2017214968A1 (en) * 2016-06-17 2017-12-21 Nokia Technologies Oy Method and apparatus for convolutional neural networks
CN106203420A (en) * 2016-07-26 2016-12-07 浙江捷尚视觉科技股份有限公司 A kind of bayonet vehicle color identification method
CN106897714A (en) * 2017-03-23 2017-06-27 北京大学深圳研究生院 A kind of video actions detection method based on convolutional neural networks
CN107145900A (en) * 2017-04-24 2017-09-08 清华大学 Pedestrian based on consistency constraint feature learning recognition methods again
CN108280411A (en) * 2018-01-10 2018-07-13 上海交通大学 A kind of pedestrian's searching method with spatial alternation ability
CN108257115A (en) * 2018-04-13 2018-07-06 中山大学 Image enhancement detection method and system based on orientation consistency convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XU SHEN et al.: "Transform-Invariant Convolutional Neural Networks for Image Classification and Search", ACM *
LU Guanming et al.: "A convolutional neural network for facial expression recognition", Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition) *
LI Jieying: "A vehicle consistency discrimination method based on Siamese convolutional neural networks", China Transportation Informatization *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633790A (en) * 2019-09-19 2019-12-31 郑州大学 Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network
CN110633790B (en) * 2019-09-19 2022-04-08 郑州大学 Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network

Also Published As

Publication number Publication date
CN109146058B (en) 2022-03-01

Similar Documents

Publication Publication Date Title
CN108804530B (en) Subtitling areas of an image
CN107251059A (en) Sparse reasoning module for deep learning
Wu et al. Application of image retrieval based on convolutional neural networks and Hu invariant moment algorithm in computer telecommunications
CN106203483B (en) A kind of zero sample image classification method based on semantic related multi-modal mapping method
Ahmad et al. Data augmentation-assisted deep learning of hand-drawn partially colored sketches for visual search
CN105718940B (en) The zero sample image classification method based on factorial analysis between multiple groups
Magassouba et al. Understanding natural language instructions for fetching daily objects using gan-based multimodal target–source classification
CN109271539A (en) A kind of image automatic annotation method and device based on deep learning
CN110349229A (en) A kind of Image Description Methods and device
Rad et al. Image annotation using multi-view non-negative matrix factorization with different number of basis vectors
CN108154156B (en) Image set classification method and device based on neural topic model
CN113051914A (en) Enterprise hidden label extraction method and device based on multi-feature dynamic portrait
CN104504406B (en) A kind of approximate multiimage matching process rapidly and efficiently
CN111985520A (en) Multi-mode classification method based on graph convolution neural network
Liu et al. Learning a representative and discriminative part model with deep convolutional features for scene recognition
Zhou et al. Sampling-attention deep learning network with transfer learning for large-scale urban point cloud semantic segmentation
CN103617609A (en) A k-means nonlinear manifold clustering and representative point selecting method based on a graph theory
Menaga et al. Deep learning: a recent computing platform for multimedia information retrieval
CN112651940A (en) Collaborative visual saliency detection method based on dual-encoder generation type countermeasure network
Wang et al. A deep clustering via automatic feature embedded learning for human activity recognition
Tavakoli Seq2image: Sequence analysis using visualization and deep convolutional neural network
Pei et al. Unsupervised multimodal feature learning for semantic image segmentation
CN114329031A (en) Fine-grained bird image retrieval method based on graph neural network and deep hash
Wang et al. Multiscale convolutional neural networks with attention for plant species recognition
CN104778272B (en) A kind of picture position method of estimation excavated based on region with space encoding

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant