CN109146058A - Convolutional neural network with transform-invariant capability and consistent expression - Google Patents
- Publication number: CN109146058A (application CN201810861718.8A)
- Authority: CN (China)
- Prior art keywords: neural networks, convolutional, expression, picture, consistency
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
Abstract
The invention discloses a convolutional neural network with transform-invariant capability and consistent expression. Only an invariance loss function needs to be introduced during training to make the trained model more robust to transformed pictures. At the same time, the method enables the model to learn transform-invariant ways of expression, whereas traditional methods only learn a mapping from transformed pictures to fixed labels; the method therefore transfers better to other deep-learning problems. In addition, the method embeds the transform-invariant capability into the weight parameters of the network, genuinely improving the transform invariance of the convolutional neural network: no new parameters are introduced into the model, no extra processing of pictures is required, and the existing network structure does not need to be changed at test time.
Description
Technical field
The present invention relates to technical fields such as image classification and image retrieval, and in particular to a convolutional neural network with transform-invariant capability and consistent expression.
Background art
In recent years, with the rapid development of the Internet, we can access massive numbers of pictures and videos. For these massive pictures, accurate recognition and retrieval form the basis of all picture-related applications. In the past, limited by relatively insufficient computing power, only some relatively low-level feature extraction algorithms could be used, and these algorithms could not accurately express the high-level semantic information of pictures. With the improvement of computing power, deep learning technology has brought a series of breakthroughs in related fields such as image recognition and image retrieval. In applications such as picture recognition and retrieval, deep learning mainly uses convolutional neural networks. Through operations such as convolution and pooling, the model can extract feature representations layer by layer, from local to global. Compared with traditional methods, the accurate expression of high-level semantics enables this technology to greatly surpass traditional algorithms in recognition performance.
However, existing convolutional neural networks are not particularly robust to pictures that have undergone various spatial transformations. By visualizing the outputs of intermediate network layers, we can see that after the input picture is rotated, scaled, or translated, the feature representations at all levels differ greatly, and recognition accuracy therefore drops dramatically.
Existing methods mainly address this problem from three angles. The first method mainly enhances the data set during training, so that the model is adequately trained on pictures under various transformations; such processing increases sample diversity, so the robustness of the model on transformed pictures is improved accordingly. The second method feeds the variously transformed pictures into a multi-channel structure and applies a max-pooling operation over the feature-map outputs of the channels, taking the max-pooled feature map as the feature representation of the picture. The third method learns the transformation of the picture through an additional neural network and, according to this transformation, inversely transforms the picture back to a more standard pose, then classifies the picture in this standard pose, so that the picture recognition effect is likewise improved.
However, the three methods above either increase training time or introduce additional parameters and operations, increasing the computational complexity at recognition time. Meanwhile, if the robustness of the network to transformations is increased by modifying its structure, the existing network structure must also be modified when the network is applied, which is unfavorable for model portability.
Summary of the invention
The object of the present invention is to provide a convolutional neural network with transform-invariant capability and consistent expression, so that the invariance of the feature representations inside the network is effectively improved and the network becomes more robust when recognizing pictures.
The object of the present invention is achieved through the following technical solutions:
A convolutional neural network with transform-invariant capability and consistent expression, comprising:
in the training stage, a consistency loss function is introduced into a convolutional neural network comprising convolutional layers, fully connected layers, and a Softmax layer, so that the trained convolutional neural network learns transform-invariant ways of expression;

wherein the consistency loss function introduced at the convolutional layers pushes the network to learn consistent expression of feature information, the consistency loss function introduced at the fully connected layers pushes the network to learn consistent expression of semantic information, and the consistency loss function introduced at the Softmax layer pushes the network to learn consistent expression of classification information.
As can be seen from the above technical solution provided by the invention, by introducing expression-consistency optimization objectives at the feature level, the semantic level, and the classification-label level layer by layer, the expressions of the convolutional neural network model at these three levels become robust to transformations.
Brief description of the drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a convolutional neural network provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of pictures before and after basic transformations, provided by an embodiment of the present invention;

Fig. 3 is a framework diagram of the convolutional neural network with transform-invariant capability and consistent expression, provided by an embodiment of the present invention;

Fig. 4 is a comparison diagram of the RC-CNN provided by an embodiment of the present invention against the original model and a data-augmentation scheme.
Specific embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a convolutional neural network with transform-invariant capability and consistent expression (RC-CNN). Before introducing RC-CNN, conventional convolutional neural networks (CNNs) and the basic transformations of images are first introduced.
1. Convolutional neural networks

Convolutional neural networks (CNNs) are multi-layer deep neural networks. In each layer, different convolution kernels are learned as feature extraction operators; these kernels are then convolved with the feature maps of the previous layer to obtain the feature maps of the current layer. The feature maps of lower layers mainly learn relatively low-level feature information, such as edges and corners. As the layers gradually deepen, the information expressed by the feature maps of each layer becomes gradually more abstract, so the feature representations in different layers represent the feature information of the picture at all levels. Weight sharing and spatial pooling operations give convolutional neural networks invariance to some small local spatial transformations, and also reduce the number of model parameters. In a convolutional neural network, the operation of a convolutional layer can be expressed by the following formula:

X_i^j = f(W_i^j * X_{i-1} + b_i^j);

where * denotes the convolution operation, X_{i-1} is the feature map of the (i-1)-th layer, W_i^j is the j-th convolution kernel of the i-th layer, and b_i^j is the bias of the j-th feature representation of the i-th layer; W_i^j and b_i^j are learned by the gradient descent algorithm. f(·) is a nonlinear function, such as the ReLU, Sigmoid, or Tanh function.
The operation of a fully connected layer is essentially the same as that of convolution, except that the convolution symbol * is replaced by the matrix multiplication symbol ×, as in the following formula:

X_i = f(W_i × X_{i-1} + b_i);
Fig. 1 shows a schematic diagram of a convolutional neural network (CNN); it includes convolutional layers (C1~C5), fully connected layers (FC6~FC8), and a Softmax layer.

The convolution operations extract features of the input picture from low layers to high layers. The fully connected layers further abstract the feature-level representation of the picture into a representation at a higher semantic level. The output of the last fully connected layer, FC8, is usually followed by a Softmax layer, whose output is the confidence the network predicts for each class.
2. Basic transformations of images

In the embodiment of the present invention, the basic transformations of images considered are mainly some basic spatial transformations, including rotation, translation, and scaling. Suppose the coordinates of the original image are (x, y) and the transformed picture coordinates are (x', y'). Then the transformation of the picture can be realized by the following formula:

(x', y', 1) = (x, y, 1) × T;

where T is the transformation matrix of the picture.

The rotation transformation matrix T_R is as follows:

T_R = [[cos θ, sin θ, 0], [-sin θ, cos θ, 0], [0, 0, 1]];

where θ is the angle of rotation.

The translation transformation matrix T_T is as follows:

T_T = [[1, 0, 0], [0, 1, 0], [d_x, d_y, 1]];

where d_x and d_y are the numbers of pixels by which the picture is translated in the x and y directions, respectively.

The scaling transformation matrix T_S is as follows:

T_S = [[s_x, 0, 0], [0, s_y, 0], [0, 0, 1]];

where s_x and s_y are the scaling ratios of the picture in the x and y directions, respectively.

The transformation matrix T_RTS combining all of the transformations is obtained by multiplying the three matrices above:

T_RTS = T_R × T_T × T_S.
Fig. 2 shows examples of pictures before and after the basic transformations: the ORI column contains the original pictures, the R column the rotated pictures, the T column the translated pictures, and the S column the scaled pictures; RTS means all three transformations are applied to the picture simultaneously.
Although convolutional neural networks are invariant to some small local spatial transformations, once a picture undergoes global and larger transformations, convolutional neural networks are no longer robust. Therefore, the embodiment of the present invention provides a convolutional neural network with transform-invariant capability (i.e., a transformed picture can still be recognized accurately, enabling subsequent classification and retrieval operations) and consistent expression. Only an invariance loss function needs to be introduced during training to make the trained model more robust to transformed pictures. At the same time, this method enables the model to learn transform-invariant ways of expression, whereas traditional methods only learn a mapping from transformed pictures to fixed labels, so this method transfers better to other deep-learning problems. In addition, by introducing the consistency loss function, the method embeds the transform-invariant capability into the weight parameters of the network, genuinely improving the transform invariance of the convolutional neural network: no new parameters are introduced into the model, no extra processing of pictures is required, and the existing network structure does not need to be changed at test time.
Fig. 3 is a framework diagram of the convolutional neural network with transform-invariant capability and consistent expression. In the training stage, consistency loss functions are introduced at the convolutional layers, fully connected layers, and Softmax layer of the convolutional neural network, so that the trained network learns transform-invariant ways of expression;

wherein the consistency loss function introduced at the convolutional layers pushes the network to learn consistent expression of feature information; the consistency loss function introduced at the fully connected layers pushes the network to learn consistent expression of semantic information, so that the network's expression of semantic information is as consistent as possible; and the consistency loss function introduced at the Softmax layer pushes the network to learn consistent expression of classification information, so that the expression of classification information is as consistent as possible.
Referring again to Fig. 3, in the training stage, two random transformations T'(·) and T''(·) are applied to the input sample picture X; the resulting transformed pictures are denoted X' and X''.

The consistency loss function of the i-th layer of the convolutional neural network is added between the feature representations Fea_i(X') and Fea_i(X'') of pictures X' and X'' at the i-th layer, expressed as:

L_i = ||Fea_i(X') - Fea_i(X'')||^2;

In the above formula, L_i denotes the consistency loss function of the i-th layer.
The loss function of the entire convolutional neural network is expressed as:

L_All = λ_Cls × (L'_Cls + L''_Cls) + Σ_i λ_i × L_i;

where the coefficient λ_i weights the consistency loss function L_i of the i-th layer, L'_Cls and L''_Cls correspond respectively to the classification losses of pictures X' and X'', and the coefficient λ_Cls weights the classification loss L_Cls of the sample picture X. Assuming there are N classes in total, L_Cls is the loss over the N outputs of the Softmax layer.

In the embodiment of the present invention, the i-th layer above refers to the i-th layer of the whole network, without distinguishing whether it is specifically a convolutional layer, a fully connected layer, or the Softmax layer.
In Fig. 3, T'(X) and T''(X) on the left side refer to applying the random transformations T'(·) and T''(·) to the sample picture X; the labels "L_Conv1, L_Conv2, ..., L_FC8" on the upward arrows in the middle part indicate the loss functions added at the different layers, e.g., L_Conv1 refers to the loss function on the first convolutional layer; L_Cls on the far right denotes the classification loss function; and "Ground truth of X" at the bottom denotes the true class of the sample picture X.
After training is completed in the above manner, a convolutional neural network with transform-invariant capability and consistent expression is obtained. In the test stage, the transformed test picture is fed directly into the network, and the classification result can be output.
Fig. 4 compares RC-CNN with the original model and with data augmentation. Panel (a) shows the distribution of the feature maps of the original pictures in the original model. Panel (b) shows the distribution of the feature maps of transformed pictures in a model trained with data augmentation; it can be seen that, even with data augmentation, some internal representations are aliased together and not easily separated. Panel (c) shows the convolutional neural network with transform-invariant capability and consistent expression provided by the invention: the expressions of the feature maps are pushed to be consistent, so that even transformed pictures can be distinguished more easily.
To compare the RC-CNN provided by the invention with other current state-of-the-art methods, comparative experiments were carried out on two tasks: a large-scale picture recognition task and a picture retrieval task. RC-CNN was compared with traditional convolutional neural networks, data-augmented convolutional neural networks, and models such as SI-CNN, TI-CNN, and ST-CNN.
For the large-scale picture recognition problem, we use the ILSVRC-2012 data, a subset of ImageNet in which the pictures are divided into 1000 classes according to their content. The training set has a total of 1.3M pictures, the validation set has 50,000 pictures, and the test set has 100,000 pictures. Recognition accuracy is generally judged by two indices: top-1 accuracy and top-5 accuracy. Top-1 represents the probability that the prediction with the highest confidence agrees with the actual class; top-5 represents the probability that the actual class is among the five predictions with the highest confidence. The comparative experimental results are shown in Tables 1 and 2.
Table 1: Results (top-1/top-5) on the transformed ILSVRC-2012 data set
In the above comparative experiments, the consistency loss function is added at the label level only (RC-CNN (Cls)), at the feature level plus label level (RC-CNN (Conv+Cls)), at the semantic level plus label level (RC-CNN (FC+Cls)), and at all levels (RC-CNN (Conv+FC+Cls)). It can be seen that adding the consistency loss at all levels achieves the best overall results.
Table 2: Results (top-1/top-5) on the original ILSVRC-2012 data set
From the above results, it can be seen that RC-CNN surpasses the other current best results and effectively improves the invariance of the convolutional neural network to transformations. Meanwhile, the results of RC-CNN on the original-picture data set not only do not decrease but even improve somewhat, showing that RC-CNN does not merely overfit the prediction from transformed pictures to their true labels.
For the picture retrieval problem, the UK-Bench data set, a data set used exclusively for picture retrieval, is used. It contains 2550 groups of pictures; each group has 4 pictures, which are different views of the same object or scene. The task on this data set is to use any one picture to retrieve the other three pictures of the same group from the entire data set. To verify the effect of RC-CNN on large-scale data, an additional 1,000,000 pictures from the MIRFlickr data set are brought in as negative samples. The model pre-trained on the picture classification task above is used for these data without retraining or fine-tuning. All pictures in the data set are fed into the model and their L2-normalized feature representations are extracted. Then the Euclidean distance between one feature representation and the feature representations of all pictures in the data set is computed and sorted in ascending order. The 4 nearest pictures are used to compute the NS-Score, which represents the average accuracy over the four closest pictures: for example, if all four pictures come from the correct group, the query picture obtains a score of 4.0. The experimental results are shown in Table 3.
Table 3: Results on the UK-Bench data set
It can be seen from the results on the image retrieval data set that RC-CNN obtains an obvious improvement across different tasks, showing that the present invention has a certain transferable ability.
The main idea of the above scheme provided by the embodiment of the present invention is to make the network robust to transformations by introducing consistency optimization objectives at three levels during training. With this optimization method, after the input picture undergoes a certain transformation, it can be clearly seen that the invariance of the feature representations inside the network is effectively improved, so the network is more robust when recognizing pictures.
Through the description of the above embodiments, those skilled in the art can clearly understand that the above embodiments can be realized by software, or by software plus a necessary general hardware platform. Based on this understanding, the technical solutions of the above embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (a CD-ROM, USB flash disk, mobile hard disk, etc.) and includes several instructions to make a computing device (a personal computer, server, network device, etc.) execute the methods described in the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by anyone skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (3)
1. A convolutional neural network with transform-invariant capability and consistent expression, characterized by comprising:

in a training stage, introducing a consistency loss function into a convolutional neural network comprising convolutional layers, fully connected layers, and a Softmax layer, so that the trained convolutional neural network learns transform-invariant ways of expression;

wherein the consistency loss function introduced at the convolutional layers pushes the network to learn consistent expression of feature information, the consistency loss function introduced at the fully connected layers pushes the network to learn consistent expression of semantic information, and the consistency loss function introduced at the Softmax layer pushes the network to learn consistent expression of classification information.
2. The convolutional neural network with transform-invariant capability and consistent expression according to claim 1, characterized in that:

in the training stage, two random transformations T'(·) and T''(·) are applied to the input sample picture X, and the resulting transformed pictures are denoted X' and X'';

the consistency loss function of the i-th layer of the convolutional neural network is added between the feature representations Fea_i(X') and Fea_i(X'') of pictures X' and X'' at the i-th layer, expressed as:

L_i = ||Fea_i(X') - Fea_i(X'')||^2;

In the above formula, L_i denotes the consistency loss function of the i-th layer.
3. The convolutional neural network with transform-invariant capability and consistent expression according to claim 2, characterized in that the loss function of the entire convolutional neural network is expressed as:

L_All = λ_Cls × (L'_Cls + L''_Cls) + Σ_i λ_i × L_i;

where the coefficient λ_i weights the consistency loss function L_i of the i-th layer, L'_Cls and L''_Cls correspond respectively to the classification losses of pictures X' and X'', and the coefficient λ_Cls weights the classification loss L_Cls of the sample picture X.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810861718.8A CN109146058B (en) | 2018-07-27 | 2018-07-27 | Convolutional neural network with transform invariant capability and consistent expression |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109146058A true CN109146058A (en) | 2019-01-04 |
CN109146058B CN109146058B (en) | 2022-03-01 |
Family
ID=64799291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810861718.8A Active CN109146058B (en) | 2018-07-27 | 2018-07-27 | Convolutional neural network with transform invariant capability and consistent expression |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109146058B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---|
CN110633790A (en) * | 2019-09-19 | 2019-12-31 | 郑州大学 | Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network |
CN110633790B (en) * | 2019-09-19 | 2022-04-08 | 郑州大学 | Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203420A (en) * | 2016-07-26 | 2016-12-07 | 浙江捷尚视觉科技股份有限公司 | A kind of bayonet vehicle color identification method |
CN106897714A (en) * | 2017-03-23 | 2017-06-27 | 北京大学深圳研究生院 | A kind of video actions detection method based on convolutional neural networks |
CN107145900A (en) * | 2017-04-24 | 2017-09-08 | 清华大学 | Pedestrian based on consistency constraint feature learning recognition methods again |
WO2017214968A1 (en) * | 2016-06-17 | 2017-12-21 | Nokia Technologies Oy | Method and apparatus for convolutional neural networks |
US9971940B1 (en) * | 2015-08-10 | 2018-05-15 | Google Llc | Automatic learning of a video matching system |
CN108257115A (en) * | 2018-04-13 | 2018-07-06 | 中山大学 | Image enhancement detection method and system based on orientation consistency convolutional neural networks |
CN108280411A (en) * | 2018-01-10 | 2018-07-13 | 上海交通大学 | A kind of pedestrian's searching method with spatial alternation ability |
Non-Patent Citations (3)
Title |
---|
XU SHEN et al.: "Transform-Invariant Convolutional Neural Networks for Image Classification and Search", 《ACM》 *
LU Guanming et al.: "A convolutional neural network for facial expression recognition", 《Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition)》 *
LI Jieying: "A vehicle consistency discrimination method based on Siamese convolutional neural networks", 《China Transportation Informatization》 *
Also Published As
Publication number | Publication date |
---|---|
CN109146058B (en) | 2022-03-01 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |