CN109146058B - Convolutional neural network with transform invariant capability and consistent expression - Google Patents

Convolutional neural network with transform invariant capability and consistent expression

Info

Publication number: CN109146058B
Application number: CN201810861718.8A
Authority: CN (China)
Prior art keywords: layer, convolutional neural, expression, neural network, picture
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN109146058A
Inventors: 田新梅, 何岸峰, 沈旭
Current Assignee: University of Science and Technology of China USTC
Original Assignee: University of Science and Technology of China USTC
Application filed by University of Science and Technology of China USTC
Priority/filing date: 2018-07-27
Publication of CN109146058A: 2019-01-04
Grant and publication of CN109146058B: 2022-03-01


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns

Abstract

The invention discloses a convolutional neural network with transformation-invariance capability and consistent expression: simply by introducing a consistency (invariance) loss function during training, the trained model becomes more robust to transformed pictures. The method also lets the model learn a transformation-invariant mode of expression, so that, compared with traditional methods that only learn a mapping from transformed pictures to fixed labels, it transfers better to other deep-learning problems. In addition, the method embeds the transformation-invariance capability into the weight parameters of the network, genuinely improving the transformation invariance of the convolutional neural network; it introduces no new parameters into the model, requires no extra processing of the pictures, and requires no change to the existing network structure at test time.

Description

Convolutional neural network with transform invariant capability and consistent expression
Technical Field
The invention relates to technical fields such as image classification and image retrieval, and in particular to a convolutional neural network with transformation invariance and consistent expression.
Background
In recent years, with the rapid development of the Internet, people are exposed to massive numbers of pictures and videos. For this huge volume of pictures, accurate recognition and retrieval are the basis of all picture-related applications. In the past, limited by insufficient computing power, only relatively low-level feature-extraction algorithms could be used, and these could not accurately express the high-level semantic information of pictures. With the growth of computing power, deep-learning techniques have made breakthrough progress in a series of related fields such as image recognition and picture retrieval. The convolutional neural network is the main model used in applications such as picture recognition and retrieval: operations such as convolution and pooling allow the model to extract feature expressions layer by layer, from local to global. The accurate expression of high-level semantics allows this technique to greatly surpass traditional algorithms in recognition performance.
However, existing convolutional neural networks are not particularly robust to pictures that have undergone various spatial transformations. Visualizing the outputs of the intermediate layers shows that after the input picture is rotated, scaled, or translated, the feature expressions of each layer differ considerably, and the recognition accuracy therefore drops rapidly.
Existing methods mainly address this problem from three perspectives. The first enhances the data set during training, so that the model learns from a wide variety of transformed pictures; the increased diversity of samples improves the model's robustness to transformed pictures. The second feeds various transformed pictures into a multi-channel structure, performs a maximum-pooling operation over the feature-map outputs of the channels, and uses the max-pooled feature map as the feature expression of the picture. The third learns the picture's transformation with an additional neural network, inversely transforms the picture back to a more standard pose according to that estimate, and then classifies the picture in the standard pose, thereby improving the recognition result.
However, all three methods either increase training time or introduce additional parameters and operations, raising the computational complexity at recognition time. Moreover, if robustness to transformations is gained by modifying the structure, the existing network structure must also be modified when the network is deployed, which hinders model portability.
Disclosure of Invention
The invention aims to provide a convolutional neural network with transformation-invariance capability and consistent expression, so that the invariance of the feature expressions inside the network is effectively improved and the network is more stable when recognizing pictures.
The purpose of the invention is realized by the following technical scheme:
a convolutional neural network with transform invariant capability and uniform expression, comprising:
in the training stage, a consistent loss function is introduced into a convolutional neural network comprising a convolutional layer, a full-link layer and a Softmax layer, so that the trained convolutional neural network learns an expression mode which is invariant to transformation;
the consistency loss function is introduced into the convolutional layer to promote the network to learn the expression of consistency on the characteristic information, the consistency loss function is introduced into the full connection layer to promote the network to learn the expression of consistency on the semantic information, and the consistency loss function is introduced into the Softmax layer to promote the network to learn the expression of consistency on the classification information.
According to the technical scheme provided by the invention, expression-consistency optimization targets are introduced at the feature level, the semantic level, and the classification-label level in turn, so that the convolutional neural network model becomes robust to transformations in the expressions at all three levels.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a convolutional neural network provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a picture before and after a basic transformation according to an embodiment of the present invention;
FIG. 3 is a block diagram of a convolutional neural network with transform invariant capability and consistent expression provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram comparing the RC-CNN with the original model and with a data-augmentation scheme according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a convolutional neural network with transformation-invariance capability and consistent expression (RC-CNN). Before introducing the RC-CNN, existing convolutional neural networks (CNNs) and the basic transformations of images are first introduced.
1. Convolutional neural network
Convolutional neural networks (CNNs) are a class of deep neural networks with multiple levels. In each layer, different convolution kernels are learned as feature-extraction operators, and these kernels are convolved with the feature maps of the previous layer to obtain the feature maps of the current layer. The feature maps of the lower layers mainly learn low-level feature information such as edges and corners; as the hierarchy deepens, the information expressed by each layer's feature maps becomes progressively more abstract. The feature expressions in different layers thus represent the picture's feature information at the corresponding levels. Weight sharing and spatial pooling make the convolutional neural network invariant to some small local spatial transformations, while also reducing the number of model parameters. In a convolutional neural network, the operation of a convolutional layer can be expressed by the following formula:
X_i^j = f(W_i^j ∗ X_{i−1} + b_i^j)

where ∗ denotes the convolution operation, X_{i−1} is the feature map of layer i−1, W_i^j is the j-th convolution kernel of the i-th layer, and b_i^j is the bias of the j-th feature expression of the i-th layer; W_i^j and b_i^j are learned through a gradient-descent algorithm. f(·) is a nonlinear function such as the ReLU, Sigmoid, or Tanh function.
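As a numerical illustration of this layer equation, the sketch below applies one convolution kernel, a bias, and a ReLU to a feature map (a minimal sketch, not code from the patent; the shapes and the use of PyTorch are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

# One step of the layer equation X_i^j = f(W_i^j * X_{i-1} + b_i^j)
# for a single kernel j; all shapes here are illustrative assumptions.
x_prev = torch.randn(1, 16, 32, 32)   # X_{i-1}: 16 feature maps of size 32x32
w_ij   = torch.randn(1, 16, 3, 3)     # W_i^j: one 3x3 kernel over 16 channels
b_ij   = torch.zeros(1)               # b_i^j: bias of the j-th feature map

x_ij = F.relu(F.conv2d(x_prev, w_ij, bias=b_ij, padding=1))  # f(.) = ReLU
print(x_ij.shape)  # torch.Size([1, 1, 32, 32]): the j-th feature map of layer i
```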
The operation of the fully-connected layer is essentially the same as that of convolution, except that the convolution symbol ∗ is replaced by the matrix-multiplication symbol ×, as follows:

X_i = f(W_i × X_{i−1} + b_i)
FIG. 1 is a schematic diagram of a convolutional neural network (CNN); it comprises convolutional layers (C1-C5), fully-connected layers (FC6-FC8), and a Softmax layer.
The convolution operations perform feature extraction on the input picture from lower to higher layers. The fully-connected layers further abstract the picture's feature-level expression into a higher semantic-level expression; the output of the last fully-connected layer, FC8, is usually followed by a Softmax layer, whose output is the network's confidence for each predicted category.
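The FIG. 1 layout can be sketched as follows (a minimal sketch assuming an AlexNet-style configuration with 224×224 RGB inputs and 1000 classes; the channel and kernel sizes are illustrative assumptions, not taken from the patent):

```python
import torch
import torch.nn as nn

# Sketch of the FIG. 1 layout: five conv layers (C1-C5), three
# fully-connected layers (FC6-FC8), then a Softmax layer.
class SimpleCNN(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(          # C1-C5
            nn.Conv2d(3, 64, 11, stride=4, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, 2),
            nn.Conv2d(64, 192, 5, padding=2), nn.ReLU(),
            nn.MaxPool2d(3, 2),
            nn.Conv2d(192, 384, 3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, 2),
        )
        self.classifier = nn.Sequential(        # FC6-FC8
            nn.Flatten(),
            nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Linear(4096, num_classes),       # FC8
        )

    def forward(self, x):
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)     # per-class confidences
```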
2. Fundamental transformation of images
In the embodiment of the present invention, the basic transformations of images considered are mainly basic spatial transformations, including rotation, translation, and scaling. Let the coordinates of the original picture be (x, y) and the coordinates of the transformed picture be (x′, y′). The transformation of the picture can be implemented by the following formula:
(x′, y′, 1) = (x, y, 1) × T
where T is the transformation matrix of the picture.
The rotation transformation matrix T_R is:

T_R = [  cos θ   sin θ   0 ]
      [ −sin θ   cos θ   0 ]
      [    0       0     1 ]

where θ is the angle of rotation.
The translation transformation matrix T_T is:

T_T = [  1     0    0 ]
      [  0     1    0 ]
      [ d_x   d_y   1 ]

where d_x and d_y are the numbers of pixels the picture is shifted in the x-direction and the y-direction, respectively.
The scaling transformation matrix T_S is:

T_S = [ s_x   0    0 ]
      [  0   s_y   0 ]
      [  0    0    1 ]

where s_x and s_y are the scales by which the picture is scaled in the x-direction and the y-direction, respectively.
The combined transformation matrix T_RTS, which applies all three transformations, is obtained by multiplying the three matrices:

T_RTS = T_R × T_T × T_S
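The three matrices and their composition can be written out directly; the sketch below (illustrative, following the row-vector homogeneous-coordinate convention (x′, y′, 1) = (x, y, 1) × T used above) builds T_R, T_T, and T_S and applies T_RTS to one point:

```python
import numpy as np

def t_rotate(theta):                    # T_R, rotation by theta (radians)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c, s, 0],
                     [-s, c, 0],
                     [ 0, 0, 1]], dtype=float)

def t_translate(dx, dy):                # T_T, shift by (dx, dy) pixels
    return np.array([[1, 0, 0],
                     [0, 1, 0],
                     [dx, dy, 1]], dtype=float)

def t_scale(sx, sy):                    # T_S, scale by (sx, sy)
    return np.array([[sx, 0, 0],
                     [0, sy, 0],
                     [0,  0, 1]], dtype=float)

# T_RTS = T_R x T_T x T_S, applied as (x', y', 1) = (x, y, 1) x T
T = t_rotate(np.pi / 6) @ t_translate(5, -3) @ t_scale(1.2, 0.8)
x, y = 10.0, 20.0
xp, yp, _ = np.array([x, y, 1.0]) @ T
print(xp, yp)   # transformed coordinates (x', y')
```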
FIG. 2 shows examples of pictures before and after the basic transformations: the ORI column shows the original pictures; the R column shows the rotated pictures; the T column shows the translated pictures; the S column shows the scaled pictures; and the RTS column shows pictures with all three transformations applied simultaneously.
Although convolutional neural networks are invariant to some small local spatial transformations, once a picture undergoes a global, large transformation, the convolutional neural network is no longer robust. Therefore, the embodiment of the invention provides a convolutional neural network that has transformation-invariance capability (i.e., transformed pictures can still be recognized accurately, enabling subsequent classification and retrieval operations) and consistent expression: simply by introducing a consistency (invariance) loss function during training, the trained model becomes more robust to transformed pictures. The method also lets the model learn a transformation-invariant mode of expression, so that, compared with traditional methods that only learn a mapping from transformed pictures to fixed labels, it transfers better to other deep-learning problems. In addition, by introducing the consistency loss function, the method embeds the transformation-invariance capability into the weight parameters of the network, genuinely improving the transformation invariance of the convolutional neural network; it introduces no new parameters into the model, requires no extra processing of the pictures, and requires no change to the existing network structure at test time.
FIG. 3 is a block diagram of the convolutional neural network with transformation-invariance capability and consistent expression. In the training stage, a consistency loss function is introduced into a convolutional neural network comprising convolutional layers, fully-connected layers, and a Softmax layer, so that the trained convolutional neural network learns a transformation-invariant mode of expression.
A consistency loss function is introduced at the convolutional layers to push the network to learn consistent expressions of the feature information; at the fully-connected layers to push the network to learn expressions of the semantic information that are as consistent as possible; and at the Softmax layer to push the network to learn expressions of the classification information that are as consistent as possible.
Referring again to FIG. 3, in the training stage, two random transformations T′(·) and T″(·) are applied to the input sample picture X, and the resulting transformed pictures are denoted X′ and X″.
the consistency loss function of the ith layer in the convolutional neural network is added to the characteristic expression Fea of the pictures X 'and X' at the ith layeri(X') with Feai(X'), which is expressed as:
Figure BDA0001745981570000051
in the above formula, LiRepresenting the uniformity loss function for the ith layer.
The loss function of the entire convolutional neural network is expressed as:

L_All = λ_Cls × (L′_Cls + L″_Cls) + Σ_i λ_i × L_i

where the coefficient λ_i weighs the consistency loss function L_i of the i-th layer; L′_Cls and L″_Cls are the classification losses of pictures X′ and X″, respectively; and the coefficient λ_Cls weighs the classification loss L_Cls of sample picture X. Assuming there are N classes in total, L_Cls is the loss of a Softmax layer with N outputs.
In the embodiment of the present invention, the i-th layer refers to the i-th layer of the entire network, without specially distinguishing whether it is a convolutional layer, a fully-connected layer, or the Softmax layer.
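A minimal sketch of this total loss (assuming PyTorch, mean-squared error as the consistency distance, and a single shared weight standing in for all λ_i; the model interface that returns intermediate feature maps is an assumption for illustration, not an API prescribed by the patent):

```python
import torch.nn.functional as F

def rc_cnn_loss(model, x, labels, transform, lambda_cls=1.0, lambda_i=0.1):
    """L_All = lambda_Cls * (L'_Cls + L''_Cls) + sum_i lambda_i * L_i.

    Assumes `model(x)` returns (logits, [per-layer feature maps]) and
    `transform` is a random spatial transform (e.g. torchvision RandomAffine).
    """
    x1, x2 = transform(x), transform(x)          # X' = T'(X), X'' = T''(X)

    logits1, feats1 = model(x1)
    logits2, feats2 = model(x2)

    # Per-layer consistency losses L_i = ||Fea_i(X') - Fea_i(X'')||^2
    loss_consist = sum(F.mse_loss(f1, f2) for f1, f2 in zip(feats1, feats2))

    # Classification losses L'_Cls and L''_Cls on the two transformed pictures
    loss_cls = F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)

    return lambda_cls * loss_cls + lambda_i * loss_consist
```

In practice each layer can carry its own weight λ_i, exactly as in the formula above; a single shared value is used here only to keep the sketch short.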
In FIG. 3, T′(X) and T″(X) on the left denote the random transformations T′(·) and T″(·) applied to the sample picture X. The labels "L_Conv1, L_Conv2, ..., L_FC8" on the upward arrows in the middle denote the loss functions applied at the different layers; for example, L_Conv1 is the loss function on the first convolutional layer. The rightmost L_Cls denotes the classification loss function. The label below X indicates the true category of the sample picture X.
After training in the above manner, a convolutional neural network with transformation-invariance capability and consistent expression is obtained; in the test stage, a transformed test picture can be fed directly into the network, which outputs the classification result.
FIG. 4 compares the RC-CNN with the original model and with data augmentation. Panel (a) shows the distribution of the feature maps of original pictures in the original model. Panel (b) shows the distribution of the feature maps of transformed pictures in a model trained with data augmentation: even with augmentation, some of the internal expressions are mixed together and not easily separated. Panel (c) shows the convolutional neural network with transformation-invariance capability and consistent expression provided by the invention: by promoting consistent expression of the feature maps, even transformed pictures can be distinguished more easily.
To compare the RC-CNN provided by the invention with the best current methods, comparative experiments were performed on two tasks: a large-scale picture recognition task and a picture retrieval task. The RC-CNN is compared with the traditional convolutional neural network, a data-augmented convolutional neural network, SI-CNN, TI-CNN, ST-CNN, and other models.
For the large-scale picture recognition problem, the ILSVRC-2012 data is used. The dataset is divided into 1000 classes according to picture content and is a subset of ImageNet. The training set contains 1.3M pictures, the validation set 50,000 pictures, and the test set 100,000 pictures. Recognition accuracy is judged by two metrics, top-1 and top-5 accuracy: top-1 accuracy is the probability that the highest-confidence prediction matches the actual category, and top-5 accuracy is the probability that the actual category is among the five highest-confidence predictions. The results of the comparative experiments are shown in Tables 1 and 2.
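For concreteness, the two metrics can be computed as follows (a small sketch, not code from the patent):

```python
import torch

def topk_accuracy(logits, labels, k=5):
    # Fraction of samples whose true label is among the k
    # highest-confidence predictions (k=1 gives top-1 accuracy).
    topk = logits.topk(k, dim=1).indices            # (batch, k)
    hits = (topk == labels.unsqueeze(1)).any(dim=1)
    return hits.float().mean().item()
```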
TABLE 1: Results on the transformed ILSVRC-2012 dataset (top-1/top-5)
In the comparative experiments, the consistency loss function is added at the label level only (RC-CNN (Cls)), at the feature-expression level plus the label level (RC-CNN (Conv+Cls)), at the semantic level plus the label level (RC-CNN (FC+Cls)), and at all levels (RC-CNN (Conv+FC+Cls)). Adding the consistency loss at all levels achieves the best overall results.
TABLE 2: Results on the original ILSVRC-2012 dataset (top-1/top-5)
From the above results, it can be seen that the RC-CNN effectively improves the invariance of the convolutional neural network to transformations compared with the best existing methods. Meanwhile, the RC-CNN's results on the original-picture dataset do not degrade but improve somewhat, which shows that the RC-CNN is not merely overfitting transformed pictures to the prediction of their true labels.
For the picture retrieval problem, the UK-Bench dataset is used, a dataset dedicated to picture retrieval. It contains 2,550 groups of pictures, 4 pictures per group, all taken from different perspectives of the same object or scene. The task on this dataset is, given any one picture, to retrieve the other three pictures of the same group from the entire dataset. To verify the effect of RC-CNN on large-scale data, an additional 1,000,000 pictures from the MIRFLICKR dataset were added as negative examples. The models pre-trained for the picture classification task were used without retraining or fine-tuning on these data. All pictures in the dataset are fed into the model and the L2-normalized feature expressions are extracted. The Euclidean distances between one feature expression and the feature expressions of all pictures in the dataset are then computed and sorted in ascending order, and the nearest 4 pictures are used to compute the NS-Score, i.e., the average number of correct pictures among the four nearest. For example, if all four nearest pictures come from the correct group, that query scores 4.0. The experimental results are shown in Table 3.
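The retrieval protocol can be sketched as follows (assuming features have already been extracted into a NumPy array; counting the query itself among its 4 nearest neighbours follows the usual UK-Bench convention and is an assumption here):

```python
import numpy as np

def ns_score(features, group_ids):
    """Mean number of same-group pictures among each query's 4 nearest
    neighbours (including the query itself), on L2-normalized features."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    scores = []
    for i, q in enumerate(feats):
        d = np.linalg.norm(feats - q, axis=1)     # Euclidean distances
        nearest4 = np.argsort(d)[:4]              # ascending order
        scores.append(np.sum(group_ids[nearest4] == group_ids[i]))
    return float(np.mean(scores))                 # maximum is 4.0
```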
TABLE 3: Results on the UK-Bench dataset
The results on the image-retrieval dataset show that RC-CNN also achieves clear improvements on a different task, indicating that the invention has a certain transfer capability.
The main idea of the above scheme provided by the embodiment of the invention is to give the network a certain robustness to transformations by introducing consistency optimization targets at three levels during training. With this optimization, even after the input picture undergoes a certain degree of transformation, the invariance of the feature expressions inside the network is clearly and effectively improved, and the network is more stable when recognizing pictures.
Through the above description of the embodiments, it is clear to those skilled in the art that the above embodiments can be implemented by software, or by software plus a necessary general hardware platform. With this understanding, the technical solutions of the embodiments can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (a CD-ROM, a USB disk, a removable hard disk, etc.) and includes several instructions for enabling a computer device (a personal computer, a server, a network device, etc.) to execute the methods according to the embodiments of the present invention.
The above description covers only preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (1)

1. A convolutional neural network implementation method with transformation-invariance capability and consistent expression, comprising:
in the training stage, a consistency loss function is introduced into a convolutional neural network comprising convolutional layers, fully-connected layers, and a Softmax layer, so that the trained convolutional neural network learns a transformation-invariant mode of expression;
wherein the consistency loss function is introduced at the convolutional layers to push the network to learn consistent expressions of the feature information, at the fully-connected layers to push the network to learn consistent expressions of the semantic information, and at the Softmax layer to push the network to learn consistent expressions of the classification information;
in the training stage, two random transformations T′(·) and T″(·) are applied to an input sample picture X, and the resulting transformed pictures are denoted X′ and X″;
the consistency loss function of the i-th layer of the convolutional neural network is imposed between the feature expressions Fea_i(X′) and Fea_i(X″) of pictures X′ and X″ at the i-th layer, and is expressed as:

L_i = ‖Fea_i(X′) − Fea_i(X″)‖²

where L_i denotes the consistency loss function of the i-th layer;
the loss function of the entire convolutional neural network is expressed as:
LAll=λCls×(L′Cls+L″Cls)+∑λi×Li
wherein the coefficient lambdaiUsed to weigh the i-th layer's consistency loss function Li,L′ClsAnd L ″)ClsCorresponding to the classification losses of pictures X' and X ", respectively, by a factor λClsClassification loss L used to weigh sample pictures XCls
CN201810861718.8A, filed 2018-07-27 (priority date 2018-07-27): Convolutional neural network with transform invariant capability and consistent expression. Granted as CN109146058B; status: Active.

Priority Applications (1)

CN201810861718.8A (priority date 2018-07-27, filing date 2018-07-27): Convolutional neural network with transform invariant capability and consistent expression

Applications Claiming Priority (1)

CN201810861718.8A (priority date 2018-07-27, filing date 2018-07-27): Convolutional neural network with transform invariant capability and consistent expression

Publications (2)

CN109146058A: published 2019-01-04
CN109146058B: granted and published 2022-03-01

Family

Family ID: 64799291

Family Applications (1)

CN201810861718.8A (priority date 2018-07-27, filing date 2018-07-27, Active): Convolutional neural network with transform invariant capability and consistent expression

Country Status (1)

CN: CN109146058B

Families Citing this family (1)

* Cited by examiner, † Cited by third party
CN110633790B * (priority date 2019-09-19, publication date 2022-04-08), 郑州大学: Method and system for measuring residual oil quantity of airplane oil tank based on convolutional neural network


Patent Citations (7)

* Cited by examiner, † Cited by third party
US9971940B1 * (priority 2015-08-10, published 2018-05-15), Google LLC: Automatic learning of a video matching system
WO2017214968A1 * (priority 2016-06-17, published 2017-12-21), Nokia Technologies Oy: Method and apparatus for convolutional neural networks
CN106203420A * (priority 2016-07-26, published 2016-12-07), 浙江捷尚视觉科技股份有限公司: Bayonet vehicle color identification method
CN106897714A * (priority 2017-03-23, published 2017-06-27), 北京大学深圳研究生院: Video action detection method based on convolutional neural networks
CN107145900A * (priority 2017-04-24, published 2017-09-08), 清华大学: Pedestrian re-identification method based on consistency-constrained feature learning
CN108280411A * (priority 2018-01-10, published 2018-07-13), 上海交通大学: Pedestrian search method with spatial transformation capability
CN108257115A * (priority 2018-04-13, published 2018-07-06), 中山大学: Image enhancement detection method and system based on orientation-consistency convolutional neural networks

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Xu Shen et al., "Transform-Invariant Convolutional Neural Networks for Image Classification and Search," ACM, 2016-10-19, pp. 1345-1354. *
卢官明 et al., "A convolutional neural network for facial expression recognition" (一种用于人脸表情识别的卷积神经网络), Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), 2016-02-28, pp. 16-22. *
李洁樱, "A vehicle consistency discrimination method based on a Siamese convolutional neural network" (基于孪生卷积神经网络的车辆一致性判别方法), 《中国交通信息化》, 2018-04-30, pp. 104-105. *

Also Published As

CN109146058A: 2019-01-04

Similar Documents

Publication Publication Date Title
Zhu et al. A deep convolutional neural network approach to single-particle recognition in cryo-electron microscopy
CN107122809B (en) Neural network feature learning method based on image self-coding
CN111027563A (en) Text detection method, device and recognition system
US10867169B2 (en) Character recognition using hierarchical classification
Mohamed et al. Content-based image retrieval using convolutional neural networks
Kadam et al. Detection and localization of multiple image splicing using MobileNet V1
EP4002161A1 (en) Image retrieval method and apparatus, storage medium, and device
Ahmad et al. Data augmentation-assisted deep learning of hand-drawn partially colored sketches for visual search
TW202207077A (en) Text area positioning method and device
CN110008365B (en) Image processing method, device and equipment and readable storage medium
CN114067385A (en) Cross-modal face retrieval Hash method based on metric learning
Wei et al. Food image classification and image retrieval based on visual features and machine learning
Gao et al. SHREC’15 Track: 3D object retrieval with multimodal views
Jiang et al. MTFFNet: a multi-task feature fusion framework for Chinese painting classification
Pengcheng et al. Fast Chinese calligraphic character recognition with large-scale data
CN109146058B (en) Convolutional neural network with transform invariant capability and consistent expression
Sunitha et al. Novel content based medical image retrieval based on BoVW classification method
Xu et al. Chinese characters recognition from screen-rendered images using inception deep learning architecture
CN115640401B (en) Text content extraction method and device
US11816909B2 (en) Document clusterization using neural networks
Tomei et al. Image-to-image translation to unfold the reality of artworks: an empirical analysis
CN115100694A (en) Fingerprint quick retrieval method based on self-supervision neural network
Mauricio et al. High-resolution generative adversarial neural networks applied to histological images generation
Ameur et al. Hybrid descriptors and weighted PCA-EFMNet for face verification in the wild
Ouni et al. An efficient ir approach based semantic segmentation

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant