CN117671704A - Handwritten digit recognition method, handwritten digit recognition device and computer storage medium
- Publication number
- CN117671704A (application CN202410130100.XA)
- Authority
- CN
- China
- Prior art keywords
- matrix
- data
- projection matrix
- steps
- training data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19127—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V30/19173—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Description
Technical field
The present invention relates to the field of image recognition technology, and in particular to a handwritten digit recognition method, a handwritten digit recognition device, and a computer storage medium.
Background
Linear discriminant analysis (LDA) is a classic supervised learning algorithm used mainly for dimensionality reduction and classification. Its main idea is to project the data into a new space in which samples of the same class lie as close together as possible and samples of different classes lie as far apart as possible. The method can be applied to image classification problems such as handwritten digit recognition, and it can improve classification accuracy by projecting the data onto the best linear discriminant directions. For multi-class problems, however, LDA is not an optimal choice. Under the homoscedastic Gaussian assumption, the LDA projection is obtained by maximizing the weighted arithmetic mean of the Kullback-Leibler (KL) divergences between different classes, so the projection directions are dominated by class pairs with large KL divergence. Class pairs with small KL divergence then overlap in the projection space, and classification accuracy degrades significantly. To address this class separation problem, many researchers have proposed weighting schemes to improve LDA. These supervised discriminant analysis methods fall into two main categories: one replaces the arithmetic mean of the KL divergences between class pairs and assigns different weights to class pairs with different KL divergences; the other focuses on separating nearby class pairs by emphasizing those with small KL divergence. All of these methods are supervised, however: they need enough labeled data to train a model, and they are prone to overfitting.
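The dominance effect described above can be made concrete with standard Gaussian identities. The following is a sketch under stated assumptions, not the patent's own derivation: the class priors $p_i$, class means $\mu_i$, shared covariance $\Sigma$, overall mean $\mu$ and projection matrix $W$ are notation introduced here for illustration.

```latex
% KL divergence between classes i and j after projection by W,
% assuming Gaussian classes with a shared covariance \Sigma
D_W(i \,\|\, j) = \tfrac{1}{2}\,\operatorname{tr}\!\Big[(W^\top \Sigma W)^{-1}
                  W^\top (\mu_i-\mu_j)(\mu_i-\mu_j)^\top W\Big]

% Between-class scatter rewritten as a weighted sum over class pairs
S_b = \sum_{i=1}^{c} p_i(\mu_i-\mu)(\mu_i-\mu)^\top
    = \sum_{i<j} p_i p_j (\mu_i-\mu_j)(\mu_i-\mu_j)^\top,
\qquad \mu = \sum_{i=1}^{c} p_i \mu_i

% Hence the LDA criterion is a weighted arithmetic mean of pairwise divergences
\operatorname{tr}\!\Big[(W^\top \Sigma W)^{-1} W^\top S_b W\Big]
    = 2\sum_{i<j} p_i p_j\, D_W(i \,\|\, j)
```

Because the criterion is an arithmetic mean, one class pair with a very large divergence can raise the total more than many well-separated small pairs, which is why nearby classes may end up overlapping after projection.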
As science and technology develop, the techniques and tools for collecting data keep improving and large amounts of data become available, but labeling data still requires substantial manpower and material resources. How to use unlabeled data to improve the performance of existing algorithms has therefore become a research hotspot. Semi-supervised learning uses a large amount of unlabeled data to assist a small amount of labeled data, improving learning performance and yielding a model that generalizes better. Extending supervised discriminant analysis methods to semi-supervised learning to obtain a more effective classification model is one of the tasks that urgently needs to be solved.
Summary of the invention
In view of the above shortcomings of the prior art, the present invention provides a handwritten digit recognition method that extends supervised discriminant analysis to semi-supervised learning, addressing the class separation problem that arises when traditional discriminant analysis methods are applied to multi-class tasks with little labeled data. Another object of the present invention is to provide a handwritten digit recognition device and a corresponding computer storage medium.
The technical solution of the present invention is as follows. A handwritten digit recognition method includes the following steps:
Step S1: normalize the collected samples to obtain training data, the training data including labeled data and unlabeled data;
Step S2: compute the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Step S3: construct a neighbor graph from the labeled data and unlabeled data in the training data and compute the manifold regularization term;
Step S4: learn the optimal projection matrix from the training data with the Laplacian adaptive weight discriminant analysis method, including setting the optimization objective of the method to
[equation not reproduced in the source text]
where $S_w$ is the within-class scatter matrix, $S_b^{ij}$ is the between-class scatter matrix of classes $i$ and $j$, $\|W\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $W$, $m$ is the number of features, $d$ is the dimension of the projection space, the objective further contains the manifold regularization term, trade-off parameters and the identity matrix $I$, and $c$ is the number of classes in the label information of the training data; the projection matrix $W$ and the weight vector $\alpha$ are solved by iterative optimization to obtain the optimal projection matrix;
Step S5: normalize the sample to be recognized, project it with the optimal projection matrix, and then obtain the recognition label with a nearest neighbor classifier.
The present invention also provides a handwritten digit recognition device, which includes:
Preprocessing module: normalizes the collected samples to obtain training data, the training data including labeled data and unlabeled data;
First calculation module: computes the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Second calculation module: constructs a neighbor graph from the labeled data and unlabeled data in the training data and computes the manifold regularization term;
Optimal projection matrix solution module: learns the optimal projection matrix from the training data with the Laplacian adaptive weight discriminant analysis method, including setting the optimization objective of the method to
[equation not reproduced in the source text]
where $S_w$ is the within-class scatter matrix, $S_b^{ij}$ is the between-class scatter matrix of classes $i$ and $j$, $\|W\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $W$, $m$ is the number of features, $d$ is the dimension of the projection space, the objective further contains the manifold regularization term, trade-off parameters and the identity matrix $I$, and $c$ is the number of classes in the label information of the training data; the projection matrix $W$ and the weight vector $\alpha$ are solved by iterative optimization to obtain the optimal projection matrix;
Recognition module: normalizes the sample to be recognized, projects it with the optimal projection matrix, and then obtains the recognition label with a nearest neighbor classifier.
Further, step S3 and the second calculation module include the following calculation steps:
Step S3.1: construct a neighbor graph from the training data to obtain the neighbor matrix $A$, where the training data consist of $l$ labeled samples and $u$ unlabeled samples; the neighbor matrix $A$ is constructed as follows:
[equation not reproduced in the source text]
where $N_k(x_i)$ denotes the set of $k$ nearest neighbors of $x_i$;
Step S3.2: compute the Laplacian matrix $L = D - A$ of the manifold regularization term, where $D$ is a diagonal matrix with diagonal elements $D_{ii} = \sum_j A_{ij}$; the manifold regularization term in the projection space is
[equation not reproduced in the source text]
where $\|\cdot\|_2$ is the $L_2$ norm and $y_i = W^T x_i$ is the image of $x_i$ in the low-dimensional projection space.
Further, solving the projection matrix $W$ and the weight vector $\alpha$ by iterative optimization to obtain the optimal projection matrix includes the following steps:
Step S4.1: initialize the weight vector $\alpha$ and solve for the projection matrix $W$; the LapAWDA optimization function becomes
[equation not reproduced in the source text]
where the omitted symbols denote a constant and a sign-constrained trade-off coefficient.
First compute the auxiliary matrix [not reproduced in the source text], which gives the optimization objective
[equation not reproduced in the source text]
Applying the Lagrange multiplier method converts the optimization problem into an eigendecomposition problem:
[equation not reproduced in the source text]
which involves a diagonal matrix [diagonal elements not reproduced in the source text], where $w^i$ is the $i$-th row vector of $W$; the optimal projection matrix $W$ consists of the $d$ eigenvectors corresponding to the $d$ largest eigenvalues, subject to a bound on $d$ [not reproduced in the source text];
Step S4.2: fix the projection matrix $W$ and solve for the weight vector $\alpha$; the LapAWDA objective function becomes
[equation not reproduced in the source text]
By Cauchy's inequality, the solution for the weight vector is
[equation not reproduced in the source text];
Step S4.3: update the weight vector $\alpha$ and solve for the projection matrix $W$ again according to step S4.1; after obtaining the optimal projection matrix of the current round, proceed to the next round of alternating iteration, that is, fix the projection matrix $W$ and update the weight vector $\alpha$ according to step S4.2; repeat step S4.3 until the stopping condition is met, which yields the optimal projection matrix $W$.
Because the optimization problem of the Laplacian adaptive weight discriminant analysis method is not a classic quadratic optimization problem, a fast and effective iterative optimization algorithm is used to solve it, and the algorithm can be proven to converge in theory.
Further, the stopping condition in step S4.3 is [equation not reproduced in the source text].
Further, the within-class scatter matrix $S_w$ is computed as follows:
[equation not reproduced in the source text]
where $x_{ij}$ denotes the $j$-th sample in the $i$-th class and $\mu_i$ denotes the mean vector of the $i$-th class;
the between-class scatter matrix $S_b^{ij}$ is computed as follows:
[equation not reproduced in the source text].
The present invention also provides a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the above handwritten digit recognition method.
Compared with the prior art, the technical solution provided by the present invention has the following advantages:
The invention introduces the structural information of unlabeled data through the manifold regularization term and uses an adaptive weighting method to balance the KL divergence between each class pair, which prevents class pairs with small KL divergence from vanishing in the projection space. In addition, an $L_{2,1}$ norm constraint is imposed on the projection matrix to obtain a sparse, discriminative projection matrix, which further improves classification accuracy and makes the method better suited to multi-class tasks. The optimal solution of the LapAWDA optimization problem extracts useful information that benefits the subsequent classification task.
By combining discriminant analysis with the manifold regularization term, the semi-supervised feature extraction method of the invention, paired with nearest neighbor classification, yields a multi-class model. The model can be used for semi-supervised multi-class problems with little labeled data, improving both data utilization and classification performance.
Description of drawings
Figure 1 is a schematic flow chart of the handwritten digit recognition method.
Figure 2 is a schematic flow chart of the Laplacian adaptive weight discriminant analysis method.
Figure 3 shows samples from the MNIST dataset.
Figure 4 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 10 labeled samples.
Figure 5 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 20 labeled samples.
Figure 6 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 30 labeled samples.
Detailed description
The present invention is further described below with reference to embodiments. It should be understood that these embodiments only illustrate the invention and do not limit its scope; after reading this description, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.
The handwritten digit recognition device of this embodiment includes:
Preprocessing module: normalizes the collected samples to obtain training data, the training data including labeled data and unlabeled data;
First calculation module: computes the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Second calculation module: constructs a neighbor graph from the labeled data and unlabeled data in the training data and computes the manifold regularization term;
Optimal projection matrix solution module: learns the optimal projection matrix from the training data with the Laplacian adaptive weight discriminant analysis method;
Recognition module: normalizes the sample to be recognized, projects it with the optimal projection matrix, and then obtains the recognition label with a nearest neighbor classifier.
Specifically, with reference to Figures 1 and 2, the handwritten digit recognition method used by the device includes the following steps:
Step S1: normalize the collected samples to obtain the training data. Normalization divides each pixel value of the image by 255, mapping it to the range [0, 1]. The training data include labeled data and unlabeled data. Each labeled sample carries label information that takes one of $c$ values, where $c$ is the number of classes. There are $l$ labeled samples and $u$ unlabeled samples, so the total number of training samples is $n = l + u$.
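The normalization in step S1 can be sketched as follows. This is a minimal illustration, assuming the images arrive as 8-bit grayscale arrays and that each image is flattened into one feature vector; the function name `normalize_images` and the random toy batch are hypothetical and stand in for real digit images.

```python
import numpy as np

def normalize_images(images):
    """Scale 8-bit grayscale images to [0, 1] and flatten each one into a row vector."""
    arr = np.asarray(images, dtype=np.float64) / 255.0
    return arr.reshape(arr.shape[0], -1)

# Toy stand-in for a batch of 28x28 digit images (random pixels, not real MNIST data).
rng = np.random.default_rng(0)
toy_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
X = normalize_images(toy_images)
print(X.shape, X.min() >= 0.0, X.max() <= 1.0)
```

The same function would be applied to a sample to be recognized in step S5, so that training and query data share one scale.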
Step S2: compute the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data.
For data with $c$ classes, the total within-class scatter matrix $S_w$ is computed as: [equation not reproduced in the source text],
where $x_{ij}$ denotes the $j$-th sample in the $i$-th class and $\mu_i$ denotes the mean vector of the $i$-th class.
Any two of the $c$ classes form a class pair, which gives $c(c-1)/2$ class pairs and therefore $c(c-1)/2$ between-class scatter matrices; the between-class scatter matrix $S_b^{ij}$ of class $i$ and class $j$ is computed as:
[equation not reproduced in the source text].
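Since the formulas for step S2 are not reproduced in the text, the sketch below uses the textbook forms as an assumption: an unnormalized within-class scatter $\sum_i \sum_j (x_{ij}-\mu_i)(x_{ij}-\mu_i)^T$ and one rank-one matrix $(\mu_i-\mu_j)(\mu_i-\mu_j)^T$ per class pair. The patent's exact expressions may include weights or normalization that are not shown here, and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def within_class_scatter(X, y):
    """Sum over classes of (x - mu_i)(x - mu_i)^T for labeled rows X (n x m)."""
    m = X.shape[1]
    S_w = np.zeros((m, m))
    for label in np.unique(y).tolist():
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w

def pairwise_between_scatter(X, y):
    """One matrix (mu_i - mu_j)(mu_i - mu_j)^T per class pair (assumed form)."""
    labels = np.unique(y).tolist()
    means = {label: X[y == label].mean(axis=0) for label in labels}
    return {
        (a, b): np.outer(means[a] - means[b], means[a] - means[b])
        for a, b in combinations(labels, 2)
    }

# Toy labeled data: three classes, two samples each, two features.
X_lab = np.array([[0, 0], [2, 0], [0, 4], [2, 4], [10, 10], [12, 10]], dtype=float)
y_lab = np.array([0, 0, 1, 1, 2, 2])
S_w = within_class_scatter(X_lab, y_lab)
S_b_pairs = pairwise_between_scatter(X_lab, y_lab)
print(S_w)
print(len(S_b_pairs))  # c(c-1)/2 pairs for c = 3
```

Keeping one matrix per class pair, rather than a single summed between-class matrix, is what allows each pair to receive its own weight later.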
Step S3: construct a neighbor graph from the labeled data and unlabeled data in the training data and compute the manifold regularization term. Specifically:
Step S3.1: construct a neighbor graph from the training data to obtain the neighbor matrix $A$, which is constructed as follows:
[equation not reproduced in the source text]
where $N_k(x_i)$ denotes the set of $k$ nearest neighbors of $x_i$.
Step S3.2: compute the Laplacian matrix $L = D - A$ of the manifold regularization term, where $D$ is a diagonal matrix with diagonal elements $D_{ii} = \sum_j A_{ij}$; the manifold regularization term in the projection space is
[equation not reproduced in the source text]
where $\|\cdot\|_2$ is the $L_2$ norm, $y_i = W^T x_i$ is the image of $x_i$ in the low-dimensional projection space, and $d$ is the dimension of the projection space.
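The graph construction in step S3 can be sketched as below. The neighbor-matrix formula is not reproduced in the text, so this sketch assumes a symmetric 0/1 weighting (two samples are connected if either is among the other's $k$ nearest neighbors) and a penalty of the form $\sum_{i,j} A_{ij}\|y_i - y_j\|_2^2$ without extra scaling. All function names are hypothetical, and samples are stored as rows.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 k-nearest-neighbour matrix over all rows of X (assumed weighting)."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]   # k closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)               # connect i, j if either lists the other

def graph_laplacian(A):
    """L = D - A with D the diagonal degree matrix."""
    return np.diag(A.sum(axis=1)) - A

def manifold_penalty(W, X, A):
    """Sum over i, j of A_ij * ||W^T x_i - W^T x_j||^2."""
    Y = X @ W
    diff = Y[:, None, :] - Y[None, :, :]
    return float(np.sum(A * np.sum(diff**2, axis=2)))

rng = np.random.default_rng(1)
X_all = rng.normal(size=(8, 3))   # labeled and unlabeled samples together
W_toy = rng.normal(size=(3, 2))   # arbitrary projection, for illustration only
A = knn_adjacency(X_all, k=2)
L = graph_laplacian(A)
penalty = manifold_penalty(W_toy, X_all, A)
trace_form = 2.0 * np.trace(W_toy.T @ X_all.T @ L @ X_all @ W_toy)
print(np.isclose(penalty, trace_form))
```

The final comparison checks the usual identity between the pairwise-distance form and the trace form with the Laplacian, which is what lets the regularizer enter an eigenvalue problem in $W$.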
Step S4: the expression of the manifold regularization term shows that it depends on the projection matrix $W$, so the term is introduced into the Laplacian adaptive weight discriminant analysis method (LapAWDA), which solves for $W$ during optimization. Because the weight vector $\alpha$ is not predefined but learned from the low-dimensional projection space, and because the LapAWDA objective is non-smooth, $W$ and $\alpha$ cannot be solved for directly at the same time. An iterative optimization algorithm, however, yields an approximately optimal solution. The LapAWDA optimization problem is therefore solved in two alternating steps: 1) fix the weight vector $\alpha$ and update $W$; 2) fix $W$ and update the weight vector $\alpha$. The two steps are iterated until the stopping condition is met, giving an approximately optimal solution.
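The two-step alternation can be sketched as a generic loop. This is only a skeleton under stated assumptions: the callables `update_w`, `update_alpha` and `objective` are hypothetical placeholders for the LapAWDA sub-problems, and stopping on a small change in the objective is an assumed criterion, since the source's stopping condition is not reproduced. The toy problem at the end is unrelated to LapAWDA and only exercises the loop.

```python
def alternating_optimization(update_w, update_alpha, objective, alpha0,
                             max_iter=100, tol=1e-6):
    """Alternate two exact sub-problem solvers until the objective stops changing."""
    alpha, W, prev, n_iter = alpha0, None, None, 0
    for n_iter in range(1, max_iter + 1):
        W = update_w(alpha)          # step 1: weights fixed, update projection
        alpha = update_alpha(W)      # step 2: projection fixed, update weights
        cur = objective(W, alpha)
        if prev is not None and abs(prev - cur) <= tol:
            break
        prev = cur
    return W, alpha, n_iter

# Toy problem: minimise f(w, a) = (w - 2)^2 + (w - a)^2 + a^2 over scalars.
f = lambda w, a: (w - 2.0) ** 2 + (w - a) ** 2 + a ** 2
w_star, a_star, rounds = alternating_optimization(
    update_w=lambda a: (2.0 + a) / 2.0,   # argmin over w for fixed a
    update_alpha=lambda w: w / 2.0,       # argmin over a for fixed w
    objective=f,
    alpha0=0.0,
    tol=1e-12,
)
print(w_star, a_star, rounds)
```

Each sub-step never increases the objective, so the objective values form a monotone sequence, which is the usual basis for convergence arguments of this kind of scheme.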
Specifically, the optimization objective of the Laplacian adaptive weight discriminant analysis method is set to
[equation not reproduced in the source text]
where $S_w$ is the within-class scatter matrix, $S_b^{ij}$ is the between-class scatter matrix of classes $i$ and $j$, $\|W\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $W$, $m$ is the number of features, the objective further contains trade-off parameters, and $I$ is the identity matrix.
The iterative optimization steps are as follows:
Step S4.1: initialize the weight vector $\alpha$ and solve for the projection matrix $W$; the LapAWDA optimization function becomes
[equation not reproduced in the source text]
where the omitted symbols denote a constant and a sign-constrained trade-off coefficient.
First compute the auxiliary matrix [not reproduced in the source text], which gives the optimization objective
[equation not reproduced in the source text]
Applying the Lagrange multiplier method converts the optimization problem into an eigendecomposition problem:
[equation not reproduced in the source text]
which involves a diagonal matrix [diagonal elements not reproduced in the source text], where $w^i$ is the $i$-th row vector of $W$; the optimal projection matrix $W$ consists of the $d$ eigenvectors corresponding to the $d$ largest eigenvalues, subject to a bound on $d$ [not reproduced in the source text];
Step S4.2: fix the projection matrix $W$ and solve for the weight vector $\alpha$; the LapAWDA objective function becomes
[equation not reproduced in the source text]
By Cauchy's inequality, the solution for the weight vector is
[equation not reproduced in the source text];
Step S4.3: update the weight vector $\alpha$ and solve for the projection matrix $W$ again according to step S4.1; after obtaining the optimal projection matrix of the current round, proceed to the next round of alternating iteration, that is, fix the projection matrix $W$ and update the weight vector $\alpha$ according to step S4.2; repeat step S4.3 until the stopping condition [not reproduced in the source text] is met, which yields the optimal projection matrix $W$.
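Step S4.1 relies on two ingredients: an $L_{2,1}$ norm on the projection matrix and a selection of the eigenvectors belonging to the $d$ largest eigenvalues. The sketch below illustrates both under explicit assumptions. The diagonal reweighting $1/(2\|w^i\|_2)$ is the form commonly used for $L_{2,1}$ terms and is assumed here, because the source's expression is not reproduced; the symmetric matrix `M` is a hypothetical stand-in for whatever matrix the eigendecomposition problem actually involves.

```python
import numpy as np

def l21_norm(W):
    """L2,1 norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def l21_reweighting(W, eps=1e-8):
    """Diagonal matrix with entries 1 / (2 * ||w^i||_2) (assumed form); eps guards zero rows."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))

def top_d_eigenvectors(M, d):
    """Eigenpairs of a symmetric matrix M for its d largest eigenvalues."""
    vals, vecs = np.linalg.eigh(M)          # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1][:d]
    return vals[order], vecs[:, order]

W_demo = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])
G_demo = l21_reweighting(W_demo)
M_demo = np.diag([1.0, 5.0, 3.0])           # hypothetical symmetric matrix
top_vals, top_vecs = top_d_eigenvectors(M_demo, d=2)
print(l21_norm(W_demo), np.diag(G_demo)[[0, 2]], top_vals)
```

Rows of $W$ with a small norm receive a large diagonal weight, which pushes them further toward zero in the next round; this is how an $L_{2,1}$ term encourages a row-sparse projection matrix.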
Step S5: normalize the sample to be recognized and project it with the optimal projection matrix; for a normalized sample $x$, the projected image is $W^T x$. A nearest neighbor classifier then assigns the recognition label.
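Step S5 can be sketched as a projection followed by a 1-nearest-neighbor lookup. This is an illustration under assumptions: distances are Euclidean in the projected space, the reference set is the labeled training data, and the function name and toy matrices are hypothetical.

```python
import numpy as np

def nearest_neighbor_predict(W, X_labeled, y_labeled, X_query):
    """Project with W, then give each query the label of its closest labeled sample."""
    Z_train = X_labeled @ W                  # rows are projected labeled samples
    Z_query = X_query @ W
    d2 = ((Z_query[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    return y_labeled[np.argmin(d2, axis=1)]

# Toy example: W keeps the first two features and discards the third.
W_proj = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
X_ref = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
y_ref = np.array([0, 1])
X_new = np.array([[0.1, 0.0, 5.0], [0.9, 1.0, -3.0]])
pred = nearest_neighbor_predict(W_proj, X_ref, y_ref, X_new)
print(pred)
```

In the toy data the third feature differs wildly between queries and references, but it does not affect the result because the projection removes it before distances are computed.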
The demonstration experiments use the MNIST dataset of handwritten digit images.
The MNIST dataset is a classic dataset in machine learning. It consists of 60,000 training samples and 10,000 test samples, each a 28 × 28 pixel grayscale image of a handwritten digit, as shown in Figure 3. In the experiments, the training set consists of 100 images drawn at random from the training samples of each of the 10 digit classes (0 to 9), and the test set is the 10,000 test samples. To verify the effectiveness of the invention in semi-supervised learning, different numbers of labeled samples are used for training, and the mean test-set accuracy over 10 trials is used as the evaluation metric. Because the invention is a feature extraction method, a nearest neighbor classifier is used to show its classification performance on MNIST.
Experimental hardware: a MacBook Pro with an Intel Core i5 (2.7 GHz) processor and 8 GB of memory. Software environment: Matlab (R2015b). The experimental results are as follows:
To verify the effectiveness and superiority of the invention, the experiments compare it with five supervised discriminant analysis methods (LDA, LFDA, aPAC, LADA and MDAAWS) and four semi-supervised discriminant analysis methods (SLDA, SMMC, SSDR and SELF). The number of nearest neighbors is set to 5, and the regularization parameters of each method are selected by grid search over the parameter range. Table 1 records the classification accuracy of the invention and the nine comparison methods for different numbers of labeled samples, at a feature dimension of 20. The table shows that the semi-supervised methods are generally more accurate than the corresponding supervised methods, indicating that unlabeled data provide useful information. As the number of labeled samples increases, some supervised methods lose classification performance because of overfitting, whereas none of the semi-supervised methods do. Introducing information from unlabeled data can therefore improve generalization. The proposed LapAWDA method achieves the highest classification accuracy with 10, 20 and 30 labeled samples alike, clearly outperforming the other discriminant analysis methods.
Table 1: Average classification accuracy (%) of the 10 methods for different numbers of labeled samples
To study how the number of features and the number of labeled samples affect the projection matrix obtained by LapAWDA, 10, 20 and 30 labeled samples are randomly selected from each class of the training data, and the remaining training data are treated as unlabeled. Figures 4 to 6 show the accuracy of the discriminant analysis methods on the MNIST dataset as the dimension varies from 5 to 50. The invention achieves the highest accuracy at every dimension, and its classification performance is far better than that of the other methods when the first 20 dimensions are used. Its classification accuracy also rises as the number of labeled samples increases. These results indicate that, in classification tasks, the projection matrix lets the invention extract more discriminative information from the training data while making better use of the labeled data.
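The per-class selection of labeled samples described above can be sketched as follows. This is an assumed protocol, not the authors' Matlab code: the function name is hypothetical, and the label vector below is synthetic (100 samples for each of 10 classes) rather than real MNIST labels.

```python
import numpy as np

def split_labeled_unlabeled(y, n_labeled_per_class, rng):
    """Pick n labeled indices per class at random; all other indices are unlabeled."""
    labeled = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        labeled.extend(rng.choice(idx, size=n_labeled_per_class, replace=False))
    labeled = np.sort(np.array(labeled))
    unlabeled = np.setdiff1d(np.arange(len(y)), labeled)
    return labeled, unlabeled

rng = np.random.default_rng(42)
y_train = np.repeat(np.arange(10), 100)   # synthetic labels: 10 classes x 100 samples
labeled_idx, unlabeled_idx = split_labeled_unlabeled(y_train, 10, rng)
print(len(labeled_idx), len(unlabeled_idx))
```

Repeating the split with different random seeds and averaging the resulting test accuracies would correspond to the 10-trial averaging used as the evaluation metric.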
It should be noted that the specific methods of the above embodiments can be embodied as a computer program product; accordingly, the computer program product implemented in this application can be stored on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage).
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410130100.XA CN117671704B (en) | 2024-01-31 | 2024-01-31 | A method, device and computer storage medium for handwritten digit recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117671704A (en) | 2024-03-08 |
CN117671704B (en) | 2024-04-26 |
Family
ID=90079208
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410130100.XA Active CN117671704B (en) | 2024-01-31 | 2024-01-31 | A method, device and computer storage medium for handwritten digit recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117671704B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118097396A (en) * | 2024-04-23 | 2024-05-28 | 南京信息工程大学 | Underwater optical image recognition method and device |
WO2025082009A1 (en) * | 2024-07-15 | 2025-04-24 | 长春大学 | Handwritten numeral classification method and apparatus, and medium and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104992166A (en) * | 2015-07-28 | 2015-10-21 | 苏州大学 | Robust measurement based handwriting recognition method and system |
CN106845358A (en) * | 2016-12-26 | 2017-06-13 | 苏州大学 | A kind of method and system of handwritten character characteristics of image identification |
CN109376796A (en) * | 2018-11-19 | 2019-02-22 | 中山大学 | Image classification method based on active semi-supervised learning |
CN112861929A (en) * | 2021-01-20 | 2021-05-28 | 河南科技大学 | Image classification method based on semi-supervised weighted migration discriminant analysis |
Also Published As
Publication number | Publication date |
---|---|
CN117671704B (en) | 2024-04-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||