CN111242204A - A fault feature extraction method for operation and maintenance management and control platform - Google Patents

Publication number
CN111242204A
Authority
CN
China
Prior art keywords
feature
correlation
attributes
features
feature extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010015277.7A
Other languages
Chinese (zh)
Inventor
姜涛
曹杰
王蕾
薄小永
曲朝阳
薛凯
于建友
吕洪波
胡可为
徐鹏程
于成立
周玉光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taipingwan Power Station State Grid Northeast Branch Department Lyuyuan Hydroelectric Co
State Grid Jilin Electric Power Corp
Northeast Electric Power University
Information and Telecommunication Branch of State Grid Eastern Inner Mogolia Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Jilin Electric Power Co Ltd
Original Assignee
Taipingwan Power Station State Grid Northeast Branch Department Lyuyuan Hydroelectric Co
Northeast Dianli University
State Grid Jilin Electric Power Corp
Information and Telecommunication Branch of State Grid Eastern Inner Mogolia Electric Power Co Ltd
Information and Telecommunication Branch of State Grid Jilin Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taipingwan Power Station State Grid Northeast Branch Department Lyuyuan Hydroelectric Co, Northeast Dianli University, State Grid Jilin Electric Power Corp, Information and Telecommunication Branch of State Grid Eastern Inner Mogolia Electric Power Co Ltd, Information and Telecommunication Branch of State Grid Jilin Electric Power Co Ltd filed Critical Taipingwan Power Station State Grid Northeast Branch Department Lyuyuan Hydroelectric Co
Priority to CN202010015277.7A priority Critical patent/CN111242204A/en
Publication of CN111242204A publication Critical patent/CN111242204A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A fault feature extraction method for an operation and maintenance management and control platform, comprising principal component analysis (PCA) feature extraction and secondary feature selection. PCA-based feature extraction transforms high-dimensional samples into low-dimensional samples, reducing the redundancy of feature attributes along with the feature dimension while retaining the main classification information, which greatly reduces the computational complexity of the classifier and shortens training time. A secondary feature selection function is embedded in this feature extraction process: based on association-rule feature selection combined with a heuristic sequential backward search strategy, the evaluation results are ranked to determine the key features of the feature subset, so that the feature attributes have maximum relevance and minimum redundancy. That is, the correlation between attribute features and class attributes is maximized and the redundancy between attributes is reduced, which significantly improves the accuracy of management and control fault classification. The method is scientific and reasonable, has strong applicability, and can be widely applied to various fault classification management and control platforms.

Figure 202010015277

Description

A fault feature extraction method for an operation and maintenance management and control platform

Technical Field

The invention relates to the technical field of fault feature extraction for information system operation and maintenance management and control, and specifically to a fault feature extraction method for an operation and maintenance management and control platform.

Background

To obtain information such as system operating status and operating trends, an information system management and control platform monitors hardware devices and software applications remotely in real time. The platform monitors devices over a network, where data transmission typically gives the data flow corresponding features, and these features are an important basis for data identification. When the management and control devices are monitored, a large amount of fault information is collected, and feature extraction and selection techniques are the basis for classifying and identifying this fault information. These techniques make it possible to select key monitoring features in a multi-attribute, highly redundant information environment.

In an intelligent information system management and control platform, centralized management and unified monitoring are strengthened by monitoring network and security devices across the whole network, providing accurate fault diagnosis and handling suggestions, and improving the ability and efficiency of personnel in resolving faults. To this end, feature extraction and selection techniques are used to determine the key features of monitored fault data: each fault type may contain many features, from which the key features that best represent that fault type are selected. The advantage of these techniques is that, during fault type identification and classification, they greatly improve the accuracy of fault identification while reducing data redundancy. Compared with other techniques, they select the key features that best represent a fault type more accurately.

Feature extraction and selection techniques enable fault types to be identified and classified effectively, so that faults can be analyzed and handled quickly and efficiently, administrators can be alerted promptly, and 24-hour unattended continuous monitoring can be achieved.

The fault data of an operation and maintenance management and control platform contains data with many features, known as high-dimensional data. Fault types are classified automatically based on some of the features of the high-dimensional data, but some features in the fault data contribute little to the classification result. In addition, correlation and redundancy among features incur large time and space overheads during classification and lead to poor fault classification results. Redundant features in high-dimensional data strongly affect classifier performance, especially for standard supervised learning classification algorithms that use all data features in the decision function. Therefore, for a classifier based on supervised learning, extracting or selecting features from the original data before classification reduces data redundancy and can effectively improve the classifier's generalization ability. At present, the statistical fault features used for fault classification on management and control platforms can number in the hundreds. To improve the efficiency and accuracy of the classification algorithm and to effectively reduce the scale of the original data and the redundancy among features, feature selection and extraction must be performed on the features of the original high-dimensional data. Feature selection chooses an optimal feature subset from the original data features, one that represents the distribution of the original data as fully as possible. Feature extraction maps high-dimensional data samples to low-dimensional samples through a transformation; the mapping forms a new combination of sample features that has lower dimension and, because it is a mapping of the original data, can still adequately represent the original features.

Summary of the Invention

The purpose of the present invention is to overcome the problem that, when the data are highly similar and strongly interdependent, feature selection alone does not remove redundant information sufficiently, and to provide a scientifically sound, widely applicable fault feature extraction method for an operation and maintenance management and control platform that, once the feature subset is determined, removes data redundancy more effectively while achieving good classification accuracy.

The purpose of the present invention is achieved by the following technical solution: a fault feature extraction method for an operation and maintenance management and control platform, characterized in that it comprises:

1) Principal component analysis feature extraction

Principal Component Analysis (PCA) transforms the sample space: it determines by projection the direction along which the variance of all original feature vectors is largest, and uses that projection direction as the discriminant vector for feature extraction. After the projection, the original samples become low-dimensional samples that are as scattered as possible while the differences of the original high-dimensional sample space are preserved. Suppose the original high-dimensional space contains N samples, X ∈ R^n, each sample being X_i = [x_i1, ..., x_in]^T ∈ R^n, with vector mean M; the corresponding feature vector is X_i = [x_1i, ..., x_ni] ∈ R^n, and the corresponding covariance matrix is given by formula (1),

Figure BDA0002358650860000021

The variance of the sample distribution along an eigenvector is the corresponding eigenvalue of the covariance matrix in formula (1). Diagonalizing the covariance matrix in formula (1) yields the orthogonal matrix of formula (2),

Figure BDA0002358650860000022

Q is expressed as

Figure BDA0002358650860000023

where M is the dimension of the orthogonal matrix Q. PCA derives from Q the eigenvalues λ_1 ≥ λ_2 ≥ … ≥ λ_n and the corresponding orthonormal eigenvectors v_1, v_2, …, v_n. From the eigenvalues of Q and these orthonormal eigenvectors, the orthonormal eigenvectors u_1, u_2, …, u_d of the covariance matrix S are obtained as in formula (3), where u_1, u_2, …, u_d correspond to the d largest non-zero eigenvalues of S,

Figure BDA0002358650860000024

Set t = 95% with u_i > t, so that the cumulative contribution rate of the principal components on the first d axes is 95% of the original data. Any sample x_i is then mapped to the reduced low-dimensional sample space U = {u_1, u_2, …, u_d}; its principal component feature is y_i = (u_1, u_2, …, u_d)^T x_i, and y_i is the sample point in the low-dimensional space. This PCA transformation not only lets the transformed samples represent 95% of the cumulative contribution rate of the principal components but also reduces the dimension of the original space from n to d, with d << n, greatly reducing the dimension of the space and serving as feature extraction;
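As a minimal sketch of how the dimension d might be chosen from the threshold t described above, the following Python function picks the smallest d whose leading eigenvalues reach a cumulative contribution rate of t. The function name `choose_num_components` and the example eigenvalues are hypothetical, and the rule "smallest d with cumulative share at least t" is an assumed reading of the 95% criterion rather than something taken from the original formulas.

```python
def choose_num_components(eigenvalues, t=0.95):
    """Return the smallest d such that the d largest eigenvalues account
    for at least a fraction t of the total variance (assumed criterion)."""
    ordered = sorted(eigenvalues, reverse=True)  # lambda_1 >= lambda_2 >= ...
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for d, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= t:
            return d
    # Reached only if rounding keeps the ratio just below t.
    return len(ordered)


# Hypothetical eigenvalues of a covariance matrix, for illustration only.
example_eigenvalues = [6.0, 3.0, 0.6, 0.4]
d = choose_num_components(example_eigenvalues, t=0.95)
```

With these illustrative values the first two components cover 90% of the variance and the first three cover 96%, so three components are kept.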

2) Secondary feature selection

After PCA feature extraction, a secondary feature selection algorithm is embedded to further obtain the optimal feature subset and the key features of the PCA low-dimensional space. The algorithm is based on filter-type association-rule feature selection (Correlation-based Feature Selection, CFS); when evaluating the correlation of sample features it uses a heuristic sequential backward search strategy, and it determines the optimal feature subset by ranking the features by correlation.

CFS uses feature correlation as its evaluation criterion and is a filter-type feature selection algorithm. Under the corresponding search strategy, it aims to reduce the redundancy between attributes while increasing the correlation between attribute features and class attributes, thereby screening out highly redundant attributes and attributes unrelated to the class. Formula (4) is its evaluation criterion: the evaluation of the k features of feature subset S is denoted M_s, the mean correlation between feature attributes and the class is denoted by the symbol in Figure BDA0002358650860000031, and the mean correlation between attributes is denoted by the symbol in Figure BDA0002358650860000032. Formula (4) shows that the candidate feature subset determined by the association-rule feature selection algorithm gives the feature attributes maximum relevance and minimum redundancy, that is, it maximizes the correlation between attribute features and class attributes and reduces the redundancy between attributes: the higher the evaluation value M_s in formula (4), the larger the mean feature-class correlation and the smaller the mean correlation between attributes,

Figure BDA0002358650860000035
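The image of formula (4) is not reproduced here, so the following is only a sketch under an assumption: it uses the merit function commonly given for CFS in the literature, M_s = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The function name `cfs_merit` and its parameter names are hypothetical, and the patent's own formula (4) may differ in detail.

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Standard CFS merit of a k-feature subset (assumed form of formula (4)).

    Both correlations are assumed to lie in [0, 1], as they do when
    measured by symmetric uncertainty.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


# Same relevance, different redundancy: the less redundant subset scores higher.
low_redundancy = cfs_merit(4, 0.5, 0.0)
high_redundancy = cfs_merit(4, 0.5, 1.0)
```

The comparison illustrates the behavior described in the text: for a fixed mean feature-class correlation, the merit falls as the mean correlation between attributes rises.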

Association-rule feature selection uses the information gain algorithm to evaluate the correlation between attributes, and information gain is a symmetric measure. Therefore, when two features with a high-order association, such as W_i and W_j, are present in feature subset S, the symmetric uncertainty method of formula (5) can be used, where H(W) is the entropy of a feature and U is the feature correlation. Formula (6) is then the evaluation function of a feature subset based on the correlation between attributes: as the evaluation value H_s rises, the correlation between features W_j and W_i in feature subset S decreases and their correlation with the class attribute increases,

Figure BDA0002358650860000036

Figure BDA0002358650860000037
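To make the entropy-based correlation concrete, here is a small sketch of symmetric uncertainty in its usual textbook form, U(W_i, W_j) = 2·[H(W_i) + H(W_j) − H(W_i, W_j)] / [H(W_i) + H(W_j)]. This is assumed to match formula (5), whose image is not available; the helper names `entropy` and `symmetric_uncertainty` are hypothetical. The sketch works on discrete values, so continuous PCA components would first need to be discretized, which is a further assumption not stated in the source.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetric_uncertainty(x, y):
    """Symmetric uncertainty of two equally long discrete sequences.

    Returns a value in [0, 1]: 0 for independent variables,
    1 when each variable fully determines the other.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(x, y)
    information_gain = h_x + h_y - h_xy
    return 2.0 * information_gain / (h_x + h_y)


identical = symmetric_uncertainty([0, 0, 1, 1], [0, 0, 1, 1])
independent = symmetric_uncertainty([0, 0, 1, 1], [0, 1, 0, 1])
```

The normalization by H(W_i) + H(W_j) keeps the measure symmetric and compensates for the bias of raw information gain toward attributes with many values.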

Using the CFS algorithm, the secondary feature selection function is embedded in PCA; the CFS evaluation results are then computed under the heuristic sequential backward search strategy, and the optimal feature subset is selected after ranking.

The fault feature extraction method for an operation and maintenance management and control platform of the present invention is a feature extraction method with an embedded secondary feature selection function. Because it is based on PCA feature extraction, it transforms high-dimensional samples into low-dimensional samples, reduces the redundancy of feature attributes along with the feature dimension, and retains the main classification information, which greatly reduces the computational complexity of the classifier and shortens training time. Because a secondary feature selection function is embedded in this feature extraction process, the evaluation results are ranked using CFS combined with the heuristic sequential backward search strategy to determine the key features of the feature subset, so that the feature attributes have maximum relevance and minimum redundancy; that is, the correlation between attribute features and class attributes is maximized and the redundancy between attributes is reduced, which significantly improves the accuracy of management and control fault classification. The method is scientific and reasonable, has strong applicability, and can be widely applied to various fault classification management and control platforms.

Description of the Drawings

FIG. 1 is a functional schematic diagram of the fault feature extraction method for an operation and maintenance management and control platform of the present invention;

FIG. 2 is a flowchart of the feature backward search strategy with the embedded secondary feature selection function;

FIG. 3 compares fault classification performance before and after the initial PCA feature extraction;

FIG. 4 compares the performance of the PCA feature extraction method with the embedded secondary feature selection function against traditional feature extraction methods.

Detailed Description

The present invention is further described below with reference to the accompanying drawings and specific embodiments.

The fault feature extraction method for an operation and maintenance management and control platform of the present invention comprises:

1) Principal component analysis feature extraction

Principal Component Analysis (PCA) transforms the sample space: it determines by projection the direction along which the variance of all original feature vectors is largest, and uses that projection direction as the discriminant vector for feature extraction. After the projection, the original samples become low-dimensional samples that are as scattered as possible while the differences of the original high-dimensional sample space are preserved. Suppose the original high-dimensional space contains N samples, X ∈ R^n, each sample being X_i = [x_i1, ..., x_in]^T ∈ R^n, with vector mean M; the corresponding feature vector is X_i = [x_1i, ..., x_ni] ∈ R^n, and the corresponding covariance matrix is given by formula (1),

Figure BDA0002358650860000041

The variance of the sample distribution along an eigenvector is the corresponding eigenvalue of the covariance matrix in formula (1). Diagonalizing the covariance matrix in formula (1) yields the orthogonal matrix of formula (2),

Figure BDA0002358650860000042

Q is expressed as

Figure BDA0002358650860000043

where M is the dimension of the orthogonal matrix Q. PCA derives from Q the eigenvalues λ_1 ≥ λ_2 ≥ … ≥ λ_n and the corresponding orthonormal eigenvectors v_1, v_2, …, v_n. From the eigenvalues of Q and these orthonormal eigenvectors, the orthonormal eigenvectors u_1, u_2, …, u_d of the covariance matrix S are obtained as in formula (3), where u_1, u_2, …, u_d correspond to the d largest non-zero eigenvalues of S,

Figure BDA0002358650860000051

Set t = 95% with u_i > t, so that the cumulative contribution rate of the principal components on the first d axes is 95% of the original data. Any sample x_i is then mapped to the reduced low-dimensional sample space U = {u_1, u_2, …, u_d}; its principal component feature is y_i = (u_1, u_2, …, u_d)^T x_i, and y_i is the sample point in the low-dimensional space. This PCA transformation not only lets the transformed samples represent 95% of the cumulative contribution rate of the principal components but also reduces the dimension of the original space from n to d, with d << n, greatly reducing the dimension of the space and serving as feature extraction;
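For orientation, the quantities named in this step can be written in their usual textbook form. The block below is a sketch of standard PCA relations, not a recovery of the formula images: the 1/N normalization of the covariance matrix and the symbol η(d) for the cumulative contribution rate are assumptions, and the construction of Q in formulas (2) and (3) is not reconstructed.

```latex
% Sample mean and covariance matrix (assumed form of formula (1))
M = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad
S = \frac{1}{N}\sum_{i=1}^{N} (X_i - M)(X_i - M)^{T}

% Orthonormal eigenvectors of S, ordered by eigenvalue
S\,u_j = \lambda_j u_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n

% Cumulative contribution rate of the first d components, compared with t = 95\%
\eta(d) = \frac{\sum_{j=1}^{d} \lambda_j}{\sum_{j=1}^{n} \lambda_j} \ge t

% Projection of a sample onto the low-dimensional space
y_i = (u_1, u_2, \dots, u_d)^{T} x_i
```

Read this way, d is the smallest index for which η(d) reaches the threshold t, and y_i has d components instead of n.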

2) Secondary feature selection

After PCA feature extraction, a secondary feature selection algorithm is embedded to further obtain the optimal feature subset and the key features of the PCA low-dimensional space. The algorithm is based on filter-type association-rule feature selection (Correlation-based Feature Selection, CFS); when evaluating the correlation of sample features it uses a heuristic sequential backward search strategy, and it determines the optimal feature subset by ranking the features by correlation.

CFS uses feature correlation as its evaluation criterion and is a filter-type feature selection algorithm. Under the corresponding search strategy, it aims to reduce the redundancy between attributes while increasing the correlation between attribute features and class attributes, thereby screening out highly redundant attributes and attributes unrelated to the class. Formula (4) is its evaluation criterion: the evaluation of the k features of feature subset S is denoted M_s, the mean correlation between feature attributes and the class is denoted by the symbol in Figure BDA0002358650860000052, and the mean correlation between attributes is denoted by the symbol in Figure BDA0002358650860000053. Formula (4) shows that the candidate feature subset determined by the association-rule feature selection algorithm gives the feature attributes maximum relevance and minimum redundancy, that is, it maximizes the correlation between attribute features and class attributes and reduces the redundancy between attributes: the higher the evaluation value M_s in formula (4), the larger the mean feature-class correlation and the smaller the mean correlation between attributes,

Figure BDA0002358650860000056

Association-rule feature selection uses the information gain algorithm to evaluate the correlation between attributes, and information gain is a symmetric measure. Therefore, when two features with a high-order association, such as W_i and W_j, are present in feature subset S, the symmetric uncertainty method of formula (5) can be used, where U is the feature correlation and H(W) is the entropy of a feature. Formula (6) is then the evaluation function of a feature subset based on the correlation between attributes: as the evaluation value H_s rises, the correlation between features W_j and W_i in feature subset S decreases and their correlation with the class attribute increases,

Figure BDA0002358650860000061

Figure BDA0002358650860000062

Using the CFS algorithm, the secondary feature selection function is embedded in PCA; the CFS evaluation results are then computed under the heuristic sequential backward search strategy, and the optimal feature subset is selected after ranking.
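The search step can be illustrated with a simplified ranked backward elimination. This is a sketch under several assumptions rather than the patented procedure: the subset score is the standard CFS merit from the literature (standing in for the evaluation value of the source), the loop stops at the first deletion that lowers the score instead of testing against a threshold, and the names `subset_merit`, `backward_search`, `relevance` and `redundancy` as well as the example values are hypothetical.

```python
import math
from itertools import combinations


def subset_merit(subset, relevance, redundancy):
    """Standard CFS merit of a subset (assumed stand-in for the source's score)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(relevance[f] for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = sum(redundancy[frozenset(p)] for p in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def backward_search(relevance, redundancy):
    """Start from all features and repeatedly drop the least class-relevant one
    while the subset score does not decrease."""
    # Rank features by their correlation with the class, highest first.
    current = sorted(relevance, key=relevance.get, reverse=True)
    best = subset_merit(current, relevance, redundancy)
    while len(current) > 1:
        candidate = current[:-1]  # delete the feature ranked last
        score = subset_merit(candidate, relevance, redundancy)
        if score < best:
            break  # deletion made the subset worse: keep the current subset
        current, best = candidate, score
    return current, best


# Hypothetical feature-class and feature-feature correlations.
relevance = {"a": 0.9, "b": 0.8, "c": 0.1}
redundancy = {
    frozenset(("a", "b")): 0.1,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.1,
}
selected, score = backward_search(relevance, redundancy)
```

In this example the weakly relevant feature "c" is dropped because removing it raises the score, while dropping "b" as well would lower it, so the search stops with the two-feature subset.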

Referring to FIG. 1, the functional framework of the fault feature extraction method for an operation and maintenance management and control platform of the present invention

After the sample space is transformed by PCA feature extraction, data redundancy is removed more effectively. The feature extraction process is as follows.

1) PCA-based feature extraction on the preprocessed dataset S_0. Following the PCA principle, obtain the covariance matrix S of the high-dimensional sample space X; derive the orthogonal matrix Q of S and its eigenvalues λ_1 ≥ λ_2 ≥ … ≥ λ_n; set the threshold of the cumulative contribution rate t according to the practical requirements of management and control fault feature extraction, thereby obtaining the orthonormal vectors u_i and the low-dimensional sample space U = {u_1, u_2, …, u_d} after feature extraction; and obtain the principal component feature y = (u_1, u_2, …, u_d)^T x_i of each original sample x_i after the space transformation, forming a new candidate feature subset F_1.

2) PCA-based adaptive secondary feature selection. ① If the key features of feature subset F_1 need to be identified after PCA feature extraction of the management and control faults, the secondary feature selection module is entered. Secondary feature selection uses the association-rule feature selection (CFS) algorithm to compute the feature correlations of the extracted feature set, so that the feature attributes have maximum relevance and minimum redundancy, that is, the correlation between attribute features and class attributes is maximized and the redundancy between attributes is reduced, while the key feature subset F_2 after PCA feature extraction is identified. This module improves fault classification accuracy, strengthens the maximum-relevance, minimum-redundancy property of the features, and identifies the key features on top of PCA feature extraction. ② When only fault classification is needed and key features need not be analyzed, this module can be skipped to classify management and control faults quickly.

3) On the basis of the above feature extraction with embedded adaptive secondary feature selection, the resulting optimal feature dataset of management and control faults is used for training, and the fault classification results of the management and control platform are obtained on the test set.

2. Algorithm framework of the fault feature extraction method for an operation and maintenance management and control platform of the present invention

The algorithm performs feature extraction on the original dataset using principal component analysis to form feature set F1, measures the correlation U(Wj,S) between each feature Wj in F1 and the class attribute S, sorts U in descending order, and computes the CFS feature entropy evaluation value Hs1. The search strategy used in the computation is a heuristic sequential backward search, whose flow is shown in Figure 2: at each step the feature with a smaller correlation evaluation value with respect to the class attribute is deleted, and the feature entropy evaluation value Hs2 is recomputed after the deletion. Hs is evaluated in a loop. While it is not less than the threshold, feature subset F1 is updated if Hs2≥Hs1 and left unchanged if Hs2<Hs1; when Hs falls below the threshold, the loop exits and the optimal feature subset F2 is output. On top of PCA feature extraction, this secondary feature selection function module further locks the key features of the optimal feature subset through association rule feature selection. The pseudocode of the secondary feature selection algorithm is as follows:

Input: feature set F1 after PCA feature extraction; Output: optimal feature set F2

1. Select all features after PCA feature extraction to form feature subset F1,

2. Compute the correlation U(Wj,S) between each feature attribute Wj in F1 and the class attribute S,

3. Compute the feature entropy evaluation value Hs,

4. Sort the correlation values U(Wj,S) between each feature and the class attribute in descending order, Hs1←Hs,

5. For Hs1≥δ do,

6. Delete one feature from F1 to form a new feature subset F2, and compute the feature entropy evaluation value Hs2,

7. If Hs2≥Hs1, then F1=F2,

8. else, F1 unchanged,

9. End if,

10. Hs1=Hs2,

11. End For.
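The pseudocode above can be approximated by the following Python sketch. It is a simplified variant under stated assumptions, not the patent's exact loop: each feature is tried for removal once, from least to most relevant, and the removal is kept when the evaluation value does not decrease; the threshold δ is not modeled. The evaluation value uses the textbook CFS merit with symmetric uncertainty as the correlation measure, continuous features are discretized into equal-width bins, and all function names are hypothetical.

```python
import numpy as np

def entropy(symbols):
    """Shannon entropy in bits of discrete symbols (1-D labels or 2-D rows)."""
    symbols = np.asarray(symbols)
    axis = 0 if symbols.ndim == 2 else None
    _, counts = np.unique(symbols, axis=axis, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def symmetric_uncertainty(a, b):
    """U(a, b) = 2 * I(a; b) / (H(a) + H(b)) for two discrete integer arrays."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0.0:
        return 0.0
    mutual_info = h_a + h_b - entropy(np.column_stack([a, b]))
    return 2.0 * mutual_info / (h_a + h_b)

def discretize(column, bins=5):
    """Equal-width binning (an assumption; the source does not specify one)."""
    edges = np.linspace(column.min(), column.max(), bins + 1)[1:-1]
    return np.digitize(column, edges)

def cfs_merit(Xd, y, subset):
    """Textbook CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    r_cf = np.mean([symmetric_uncertainty(Xd[:, j], y) for j in subset])
    if k == 1:
        return float(r_cf)
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = np.mean([symmetric_uncertainty(Xd[:, i], Xd[:, j]) for i, j in pairs])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

def backward_select(X, y, bins=5):
    """Greedy sequential backward elimination driven by the CFS merit."""
    X = np.asarray(X, dtype=float)
    y = np.unique(y, return_inverse=True)[1]          # encode class labels as integers
    Xd = np.column_stack([discretize(X[:, j], bins) for j in range(X.shape[1])])
    relevance = [symmetric_uncertainty(Xd[:, j], y) for j in range(Xd.shape[1])]
    subset = list(range(Xd.shape[1]))
    best = cfs_merit(Xd, y, subset)
    for j in sorted(subset, key=lambda f: relevance[f]):  # least relevant first
        if len(subset) == 1:
            break
        trial = [f for f in subset if f != j]
        score = cfs_merit(Xd, y, trial)
        if score >= best:                                  # keep the removal
            subset, best = trial, score
    return subset, best

# Usage: one informative feature (index 0) and two noise features.
rng = np.random.default_rng(0)
signal = rng.normal(size=400)
X_demo = np.column_stack([signal, rng.normal(size=400), rng.normal(size=400)])
y_demo = (signal > 0).astype(int)
print(backward_select(X_demo, y_demo))
```

Trying the least relevant features first mirrors the descending sort of U(Wj,S) in step 4: the features at the tail of that ordering are the first candidates for deletion.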

Using the fault feature extraction method for an operation and maintenance management and control platform of the present invention, the inventors carried out a comparative analysis of the platform's fault identification performance after feature extraction. First, features were extracted with PCA and the cumulative contribution rate of the principal components was set to 94%, because at threshold t (threshold=94%) the feature dimension drops to 18 while the average fault identification accuracy exceeds 98%, as shown in Figure 3. Note that threshold t determines the cumulative contribution rate of the PCA principal components: although threshold=100% gives the largest cumulative contribution rate and a high identification accuracy, the feature dimension also rises sharply. A higher threshold t is therefore not always better; classifier performance is optimal only when dimension and classification accuracy are balanced. After PCA feature extraction, secondary selection was performed on the 18-dimensional features. The results show that the features of dimensions 1, 2, 5, 6, 7 and 12 have the least redundancy among themselves and the strongest correlation with the class attribute. After screening, they form the key feature subset after feature extraction. Table 1 gives the cross-validation results after secondary feature selection, and Figure 4 shows the comparison for the 6-dimensional key feature subset obtained by secondary feature selection: its average binary classification accuracy is 96.9%, which differs by less than 1.1% from the classification accuracy obtained with PCA feature extraction alone. Because the feature dimension drops to 6, it is 65% lower than the dimension obtained with PCA dimension reduction alone, and the execution time of the classifier model falls by 31.3% on average. During fault classification on the management and control platform, feature extraction and selection can be adapted to specific needs. When only fault classification is required, classification accuracy requirements are high, and key features do not need to be analyzed, the secondary feature selection module can be skipped and the management and control faults classified directly. When key features need to be locked and the requirements on feature dimension are strict, the secondary feature selection module can be entered adaptively to further lock the key features, while the fault classification result of the management and control platform is obtained on the test set. The above demonstrates the feasibility and effectiveness of the fault feature extraction method for an operation and maintenance management and control platform proposed by the present invention.

Table 1 PCA-based secondary fault feature selection (ten-fold cross-validation)

Feature dimension after PCA extraction | Cross-validation (%) | Feature dimension after PCA extraction | Cross-validation (%)
1 | 9 (90%) | 11 | 1 (10%)
2 | 10 (100%) | 12 | 10 (100%)
3 | 5 (50%) | 13 | 0 (0%)
4 | 4 (40%) | 14 | 0 (0%)
5 | 10 (100%) | 15 | 0 (0%)
6 | 10 (100%) | 16 | 0 (0%)
7 | 9 (90%) | 17 | 0 (0%)
8 | 7 (70%) | 18 | 0 (0%)
9 | 1 (10%) | 10 | 0 (0%)
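One way to read Table 1 is as the number of folds, out of ten, in which each PCA dimension was selected. Under that reading, the short sketch below picks the dimensions selected in at least 9 of the 10 folds, which reproduces the six key dimensions named in the description. The cutoff of 9 folds and the names `FOLD_COUNTS` and `stable_features` are assumptions for illustration; the source does not state how the final subset was derived from the table.

```python
# Fold counts copied from Table 1: PCA dimension -> folds (out of 10) that selected it.
FOLD_COUNTS = {
    1: 9, 2: 10, 3: 5, 4: 4, 5: 10, 6: 10, 7: 9, 8: 7, 9: 1,
    10: 0, 11: 1, 12: 10, 13: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 0,
}

def stable_features(fold_counts, min_folds=9):
    """Return the dimensions selected in at least min_folds folds (assumed cutoff)."""
    return sorted(dim for dim, count in fold_counts.items() if count >= min_folds)

print(stable_features(FOLD_COUNTS))  # [1, 2, 5, 6, 7, 12]
```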

In summary, the fault feature extraction method for an operation and maintenance management and control platform of the present invention reduces the feature dimension of each fault sample space, shortens training time, and improves the classification accuracy of the learning classifier. Because PCA feature extraction is performed first, the feature dimension for management and control fault classification is greatly reduced and the computational complexity is lowered. In addition, because adaptive secondary feature selection is performed after feature extraction, the method overcomes the inability of a single feature extraction method to lock key features, reduces redundancy between features, strengthens the correlation between features and the class attribute, and greatly improves fault classification accuracy.

The software program of the present invention is written using automation and computer processing technology that is familiar to those skilled in the art.

The embodiments of the present invention are not exhaustive; simple copies and improvements made by those skilled in the art without creative work still fall within the scope of protection of the present invention.

Claims (1)

1. A fault feature extraction method for an operation and maintenance management and control platform is characterized by comprising the following contents:
1) principal component analysis feature extraction
Principal Component Analysis (PCA) performs a sample space transformation: by projection it determines the projection direction along which all original feature vectors have the largest variance, and it extracts features by locating the discriminant vectors along that projection direction, so that after the projection transformation the original samples become low-dimensional samples that are as dispersed as possible while the differences among the original samples before transformation are preserved; the original high-dimensional sample space is set to contain N samples, X ∈ R^n, each sample being X_i = [x_i1, ..., x_in]^T ∈ R^n; if the mean vector is M, the corresponding feature vector is X_i = [x_1i, ..., x_ni] ∈ R^n, and the corresponding covariance matrix is formula (1),
Figure FDA0002358650850000011
the distribution variance of the samples along a feature vector is the eigenvalue of the covariance matrix of formula (1), and the orthogonal matrix obtained by diagonalizing the covariance matrix of formula (1) is formula (2),
Figure FDA0002358650850000012
Q is denoted as
Figure FDA0002358650850000013
where M is the dimension of the orthogonal matrix Q; based on Q, PCA derives the eigenvalues λ1≥λ2≥…≥λn of the matrix and computes the orthonormal eigenvectors v1, v2, …, vn corresponding to the eigenvalues; from the eigenvalues of the orthogonal matrix Q and the corresponding orthonormal eigenvectors, the orthonormal eigenvectors u1, u2, …, ud of the covariance matrix S are obtained as in formula (3), where u1, u2, …, ud correspond to the d largest non-zero eigenvalues of S,
Figure FDA0002358650850000014
setting t to 95%, the cumulative contribution of the principal components of the spatial samples on the first d axes u_i is 95% of the original data; thus, for any sample x_i mapped into the reduced low-dimensional sample space U = {u1, u2, …, ud}, the feature of x_i is y = (u1, u2, …, ud)^T x_i, and y_i is then the sample point in the low-dimensional space; through the spatial sample transformation of PCA, the transformed samples represent a cumulative contribution rate of 95% of the principal components, and the original spatial dimension is reduced from n to d, where d is smaller than n, so that the spatial dimension is greatly reduced and feature extraction is achieved;
2) quadratic feature selection
After PCA (principal component analysis) feature extraction, a secondary feature selection algorithm is embedded to further obtain the optimal feature subset and the key features of the PCA low-dimensional space; the algorithm is based on filter-type (Filter) association rule feature selection (Correlation-based Feature Selection, CFS), adopts a heuristic sequential backward search strategy when evaluating the correlation of sample features, and determines the optimal feature subset through correlation ranking of the features,
CFS uses feature correlation as its evaluation criterion and is a Filter-type feature selection algorithm; under the corresponding search strategy it aims to reduce the redundancy between attributes and raise the correlation between attribute features and the class attribute, thereby screening out highly redundant attributes and attributes irrelevant to the class; formula (4) is used as the evaluation criterion, the evaluation of the k features of feature subset S is denoted M_s, in which the mean correlation between the feature attributes and the class is denoted by
Figure FDA0002358650850000021
and the mean correlation between attributes is denoted by
Figure FDA0002358650850000022
As shown in formula (4), the candidate feature subset determined by the association rule feature selection algorithm gives the feature attributes maximum correlation and minimum redundancy, that is, the correlation between attribute features and the class attribute is raised as far as possible and the redundancy between attributes is reduced; in other words, the higher the evaluation value M_s in formula (4), the larger the mean correlation between the feature attributes and the class
Figure FDA0002358650850000023
is, and the smaller the mean correlation between attributes
Figure FDA0002358650850000024
is,
Figure FDA0002358650850000025
the correlation between attributes is evaluated with the information gain algorithm in association rule feature selection; because information gain is a symmetric measure, when two highly correlated features exist in feature subset S, such as features W_i and W_j, the symmetric uncertainty method of formula (5) can be used, where the entropy of a feature is H(W) and the feature correlation is U; formula (6) is thus an evaluation function of the feature subset based on the correlation between attributes: when the evaluation value H_s rises, the correlation between features W_j and W_i in feature subset S decreases and their correlation with the class attribute increases,
Figure FDA0002358650850000026
Figure FDA0002358650850000027
a secondary feature selection function is embedded in PCA using the CFS algorithm; the CFS evaluation result is then computed with a heuristic sequential backward search strategy, and the optimal feature subset is screened out after ranking.
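The formula images referenced in the claim (FDA0002358650850000011 and the following) are not reproduced in this text. For orientation only, the block below lists the textbook forms that match the surrounding definitions: the sample covariance matrix, the cumulative contribution rate used to choose d, the CFS merit, and symmetric uncertainty. The 1/N normalization and the symbols \bar{r}_{cf} and \bar{r}_{ff} for the two mean correlations are assumptions, and the patent's own formulas (2), (3) and (6) are not reconstructed here.

```latex
% Sample covariance of N samples X_i with mean vector M (assumed 1/N normalization)
S = \frac{1}{N}\sum_{i=1}^{N} (X_i - M)(X_i - M)^{T}

% Smallest d whose cumulative contribution rate reaches the threshold t
\frac{\sum_{i=1}^{d}\lambda_i}{\sum_{i=1}^{n}\lambda_i} \ge t

% Textbook CFS merit of a subset S with k features
M_s = \frac{k\,\bar{r}_{cf}}{\sqrt{k + k(k-1)\,\bar{r}_{ff}}}

% Symmetric uncertainty between two features W_i and W_j
U(W_i, W_j) = 2\,\frac{H(W_i) + H(W_j) - H(W_i, W_j)}{H(W_i) + H(W_j)}
```

In the merit, the numerator grows with the mean feature-to-class correlation and the denominator grows with the mean feature-to-feature correlation, which is the maximum-correlation, minimum-redundancy behavior the claim describes for M_s.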
CN202010015277.7A 2020-01-07 2020-01-07 A fault feature extraction method for operation and maintenance management and control platform Pending CN111242204A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010015277.7A CN111242204A (en) 2020-01-07 2020-01-07 A fault feature extraction method for operation and maintenance management and control platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010015277.7A CN111242204A (en) 2020-01-07 2020-01-07 A fault feature extraction method for operation and maintenance management and control platform

Publications (1)

Publication Number Publication Date
CN111242204A true CN111242204A (en) 2020-06-05

Family

ID=70864621

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010015277.7A Pending CN111242204A (en) 2020-01-07 2020-01-07 A fault feature extraction method for operation and maintenance management and control platform

Country Status (1)

Country Link
CN (1) CN111242204A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105608004A (en) * 2015-12-17 2016-05-25 云南大学 CS-ANN-based software failure prediction method
CN105703954A (en) * 2016-03-17 2016-06-22 福州大学 Network data flow prediction method based on ARIMA model
CN108319987A (en) * 2018-02-20 2018-07-24 东北电力大学 A kind of filtering based on support vector machines-packaged type combined flow feature selection approach

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
曹杰: "基于SVM的网络流量特征降维与分类方法研究", 《中国博士学位论文全文数据库 信息科技辑》, no. 3, pages 139 - 1 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112085619A (en) * 2020-08-10 2020-12-15 国网上海市电力公司 A Feature Selection Method for Distribution Network Data Optimization
CN112633383A (en) * 2020-12-25 2021-04-09 百度在线网络技术(北京)有限公司 Antique identification method and device, electronic equipment and readable medium
CN112633383B (en) * 2020-12-25 2023-08-18 百度在线网络技术(北京)有限公司 Ancient game authentication method and device, electronic equipment and readable medium
CN113128002A (en) * 2021-03-23 2021-07-16 常州匠心独具智能家居股份有限公司 High-dimensional time series modeling method and system for large-scale distributed system
WO2025130671A1 (en) * 2023-12-19 2025-06-26 北京京东远升科技有限公司 Object detection method and apparatus
CN118247782A (en) * 2024-01-16 2024-06-25 无锡商业职业技术学院 A refrigerator intelligent control method based on image recognition and intelligent refrigerator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200605