CN115470844A - Feature extraction and selection method for multi-source heterogeneous data of power system - Google Patents

Feature extraction and selection method for multi-source heterogeneous data of power system

Info

Publication number
CN115470844A
CN115470844A (application CN202211044436.1A)
Authority
CN
China
Prior art keywords
multi-source heterogeneous data
self-encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211044436.1A
Other languages
Chinese (zh)
Inventor
龙云
刘璐豪
卢有飞
梁雪青
邹时容
赵宏伟
张扬
张少凡
吴任博
陈明辉
蔡燕春
刘璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd filed Critical Guangzhou Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN202211044436.1A priority Critical patent/CN115470844A/en
Publication of CN115470844A publication Critical patent/CN115470844A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the intersecting field of electric power systems and artificial intelligence, and relates to a feature extraction and selection method for multi-source heterogeneous data of an electric power system. The method comprises: taking multi-source heterogeneous data in the electric power system as input data and constructing a training data set; designing a neural network with a different structure for each group of multi-source heterogeneous data, training the self-encoder models with a layer-by-layer training algorithm to obtain trained stacked self-encoder models, and extracting the encoding features of each group of multi-source heterogeneous data through the trained stacked self-encoder models; constructing a fusion layer network and fine-tuning the parameters of the whole stacked self-encoder model; and sparsifying the resulting isomorphic features, calculating the weight of each feature dimension, and screening out the features with higher weights. The method can comprehensively mine data features, reflects the semantics of the actual data, and yields selected features that support the task requirements, thereby greatly improving the completion of actual tasks.

Description

Feature extraction and selection method for multi-source heterogeneous data of power system
Technical Field
The invention belongs to the intersecting field of electric power systems and artificial intelligence, and particularly relates to a feature extraction and selection method for multi-source heterogeneous data of an electric power system.
Background
With the rapid development of information technologies such as computers, networks and databases, information technology has been applied widely across society and has accelerated the informatization of every industry, especially the power industry. In the power industry, with the deep implementation of smart grid construction and the wide use of smart sensing equipment, the data volume is growing explosively, and the industry has entered the big data era. A national power grid runs many business systems, including enterprise management systems such as ERP, MES, CRM and other information systems. These information systems have different development cycles and different developers; their product architectures vary, their coding and data structures differ, and their front-end functions and underlying databases also differ. The data from the various sensing devices and information systems form multi-source heterogeneous power data, which hinders information sharing and the mining of the data's latent value.
The hidden value of multi-source heterogeneous data can be obtained through data analysis, but because of the heterogeneity, the features of the data must first be extracted and expressed in a form usable for computational analysis. Common feature extraction methods include manually designed extraction rules, linear mapping, nonlinear mapping, and the like. Manually designed extraction rules transform the data according to rules built around its structural characteristics; linear mapping methods, such as principal component analysis and linear discriminant analysis, map high-dimensional data to a low-dimensional space. These methods have certain limitations, such as locality of the feature information, high computational complexity, and inability to reflect semantic information.
During the operation of a power system there are many task requirements, such as equipment fault diagnosis, fault prediction and health state assessment; completing these tasks requires the support of power data, and extracting the key information from massive multi-source heterogeneous data and using it to meet these requirements is a major difficulty. Existing feature extraction techniques such as manual extraction and linear mapping have certain limitations, such as one-sided feature extraction and complex calculation.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a feature extraction and selection method for multi-source heterogeneous data of an electric power system, which can comprehensively mine data features and reflect the semantics of actual data. The selected characteristics can support the task requirements, and the completion degree of the actual task is greatly improved.
The invention can be achieved by adopting the following technical scheme:
a method for extracting and selecting features of multi-source heterogeneous data of a power system, the method comprising:
s1, constructing a training data set by taking multi-source heterogeneous data in an electric power system as input data;
s2, designing neural networks with different structures for each group of multi-source heterogeneous data, training a self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model;
s3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model;
and S4, performing sparsification processing on the obtained isomorphic features, calculating the weight of each feature dimension, and screening out features with higher weights.
In a preferred technical solution, the step S2 specifically includes the steps of:
constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data;
when each hidden layer of the nth stacked self-encoder is trained, for current input heterogeneous data, carrying out nonlinear transformation on the hidden layer through a weight matrix and an activation function to obtain an output hidden expression;
decoding the implicit expression, transforming and reconstructing it through a weight matrix and an activation function to obtain a reconstructed output, and minimizing the error between the original input and the reconstructed output using gradient descent; when the error reaches 0, training again with the reconstructed output as the original input of the next layer to obtain the stacked self-encoder model.
The step S3 comprises the following specific steps:
taking a feedforward neural network as a fusion layer network, performing characteristic fusion on a plurality of groups of multi-source heterogeneous data in the fusion layer network, wherein the feedforward neural network is connected with a stacking self-encoder network of each group of multi-source heterogeneous data;
the fusion layer network is externally connected with a softmax classifier, and the label class probability of the input vector is calculated;
the stacked self-encoder model parameters are fine-tuned using a gradient descent method.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention provides a method for extracting and selecting characteristics of multi-source heterogeneous data of an electric power system, which comprises the steps of constructing an artificial neural network by using a deep learning method, designing neural networks with different structures for each group of multi-source heterogeneous data, training a self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, so as to extract and select the characteristics of the multi-source heterogeneous data in the electric power system, comprehensively mining the characteristics of the data, reflecting the semantic property of actual data, supporting the task requirement by the selected characteristics, and greatly improving the completion degree of the actual task.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments are briefly introduced below. It is obvious that the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to an embodiment of the present invention;
FIG. 2 is a block diagram of the structure of a stacked self-encoder in an embodiment of the invention;
FIG. 3 is a flowchart of a stacked self-encoder algorithm routine in an embodiment of the present invention;
FIG. 4 is a diagram of a fusion layer structure in an example of the present invention.
Detailed Description
The technical solutions of the present invention will be described in further detail with reference to the accompanying drawings and examples, and it is obvious that the described examples are some, but not all, examples of the present invention, and the embodiments of the present invention are not limited thereto. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Example 1:
as shown in fig. 1, a specific embodiment of the present invention is a method for extracting and selecting multi-source heterogeneous data features of an electrical power system, including the following steps:
s1, introducing multi-source heterogeneous data in the power system as input data, and constructing a training data set. The multi-source heterogeneous data comprises voltage, current, active power output, on-off state and the like detected by each sensor, and also comprises audio data and image data brought by an audio and video monitoring system, and further comprises a text record of system operation.
The input data is specifically defined as:
X = [x_1, x_2, …, x_n]
x_n = {data_1, data_2, …, data_m}
wherein X represents the collection of multi-source heterogeneous data, x_n represents the nth group of multi-source heterogeneous data, and data_m represents the data of the mth dimension of the nth group; owing to the heterogeneity, the dimensionality of each group of multi-source heterogeneous data is different.
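As a hedged illustration of this definition (the group sizes and values below are invented for the example, not taken from the patent), the training set can be held as a list of arrays whose second dimension differs per group:

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical groups with a different dimensionality per group:
x_1 = rng.normal(size=(100, 8))   # e.g. sensor channels (voltage, current, ...)
x_2 = rng.normal(size=(100, 32))  # e.g. flattened image features
x_3 = rng.normal(size=(100, 16))  # e.g. encoded text-record features

X = [x_1, x_2, x_3]  # X = [x_1, x_2, ..., x_n] with n = 3

print([x.shape[1] for x in X])  # each group has its own dimension m
```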
S2, designing neural networks with different structures for each group of multi-source heterogeneous data through the stacked self-encoder model, training the self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model.
The self-encoder is a neural network model whose function is to perform feature learning on input information by taking that input as its own learning target. A stacked self-encoder stacks multiple self-encoders, using the hidden-layer output of each self-encoder as the input of the next, so as to increase the representational capacity of the model. As shown in FIG. 2, the stacked self-encoder model consists of a multi-layer neural network: the input x passes through 3 hidden layers h_1, h_2, h_3 to finally obtain a reconstructed 4th output layer h_4, whose meaning and structure are the same as those of the input x. It should be noted that this network structure is presented only as an example and is not representative of the network structure of all self-encoders. FIG. 3 shows a flow chart of the stacked self-encoder algorithm.
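The FIG. 2 structure described above can be sketched as follows; this is a minimal illustration with invented layer sizes and random weights, not the patent's actual network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# x -> h1 -> h2 -> h3 -> h4; the 4th (output) layer has the input's size.
sizes = [10, 8, 6, 8, 10]
weights = [rng.normal(scale=0.1, size=(a, b)) for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

x = rng.normal(size=10)
layer = x
for W, b in zip(weights, biases):
    layer = sigmoid(layer @ W + b)  # each hidden output feeds the next layer
h4 = layer  # reconstruction with the same shape as the input x

print(h4.shape == x.shape)
```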
The step S2 specifically comprises the following steps:
n heterogeneous stacked self-encoders are constructed according to the n input groups of multi-source heterogeneous data; that is, they differ in the number of hidden layers and neuron nodes. The number of hidden layers in the neural network for the nth group of heterogeneous data is defined as m_n, the ith hidden layer is denoted h_i^n, and the connection weight between the ith hidden layer and the previous layer is denoted W_i^n.
When training each hidden layer of the nth stacked self-encoder, the currently input heterogeneous data x_n undergoes, at the hidden layer, a nonlinear transformation through the weight matrix W_1 and the activation function f(·) to obtain the output implicit expression:
h = f(W_1 x_n + c)
wherein h is the output implicit expression, c is the bias term, and the activation function f(·) is the sigmoid function.
The implicit expression is then decoded: it is transformed and reconstructed through a weight matrix and an activation function to obtain a reconstructed output, and the error between the original input and the reconstructed output is minimized using gradient descent; when the error reaches 0, the reconstructed output is used as the original input of the next layer for retraining, yielding the stacked self-encoder model.
Specifically, the implicit expression is decoded: it is transformed and reconstructed through the weight matrix W_2 and the activation function g(·) to obtain the reconstructed output:
x̂ = g(W_2 h + b)
wherein x̂ is the reconstructed output of the self-encoder and b is the bias term.
The error between the original input and the reconstructed output is:
L(x, x̂) = (1/2)‖x − x̂‖²
wherein x is the original input and x̂ is the reconstructed output of the self-encoder.
This error is the optimization target and is solved by gradient descent; after the partial derivatives are computed, the weights and biases can be updated:
W ← W − lr · ∂L/∂W
b ← b − lr · ∂L/∂b
wherein lr is the learning rate; by continuously updating the weights and biases, the reconstruction error L reaches 0 and the optimization is complete.
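The training procedure of this step (encode with h = f(W_1 x + c), reconstruct, and update the weights and biases by gradient descent on the reconstruction error) can be sketched for a single self-encoder layer as follows. All dimensions, the learning rate and the epoch count are illustrative assumptions; in practice the error converges toward 0 rather than reaching it exactly:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae_layer(x, hidden_dim, lr=0.5, epochs=300, seed=0):
    """One self-encoder layer trained by gradient descent on the
    reconstruction error L = (1/2)||x_hat - x||^2, averaged over samples."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    W1 = rng.normal(scale=0.1, size=(d, hidden_dim))  # encoder weights
    W2 = rng.normal(scale=0.1, size=(hidden_dim, d))  # decoder weights
    c = np.zeros(hidden_dim)  # encoder bias
    b = np.zeros(d)           # decoder bias
    n = x.shape[0]
    for _ in range(epochs):
        h = sigmoid(x @ W1 + c)       # implicit expression h = f(W1 x + c)
        x_hat = sigmoid(h @ W2 + b)   # reconstructed output
        err = x_hat - x
        g2 = err * x_hat * (1 - x_hat)   # gradient at the decoder output
        g1 = (g2 @ W2.T) * h * (1 - h)   # gradient back through the encoder
        W2 -= lr * (h.T @ g2) / n        # W <- W - lr * dL/dW
        b -= lr * g2.mean(axis=0)        # b <- b - lr * dL/db
        W1 -= lr * (x.T @ g1) / n
        c -= lr * g1.mean(axis=0)
    h = sigmoid(x @ W1 + c)
    x_hat = sigmoid(h @ W2 + b)
    loss = 0.5 * np.mean(np.sum((x_hat - x) ** 2, axis=1))
    return W1, c, loss

rng = np.random.default_rng(1)
x = rng.uniform(0.2, 0.8, size=(64, 10))  # toy stand-in for one data group
_, _, loss_init = train_ae_layer(x, hidden_dim=6, epochs=0)
W1, c, loss_after = train_ae_layer(x, hidden_dim=6, epochs=300)
print(loss_after < loss_init)
```

After training, `sigmoid(x @ W1 + c)` gives the layer's encoding, which in the patent's scheme becomes the input of the next layer.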
After the optimization is complete, the reconstructed output of this layer can be used as the original input of the next layer for retraining, finally yielding the stacked self-encoder model. Each group of multi-source heterogeneous data is input into its corresponding stacked self-encoder model to obtain the corresponding feature encoding output for use in subsequent calculation tasks; the feature expression of the multi-source heterogeneous data is thus extracted through the model.
And S3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model.
The step S3 specifically includes the following steps:
s31, taking a feedforward neural network as a fusion layer network, performing feature fusion on multiple groups of multi-source heterogeneous data in the fusion layer network, wherein the feedforward neural network is connected with a stacking self-encoder network of each group of multi-source heterogeneous data, and the weight is T n And the weights are shared so as to eliminate the heterogeneity and strong relevance of the extracted features. As shown in fig. 4, a block diagram of a fusion layer and stacked self-encoder.
S32, a softmax classifier is externally connected to the fusion layer network to calculate the label class probability of the input vector h_n. The softmax classifier computes a score for each label class and maps all scores to probability values, i.e., the classification probabilities. Defining the top-layer neurons of the stacked self-encoder of the nth group of multi-source heterogeneous data as h_n and the label information of the fusion layer as p, a loss function can be defined as:
J = −(1/M) Σ_{i=1}^{M} log P(Y = y^(i) | h_n^(i))
wherein n represents the number of groups of multi-source heterogeneous data, M represents the number of training samples, b is a bias term, y^(i) represents the label of sample x^(i), Y denotes a probability event in the conditional probability density function, Y = y^(i) means that the probability event at this time is y^(i), and h_n^(i) represents the top-layer output of the nth sub-network for input x^(i). For a k-class task, the probability that an input vector h_n belongs to label class i is:
P(y = i | h_n) = exp(T_i h_n + b_i) / Σ_{l=1}^{k} exp(T_l h_n + b_l)
wherein b_i and b_l represent bias vectors, T_l represents the lth row vector of the weight matrix T, and T_i represents the ith row vector of the weight matrix T.
And S33, fine tuning the model parameters of the stacked self-encoder by using a gradient descent method, wherein the whole network adopts supervised fine tuning. Preferably, the parameters of each stacked encoder are iteratively adjusted in turn, one model is adjusted each time, and the parameters of other model networks are fixed until the parameters of all stacked self-encoder models are adjusted.
The model loss function obtained in the previous step, namely the optimization objective, is:
min J = −(1/M) Σ_{i=1}^{M} log P(Y = y^(i) | h_n^(i))
It is solved by gradient descent; after the partial derivatives are computed, the parameters of the model, namely the weights W_1, W_2 and the biases b, c described above, are updated.
And S4, carrying out sparsification treatment on the obtained isomorphic features by adopting a structured sparsification method, calculating the weight of each feature dimension, and screening out the features with higher weight.
Firstly, defining data of isomorphic characteristics, and then performing formula expression and calculation according to defined variables.
Specifically, the isomorphic feature expression obtained in step S3 is defined as (x_i, y_i), where x_i ∈ R^p is a p-dimensional feature vector and y_i is its label; X = (x_1, x_2, …, x_n) represents the input training data matrix and Y = (y_1, y_2, …, y_n) represents the label matrix. The p-dimensional feature vector is divided into K feature groups, with K_j representing the number of feature dimensions of the jth group; β_l = (β_l1, β_l2, …, β_lj) represents the weight coefficient vector for the lth class, with β_lj the coefficient sub-vector corresponding to the jth group.
The features of the lth class are selected through the objective function
min_{β_l} [ L(β_l) + R(β_l) ]
to obtain the features whose weights are not 0, wherein L(β_l) is the loss function and R(β_l) is the regularization term; the loss function is the same as that of step S32:
L(β_l) = −(1/M) Σ_{i=1}^{M} log P(Y = y^(i) | x^(i))
The regularization term is described as follows:
R(β_l) = λ_1 Σ_{j=1}^{K} ω_j ‖β_lj‖_2 + λ_2 ‖β_l‖_1
wherein λ_1 and λ_2 are regularization term coefficients and hyper-parameters, ω_j is the weight of the jth feature group, y^(i) represents the label corresponding to the sample input x^(i), T is the weight matrix, and b is the bias term. The regularization term comprises two parts: the L1 norm serves as a penalty term, and the L2 norm produces the sparsifying effect. Through these regular terms, sparsity is induced simultaneously within the same group of feature vectors and across the feature groups; the weights of some feature dimensions become 0, and the features whose weights are not 0 are selected.
It is noted that the features x_i here are the encoded feature data extracted by the stacked self-encoder model; they can be used for computational tasks but do not correspond to feature attributes in the original data, such as voltage or current. The weight β_l holds the weight coefficients of all the features during the optimization of the objective function; after initialization, it is continuously updated during the optimization process.
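The structured sparsity term R(β_l) = λ_1 Σ_j ω_j ‖β_lj‖_2 + λ_2 ‖β_l‖_1 and the final selection of non-zero-weight dimensions can be illustrated as follows; the weight vector β below is a made-up example of an already-optimized result, not output of the patent's method:

```python
import numpy as np

def regularizer(beta, groups, lam1, lam2, omega):
    """R(beta) = lam1 * sum_j omega_j * ||beta_j||_2 + lam2 * ||beta||_1.
    `groups` is a list of index arrays partitioning the p dimensions."""
    group_part = sum(w * np.linalg.norm(beta[g]) for g, w in zip(groups, omega))
    return lam1 * group_part + lam2 * np.abs(beta).sum()

def select_features(beta, tol=1e-8):
    """Keep the feature dimensions whose weight is not 0."""
    return np.flatnonzero(np.abs(beta) > tol)

# Toy weight vector: sparsity has zeroed out the whole second group
# and one dimension of the first group.
beta = np.array([0.9, 0.0, -0.4, 0.0, 0.0, 0.0])
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]  # K = 2 feature groups
omega = [1.0, 1.0]

print(select_features(beta))                 # indices of selected features
print(regularizer(beta, groups, 1.0, 1.0, omega))
```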
After the selected features are obtained, they can be input into a task model to verify the effect according to the actual task requirements. Taking a fault classification task as an example in this embodiment, multi-source heterogeneous power-system data such as on-off states, active and reactive power outputs, text records and images are processed through the above steps to realize feature extraction and selection; the selected features are input into the softmax classifier for calculation, the probability of equipment failure is obtained, and the accuracy of the task is greatly improved.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (8)

1. A feature extraction and selection method for multi-source heterogeneous data of an electric power system is characterized by comprising the following steps:
s1, introducing multi-source heterogeneous data in an electric power system as input data, and constructing a training data set;
s2, designing neural networks with different structures for each group of multi-source heterogeneous data through a stacked self-encoder model, training the self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model;
s3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model;
and S4, performing sparsification treatment on the obtained isomorphic features by adopting a structured sparse method, calculating to obtain the weight of each feature dimension, screening out features with higher weight, and completing the feature extraction and selection of the multi-source heterogeneous data.
2. The method for extracting and selecting the characteristics of the multi-source heterogeneous data of the electric power system according to claim 1, wherein the multi-source heterogeneous data comprises voltage, current, active output and on-off state detected by each sensor, and further comprises audio data, image data and text records of operation of the electric power system in an audio and video monitoring system.
3. The method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to claim 1, wherein the stacked self-encoder is a stacked plurality of self-encoders, and an output of a hidden layer of each self-encoder is used as an input of another connected self-encoder.
4. The method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to claim 3, wherein the step S2 specifically comprises the steps of:
constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data;
when each hidden layer of the nth stacked self-encoder is trained, for current input heterogeneous data, carrying out nonlinear transformation on the hidden layer through a weight matrix and an activation function to obtain an output implicit expression;
decoding the implicit expression, transforming and reconstructing by a weight matrix and an activation function to obtain reconstructed output, and solving the error between the original input and the reconstructed output by adopting a gradient descent method; when the error is 0, training again by taking the reconstructed output as the original input of the next layer to obtain a trained stacked self-encoder model;
and inputting each group of multi-source heterogeneous data into the trained stacked self-encoder model to obtain corresponding characteristic encoding output.
5. The method for extracting and selecting multi-source heterogeneous data features of the power system according to claim 4, wherein the implicit expression of the output is as follows:
h = f(W_1 x_n + c)
where h is the implicit expression of the output, c is the bias term, W_1 is the weight matrix, and f(·) is the sigmoid function;
the reconstructed output is:
x̂ = g(W_2 h + b)
wherein x̂ is the reconstructed output of the self-encoder, b is the bias term, W_2 is the weight matrix, and g(·) is the activation function.
6. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 1, wherein the step S3 specifically comprises the following steps:
s31, taking a feedforward neural network as a fusion layer network, performing feature fusion on multiple groups of multi-source heterogeneous data in the fusion layer network, and connecting the feedforward neural network with a stacked self-encoder network of each group of multi-source heterogeneous data;
s32, externally connecting a softmax classifier to the fusion layer network, and calculating the label category probability of the input vector;
and S33, fine adjustment is carried out on the parameters of the stacked self-encoder model by using a gradient descent method.
7. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 6, wherein the fine-tuning of the parameters of the stacked self-encoder model by using a gradient descent method comprises the following steps: and iteratively adjusting the parameters of each self-encoder model in turn, adjusting one self-encoder model each time, and fixing the parameters of other self-encoder models until the parameters of all the self-encoder models are adjusted.
8. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 1, wherein the step S4 comprises:
defining the data with isomorphic features: each sample x_i ∈ R^p is a p-dimensional feature vector and y_i is its label; X = (x_1, x_2, …, x_n) denotes the input training data matrix and Y = (y_1, y_2, …, y_n) the label matrix; the p-dimensional feature vector is divided into K feature groups, with K_j denoting the feature dimension of the j-th group; β_l = (β_{l1}, β_{l2}, …, β_{lK}) denotes the weight coefficient vector for the l-th class, where β_{lj} is the sub-vector of coefficients corresponding to the j-th group;
performing feature selection for the l-th class by minimizing an objective function [formula image FDA0003821930020000026 in the original filing], retaining the features whose weights are not 0; the objective function [formula images FDA0003821930020000027, FDA0003821930020000028] is the sum of a loss function [FDA0003821930020000029] and a regular term [FDA00038219300200000210], whose explicit forms are given by [FDA00038219300200000211, FDA00038219300200000212],
wherein λ_1 and λ_2 are the regular-term coefficients, the hyperparameter ω_j is the weight of the j-th feature group, y^(i) is the label of the i-th sample, T is a weight matrix, and b is a bias term.
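The description of step S4 (a per-group weight ω_j, two regular-term coefficients λ_1 and λ_2, and selection of the features with non-zero weights) matches the sparse-group-lasso family. Since the formula images did not survive extraction, the objective below, squared loss plus λ_1-weighted group norms plus a λ_2 ℓ1 term solved by proximal gradient descent, is an assumed stand-in rather than the filed equation:

```python
import numpy as np

def sparse_group_lasso(X, y, groups, lam1=0.1, lam2=0.01, lr=0.05, n_iter=500):
    """Proximal gradient descent on
        (1/2n)||y - X b||^2 + lam1 * sum_j w_j ||b_j||_2 + lam2 * ||b||_1,
    an assumed sparse-group-lasso stand-in for the objective of claim 8."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta = beta - lr * X.T @ (X @ beta - y) / n   # gradient step on the loss
        # l1 proximal step: elementwise soft threshold
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam2, 0.0)
        # groupwise proximal step: shrink (or zero) each group's norm
        for g in np.unique(groups):
            idx = groups == g
            w_g = np.sqrt(idx.sum())   # a common choice for the group weight
            norm = np.linalg.norm(beta[idx])
            if norm > 0.0:
                beta[idx] *= max(0.0, 1.0 - lr * lam1 * w_g / norm)
    return beta

# toy data: only the first feature group (features 0-2) carries signal
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
beta_true = np.array([1.5, -2.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta_true
groups = np.array([0, 0, 0, 1, 1, 1])
beta = sparse_group_lasso(X, y, groups)  # the uninformative group shrinks to ~0
```

Features would then be selected, as the claim states, by keeping exactly the coordinates of `beta` whose weights are non-zero.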
CN202211044436.1A 2022-08-30 2022-08-30 Feature extraction and selection method for multi-source heterogeneous data of power system Pending CN115470844A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211044436.1A CN115470844A (en) 2022-08-30 2022-08-30 Feature extraction and selection method for multi-source heterogeneous data of power system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211044436.1A CN115470844A (en) 2022-08-30 2022-08-30 Feature extraction and selection method for multi-source heterogeneous data of power system

Publications (1)

Publication Number Publication Date
CN115470844A true CN115470844A (en) 2022-12-13

Family

ID=84368436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211044436.1A Pending CN115470844A (en) 2022-08-30 2022-08-30 Feature extraction and selection method for multi-source heterogeneous data of power system

Country Status (1)

Country Link
CN (1) CN115470844A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116258579A (en) * 2023-04-28 2023-06-13 成都新希望金融信息有限公司 Training method of user credit scoring model and user credit scoring method
CN116258579B (en) * 2023-04-28 2023-08-04 成都新希望金融信息有限公司 Training method of user credit scoring model and user credit scoring method
CN116303687A (en) * 2023-05-12 2023-06-23 烟台黄金职业学院 Intelligent management method and system for engineering cost data
CN116502092A (en) * 2023-06-26 2023-07-28 国网智能电网研究院有限公司 Semantic alignment method, device, equipment and storage medium for multi-source heterogeneous data

Similar Documents

Publication Publication Date Title
Tian et al. An intrusion detection approach based on improved deep belief network
CN115470844A (en) Feature extraction and selection method for multi-source heterogeneous data of power system
CN113905391B (en) Integrated learning network traffic prediction method, system, equipment, terminal and medium
Dai et al. Incremental learning using a grow-and-prune paradigm with efficient neural networks
CN112967088A (en) Marketing activity prediction model structure and prediction method based on knowledge distillation
US20230267302A1 (en) Large-Scale Architecture Search in Graph Neural Networks via Synthetic Data
CN115661550A (en) Graph data class imbalance classification method and device based on generation countermeasure network
Guo et al. PILAE: A non-gradient descent learning scheme for deep feedforward neural networks
CN111178986B (en) User-commodity preference prediction method and system
Sokkhey et al. Development and optimization of deep belief networks applied for academic performance prediction with larger datasets
Liu et al. EACP: An effective automatic channel pruning for neural networks
CN116206158A (en) Scene image classification method and system based on double hypergraph neural network
Leke et al. A deep learning-cuckoo search method for missing data estimation in high-dimensional datasets
Bhadoria et al. Bunch graph based dimensionality reduction using auto-encoder for character recognition
Liu et al. Uncertainty propagation method for high-dimensional black-box problems via Bayesian deep neural network
Cheng et al. Active broad learning with multi-objective evolution for data stream classification
Delgado et al. Preliminary Results of Applying Transformers to Geoscience and Earth Science Data
Guo et al. End-to-end variational graph clustering with local structural preservation
Zhang [Retracted] Analysis of College Students’ Network Moral Behavior by the History of Ideological and Political Education under Deep Learning
Hemkiran et al. Design of Automatic Credit Card Approval System Using Machine Learning
Lazebnik et al. Knowledge-integrated autoencoder model
Wijayanto et al. Predicting future potential flight routes via inductive graph representation learning
CN115098787B (en) Article recommendation method based on cosine ranking loss and virtual edge map neural network
KR102579685B1 (en) Method for constructing facial movement control parameters of digital human with control information
Chen et al. LPR‐MLP: A Novel Health Prediction Model for Transmission Lines in Grid Sensor Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination