CN115470844A - Feature extraction and selection method for multi-source heterogeneous data of power system - Google Patents
- Publication number
- CN115470844A CN115470844A CN202211044436.1A CN202211044436A CN115470844A CN 115470844 A CN115470844 A CN 115470844A CN 202211044436 A CN202211044436 A CN 202211044436A CN 115470844 A CN115470844 A CN 115470844A
- Authority
- CN
- China
- Prior art keywords
- source heterogeneous
- heterogeneous data
- data
- self
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention lies at the intersection of electric power systems and artificial intelligence and relates to a method for extracting and selecting features from multi-source heterogeneous power-system data. The method takes the multi-source heterogeneous data in the electric power system as input data and constructs a training data set; designs neural networks with different structures for each group of multi-source heterogeneous data, trains the self-encoder models with a layer-by-layer training algorithm to obtain trained stacked self-encoder models, and extracts the encoding features of each group through the trained models; constructs a fusion-layer network and fine-tunes the parameters of the whole stacked self-encoder model; and sparsifies the obtained isomorphic features, calculates the weight of each feature dimension, and screens out the features with higher weights. The method mines data features comprehensively, reflects the semantics of the actual data, and yields selected features that support the task requirements, greatly improving the degree of completion of practical tasks.
Description
Technical Field
The invention lies at the intersection of electric power systems and artificial intelligence, and particularly relates to a feature extraction and selection method for multi-source heterogeneous data of the electric power system.
Background
With the rapid development of information technologies such as computers, networks and databases, informatisation has spread through every field of society and accelerated across all industries, especially the power industry. With the deep implementation of smart-grid construction and the wide use of smart sensing equipment, the volume of power data is growing explosively, and the power industry has entered the big-data era. The national power grid operates many business systems, including enterprise-management systems such as ERP, MES and CRM. These information systems have different development cycles and different developers; their product architectures are diverse, their data encodings differ, and their front-end functions and underlying databases differ as well. The data from the various sensing devices and information systems form multi-source heterogeneous power data, which hinders information sharing and the mining of the data's latent value.
The hidden value of multi-source heterogeneous data can be obtained through data analysis, but because of its heterogeneity the data's features must first be extracted and expressed in a form usable for computational analysis. Common feature-extraction methods include manually designed extraction rules, linear mapping, non-linear mapping and the like. Manually designed extraction rules transform the data according to its structural characteristics; linear mappings such as principal component analysis and linear discriminant analysis project high-dimensional data into a low-dimensional space. These methods have certain limitations, such as capturing only part of the feature information, high computational complexity and the inability to reflect semantic information.
During the operation of a power system there are many task requirements, such as equipment fault diagnosis, fault prediction and health-state assessment. Completing these tasks requires the support of power data, and extracting the key information from massive multi-source heterogeneous data to meet these requirements is a major difficulty. Existing feature-extraction techniques such as manual extraction and linear mapping have certain limitations, including one-sided feature extraction and complex calculation.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a feature extraction and selection method for multi-source heterogeneous data of an electric power system, which can comprehensively mine data features and reflect the semantics of actual data. The selected characteristics can support the task requirements, and the completion degree of the actual task is greatly improved.
The invention can be achieved by adopting the following technical scheme:
a method for extracting and selecting features of multi-source heterogeneous data of a power system, the method comprising:
s1, constructing a training data set by taking multi-source heterogeneous data in an electric power system as input data;
s2, designing neural networks with different structures for each group of multi-source heterogeneous data, training a self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model;
s3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model;
and S4, performing sparsification processing on the obtained isomorphic features, calculating the weight of each feature dimension, and screening out features with higher weights.
In a preferred technical solution, the step S2 specifically includes the steps of:
constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data;
when each hidden layer of the n-th stacked self-encoder is trained, the currently input heterogeneous data undergo a nonlinear transformation in the hidden layer through a weight matrix and an activation function, yielding an output hidden representation;
decoding the hidden representation, transforming and reconstructing it through a weight matrix and an activation function to obtain the reconstructed output, and minimising the error between the original input and the reconstructed output by the gradient descent method; when the error converges to 0, the hidden representation is taken as the input of the next layer for further training, obtaining the stacked self-encoder model.
The step S3 comprises the following specific steps:
taking a feedforward neural network as a fusion layer network, performing characteristic fusion on a plurality of groups of multi-source heterogeneous data in the fusion layer network, wherein the feedforward neural network is connected with a stacking self-encoder network of each group of multi-source heterogeneous data;
the fusion layer network is externally connected with a softmax classifier, and the label class probability of the input vector is calculated;
the stacked self-encoder model parameters are fine-tuned using a gradient descent method.
Compared with the prior art, the invention has the following advantages and beneficial effects:
the invention provides a method for extracting and selecting characteristics of multi-source heterogeneous data of an electric power system, which comprises the steps of constructing an artificial neural network by using a deep learning method, designing neural networks with different structures for each group of multi-source heterogeneous data, training a self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, so as to extract and select the characteristics of the multi-source heterogeneous data in the electric power system, comprehensively mining the characteristics of the data, reflecting the semantic property of actual data, supporting the task requirement by the selected characteristics, and greatly improving the completion degree of the actual task.
Drawings
In order to illustrate the embodiments and technical solutions of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to an embodiment of the present invention;
FIG. 2 is a block diagram of the structure of a stacked self-encoder in an embodiment of the invention;
FIG. 3 is a flowchart of a stacked self-encoder algorithm routine in an embodiment of the present invention;
FIG. 4 is a diagram of a fusion layer structure in an example of the present invention.
Detailed Description
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and examples. The described examples are obviously only some, not all, of the examples of the present invention, and the embodiments of the present invention are not limited thereto. All other embodiments obtained by a person skilled in the art without creative effort on the basis of the embodiments of the present invention fall within the protection scope of the present invention.
Example 1:
as shown in fig. 1, a specific embodiment of the present invention is a method for extracting and selecting multi-source heterogeneous data features of an electrical power system, including the following steps:
s1, introducing multi-source heterogeneous data in the power system as input data, and constructing a training data set. The multi-source heterogeneous data comprises voltage, current, active power output, on-off state and the like detected by each sensor, and also comprises audio data and image data brought by an audio and video monitoring system, and further comprises a text record of system operation.
The input data is specifically defined as:
$$X = [x_1, x_2, \dots, x_n]$$
$$x_n = \{\mathrm{data}_1, \mathrm{data}_2, \dots, \mathrm{data}_m\}$$
where $X$ represents the collection of multi-source heterogeneous data, $x_n$ represents the n-th group of multi-source heterogeneous data, and $\mathrm{data}_m$ represents the data of the m-th dimension of the n-th group; owing to the heterogeneity, the dimensionality differs from group to group.
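As an illustrative sketch (not part of the patented method), the input definition above can be mirrored in NumPy. The group dimensions and sample count below are invented for the example; the only property taken from the text is that each group shares the sample axis while its feature dimension differs:

```python
import numpy as np

# Hypothetical instance of X = [x_1, ..., x_n]: each group x_k holds
# samples whose feature dimension m differs across groups because the
# sources (sensors, audio, images, text) are heterogeneous.
rng = np.random.default_rng(0)
n_samples = 100
group_dims = [12, 48, 300]   # invented: sensor scalars, audio features, image features
X = [rng.standard_normal((n_samples, m)) for m in group_dims]

# Every group shares the sample axis but not the feature axis.
assert all(x.shape[0] == n_samples for x in X)
```

A list of arrays, rather than a single matrix, is the natural container here, because the heterogeneity rules out a common feature dimension.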
S2, designing neural networks with different structures for each group of multi-source heterogeneous data through the stacked self-encoder model, training the self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model.
The self-encoder is a neural-network model whose function is to learn features of the input by taking the input itself as the learning target. A stacked self-encoder stacks several self-encoders, with the output of each self-encoder's hidden layer serving as the input of the next self-encoder, so as to increase the representational capacity of the model. As shown in FIG. 2, the stacked self-encoder model consists of a multi-layer neural network: the input $x$ passes through three hidden layers $h_1$, $h_2$, $h_3$ to finally obtain the reconstructed fourth output layer $h_4$, whose meaning and structure are the same as those of the input $x$. It should be noted that this network structure is presented only as an example and does not represent the structure of every self-encoder. FIG. 3 shows the flow chart of the stacked self-encoder algorithm.
The step S2 specifically comprises the following steps:
and constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data. And constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data, namely the hidden layers and the neuron nodes are different in number. The number of hidden layers contained in the neural network of the n-th group of heterogeneous data is defined as m n The ith hidden layer is expressed asThe connection weight between the ith hidden layer and the previous layer is expressed as
Each hidden layer of the n-th stacked self-encoder is trained as follows. The currently input heterogeneous data $x_n$ undergo, in the hidden layer, a nonlinear transformation through the weight matrix $W_1$ and the activation function $f(\cdot)$ to obtain the output hidden representation:
$$h = f(W_1 x_n + c)$$
where $h$ is the hidden representation of the output, $c$ is the bias term, and the activation function $f(\cdot)$ is the sigmoid function.
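A minimal sketch of this encoding step, assuming only the stated form $h = f(W_1 x_n + c)$ with a sigmoid $f$; the dimensions and the random initialisation are invented for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x_n, W1, c):
    # hidden representation h = f(W1 x_n + c), f being the sigmoid
    return sigmoid(W1 @ x_n + c)

rng = np.random.default_rng(1)
d_in, d_hidden = 8, 4                       # invented dimensions
W1 = 0.1 * rng.standard_normal((d_hidden, d_in))
c = np.zeros(d_hidden)
x_n = rng.standard_normal(d_in)
h = encode(x_n, W1, c)                      # hidden code, entries in (0, 1)
```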
The hidden representation is then decoded, transformed and reconstructed through a weight matrix and an activation function to obtain the reconstructed output, and the error between the original input and the reconstructed output is minimised by the gradient descent method; when the error converges to 0, the hidden representation of the layer serves as the input of the next layer for further training, finally yielding the stacked self-encoder model.
Specifically, the hidden representation is decoded: it is transformed and reconstructed through the weight matrix $W_2$ and an activation function $g(\cdot)$ to obtain the reconstructed output
$$\hat{x}_n = g(W_2 h + b)$$
The error between the original input and the reconstructed output is
$$J = \frac{1}{2}\lVert x_n - \hat{x}_n \rVert^2$$
This error is the optimisation target and is solved by the gradient descent method; after the partial derivatives are computed, the weights and biases are updated:
$$W \leftarrow W - lr\,\frac{\partial J}{\partial W}, \qquad b \leftarrow b - lr\,\frac{\partial J}{\partial b}$$
where $lr$ is the learning rate. By continuously updating the weights and biases, the reconstruction error is driven towards 0 and the optimisation is complete.
After the optimisation of a layer is complete, its hidden representation serves as the input of the next layer for further training, finally yielding the stacked self-encoder model. Each group of multi-source heterogeneous data is then input into its corresponding stacked self-encoder model to obtain the corresponding feature-encoding output for use in subsequent calculation tasks; in this way the model extracts the feature representation of the multi-source heterogeneous data.
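The layer-wise training just described can be sketched as follows. This is a hypothetical NumPy illustration rather than the patented implementation: the layer sizes, learning rate and epoch count are invented, the loss is the mean squared reconstruction error minimised by plain batch gradient descent, and, as in the text, the hidden code of each trained layer becomes the input of the next layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae_layer(X, d_hidden, lr=0.5, epochs=200, seed=0):
    """Train one autoencoder layer: encode h = f(W1 x + c),
    decode x_hat = g(W2 h + b) (g also sigmoid here), and minimise
    the mean squared reconstruction error by gradient descent."""
    rng = np.random.default_rng(seed)
    d_in = X.shape[1]
    W1 = 0.1 * rng.standard_normal((d_hidden, d_in))
    W2 = 0.1 * rng.standard_normal((d_in, d_hidden))
    c, b = np.zeros(d_hidden), np.zeros(d_in)
    for _ in range(epochs):
        H = sigmoid(X @ W1.T + c)            # encoder
        X_hat = sigmoid(H @ W2.T + b)        # decoder
        E = X_hat - X                        # reconstruction error
        dZ2 = E * X_hat * (1 - X_hat)        # backprop through decoder sigmoid
        dW2 = dZ2.T @ H / len(X)
        db = dZ2.mean(axis=0)
        dZ1 = (dZ2 @ W2) * H * (1 - H)       # backprop through encoder sigmoid
        dW1 = dZ1.T @ X / len(X)
        dc = dZ1.mean(axis=0)
        W2 -= lr * dW2; b -= lr * db
        W1 -= lr * dW1; c -= lr * dc
    return W1, c, W2, b

rng = np.random.default_rng(2)
X = rng.uniform(0.2, 0.8, size=(64, 10))     # toy group of heterogeneous data
layer_input, codes = X, []
for d_hidden in (8, 5):                      # stack two layers
    W1, c, W2, b = train_ae_layer(layer_input, d_hidden)
    layer_input = sigmoid(layer_input @ W1.T + c)  # hidden code feeds the next layer
    codes.append(layer_input)
```

The final entry of `codes` is the encoding feature of this data group, analogous to the output each group's stacked self-encoder supplies to the later fusion step.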
And S3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model.
The step S3 specifically includes the following steps:
s31, taking a feedforward neural network as a fusion layer network, performing feature fusion on multiple groups of multi-source heterogeneous data in the fusion layer network, wherein the feedforward neural network is connected with a stacking self-encoder network of each group of multi-source heterogeneous data, and the weight is T n And the weights are shared so as to eliminate the heterogeneity and strong relevance of the extracted features. As shown in fig. 4, a block diagram of a fusion layer and stacked self-encoder.
S32, externally connecting a softmax classifier to the fusion-layer network and calculating the label-class probability of the input vector $h_n$. The softmax classifier computes a score for each label class and maps all the scores to probability values, i.e. classification probabilities. The top-layer neurons of the stacked self-encoder of the n-th group of multi-source heterogeneous data are defined as $h_n$ and the label information of the fusion layer as $p$; the loss function can then be defined as
$$J = -\frac{1}{M}\sum_{i=1}^{M}\log P\left(Y = y^{(i)} \mid h_n^{(i)}; T, b\right)$$
where $n$ represents the number of groups of multi-source heterogeneous data, $M$ the number of training samples and $b$ a bias term; $y^{(i)}$ represents the label of sample $x^{(i)}$, $Y$ denotes a probability event in the conditional probability density function, $Y = y^{(i)}$ means that the probability event at this time is $y^{(i)}$, and $h_n^{(i)}$ represents the top-layer output of the n-th sub-network for input $x^{(i)}$. For a k-class task, the probability that the input vector $h_n$ belongs to label class $i$ is
$$P(Y = i \mid h_n) = \frac{\exp(T_i h_n + b_i)}{\sum_{l=1}^{k}\exp(T_l h_n + b_l)}$$
where $b_i$ and $b_l$ are components of the bias vector, $T_l$ represents the l-th row vector of the weight matrix $T$, and $T_i$ the i-th row vector of $T$.
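The class-probability computation can be sketched directly; the class count, feature dimension and random weights below are invented for illustration, with only the softmax form taken from the text:

```python
import numpy as np

def label_class_probs(h_n, T, b):
    # P(Y = i | h_n) = exp(T_i h_n + b_i) / sum_l exp(T_l h_n + b_l),
    # T_i being the i-th row of the weight matrix T
    scores = T @ h_n + b
    scores = scores - scores.max()   # shift for numerical stability
    e = np.exp(scores)
    return e / e.sum()

rng = np.random.default_rng(3)
k, d = 4, 6                          # invented: k label classes, d fused-feature dims
T = rng.standard_normal((k, d))
b = np.zeros(k)
h_n = rng.standard_normal(d)
p = label_class_probs(h_n, T, b)     # one probability per label class
```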
S33, fine-tuning the model parameters of the stacked self-encoders by the gradient descent method, the whole network adopting supervised fine-tuning. Preferably, the parameters of each stacked self-encoder are adjusted iteratively in turn: one model is adjusted at a time while the parameters of the other models are fixed, until the parameters of all stacked self-encoder models have been adjusted.
The model loss function obtained in the previous step, i.e. the optimisation objective, is the loss $J$ defined in step S32. It is solved by the gradient descent method; after the partial derivative solution is completed, the parameters of the model are updated, namely the weights $W_1$, $W_2$ and the biases $b$, $c$ shown above.
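The alternating fine-tuning schedule of step S33 (adjust one sub-network at a time while the parameters of the others stay fixed) can be illustrated on a toy objective; the function `fine_tune`, the quadratic objective and all constants here are invented for the example, and only the update schedule follows the text:

```python
import numpy as np

def fine_tune(params, grads_fn, lr=0.1, sweeps=50):
    """params: list of per-sub-network parameter arrays.
    grads_fn(i, params) returns the gradient for sub-network i
    with every other sub-network held fixed; sub-networks are
    visited in turn until all have been adjusted."""
    for _ in range(sweeps):
        for i in range(len(params)):
            params[i] = params[i] - lr * grads_fn(i, params)
    return params

# Toy separable objective sum_i ||p_i||^2, whose gradient for
# sub-network i is simply 2 * p_i; each coordinate sweep shrinks
# one sub-network's parameters while the others are untouched.
params = [np.ones(3), 2 * np.ones(4)]
params = fine_tune(params, lambda i, ps: 2 * ps[i])
```

In the patent's setting, `grads_fn` would be the gradient of the supervised loss $J$ with respect to one stacked self-encoder's weights and biases.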
S4, sparsifying the obtained isomorphic features with a structured sparsity method, calculating the weight of each feature dimension, and screening out the features with higher weights.
Firstly, defining data of isomorphic characteristics, and then performing formula expression and calculation according to defined variables.
In particular, the isomorphic feature representation obtained in step S3 is defined as $x \in \mathbb{R}^p$, containing a p-dimensional feature vector, with label $y$; $X = (x_1, x_2, \dots, x_n)$ represents the input training-data matrix and $Y = (y_1, y_2, \dots, y_n)$ the label matrix. The p-dimensional feature vector is divided into K feature groups, with $K_j$ representing the number of feature dimensions of the j-th group; $\beta_l = (\beta_{l1}, \beta_{l2}, \dots, \beta_{lj})$ represents the weight-coefficient vector for the l-th class, and $\beta_{lj}$ the coefficient sub-vector corresponding to the j-th group.
The features of the l-th class are selected through the objective function
$$\min_{\beta_l}\; L(\beta_l) + R(\beta_l)$$
yielding the features whose weights are not 0, where $L(\beta_l)$ is the loss function, the same as in step S32, and $R(\beta_l)$ is the regularisation term, described as follows:
$$R(\beta_l) = \lambda_1\lVert\beta_l\rVert_1 + \lambda_2\sum_{j=1}^{K}\omega_j\lVert\beta_{lj}\rVert_2$$
where $\lambda_1$ and $\lambda_2$ are regularisation coefficients (hyper-parameters), $\omega_j$ is the weight of the j-th feature group, $y^{(i)}$ represents the label of sample input $x^{(i)}$, $T$ is the weight matrix and $b$ the bias term. The regularisation term comprises two parts: the L1 norm serves as the penalty term on individual coefficients, while the group-wise L2 norm produces the sparsifying effect. Through these regularisation terms, sparsity is produced simultaneously within each feature group and across groups; the weights of certain feature dimensions become 0, and finally the features whose weights are not 0 are selected.
It should be noted that the features $x$ here are the encoded feature data extracted by the stacked self-encoder model; they can be used in computer calculation tasks but do not correspond to feature attributes of the original data, such as voltage or current. The weight $\beta_l$ holds the weight coefficients of all features during the optimisation of the objective function; after initialisation it is updated continuously in the optimisation process.
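A sketch of the structured-sparsity selection under stated assumptions: the text specifies an L1 penalty plus a group-wise L2 penalty with group weights $\omega_j$, which is realised below by proximal gradient descent; a squared loss stands in for the classifier loss of step S32, and all sizes and coefficients are invented for the example:

```python
import numpy as np

def sparse_group_select(X, y, groups, lam1=0.05, lam2=0.05, lr=0.01, epochs=500):
    """Proximal-gradient sketch of the structured sparsity step:
    squared loss + lam1 * ||beta||_1 + lam2 * sum_j w_j ||beta_(j)||_2.
    Returns the weights and the indices of features whose weight is non-zero."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(epochs):
        grad = X.T @ (X @ beta - y) / n
        beta = beta - lr * grad
        # L1 part: element-wise soft-thresholding
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam1, 0.0)
        # group part: shrink each group's L2 norm (w_j = sqrt(group size))
        for g in groups:
            w = np.sqrt(len(g))
            norm = np.linalg.norm(beta[g])
            if norm > 0:
                beta[g] *= max(0.0, 1.0 - lr * lam2 * w / norm)
    return beta, np.flatnonzero(beta)

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 6))
true_beta = np.array([1.5, -2.0, 0.0, 0.0, 0.0, 0.0])   # only group 0 carries signal
y = X @ true_beta + 0.01 * rng.standard_normal(200)
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
beta, selected = sparse_group_select(X, y, groups)
```

Features whose weights survive both shrinkage operators are the selected ones; here only the first group carries signal, so its indices should dominate `selected`.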
After the selected features are obtained, they can be input into a task model to verify the effect according to the actual task requirements. In this embodiment, taking a fault-classification task as an example, multi-source heterogeneous power-system data such as on-off states, active and reactive power output, text records and images are passed through the above steps to extract and select features; the selected features are then input into the softmax classifier for calculation, the probability of an equipment fault is obtained, and the accuracy of the task can be greatly improved.
It should be understood that parts of the specification not set forth in detail are well within the prior art.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (8)
1. A feature extraction and selection method for multi-source heterogeneous data of an electric power system is characterized by comprising the following steps:
s1, introducing multi-source heterogeneous data in an electric power system as input data, and constructing a training data set;
s2, designing neural networks with different structures for each group of multi-source heterogeneous data through a stacked self-encoder model, training the self-encoder model by adopting a layer-by-layer training algorithm to obtain a trained stacked self-encoder model, and extracting the encoding characteristics of each group of multi-source heterogeneous data through the trained stacked self-encoder model;
s3, taking the coding characteristics of each group of multi-source heterogeneous data as input data of the stacked self-encoder model, constructing a fusion layer network, eliminating the heterogeneity of the coding characteristics of the multi-source heterogeneous data to obtain isomorphic characteristic expression, and finely adjusting the parameters of the whole stacked self-encoder model;
and S4, performing sparsification treatment on the obtained isomorphic features by adopting a structured sparse method, calculating to obtain the weight of each feature dimension, screening out features with higher weight, and completing the feature extraction and selection of the multi-source heterogeneous data.
2. The method for extracting and selecting the characteristics of the multi-source heterogeneous data of the electric power system according to claim 1, wherein the multi-source heterogeneous data comprise the voltage, current, active output and on-off state detected by each sensor, and further comprise audio data and image data from an audio and video monitoring system as well as text records of the operation of the electric power system.
3. The method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to claim 1, wherein the stacked self-encoder is a stacked plurality of self-encoders, and an output of a hidden layer of each self-encoder is used as an input of another connected self-encoder.
4. The method for extracting and selecting features of multi-source heterogeneous data of an electric power system according to claim 3, wherein the step S2 specifically comprises the steps of:
constructing n heterogeneous stacked self-encoders according to the input n groups of multi-source heterogeneous data;
when each hidden layer of the n-th stacked self-encoder is trained, the currently input heterogeneous data undergo a nonlinear transformation in the hidden layer through a weight matrix and an activation function, yielding an output hidden representation;
decoding the hidden representation, transforming and reconstructing it through a weight matrix and an activation function to obtain the reconstructed output, and minimising the error between the original input and the reconstructed output by the gradient descent method; when the error converges to 0, the hidden representation is taken as the input of the next layer for further training, obtaining the trained stacked self-encoder model;
and inputting each group of multi-source heterogeneous data into the trained stacked self-encoder model to obtain corresponding characteristic encoding output.
5. The method for extracting and selecting multi-source heterogeneous data features of the power system according to claim 4, wherein the hidden representation of the output is:
$$h = f(W_1 x_n + c)$$
where $h$ is the hidden representation of the output, $c$ is the bias term, $W_1$ is the weight matrix, and $f(\cdot)$ is the sigmoid function;
the reconstructed output is:
$$\hat{x}_n = g(W_2 h + b)$$
where $W_2$ is the decoding weight matrix, $b$ is the bias term, and $g(\cdot)$ is the decoding activation function.
6. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 1, wherein the step S3 specifically comprises the following steps:
s31, taking a feedforward neural network as a fusion layer network, performing feature fusion on multiple groups of multi-source heterogeneous data in the fusion layer network, and connecting the feedforward neural network with a stacked self-encoder network of each group of multi-source heterogeneous data;
s32, externally connecting a softmax classifier to the fusion layer network, and calculating the label category probability of the input vector;
and S33, fine adjustment is carried out on the parameters of the stacked self-encoder model by using a gradient descent method.
7. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 6, wherein the fine-tuning of the parameters of the stacked self-encoder model by using a gradient descent method comprises the following steps: and iteratively adjusting the parameters of each self-encoder model in turn, adjusting one self-encoder model each time, and fixing the parameters of other self-encoder models until the parameters of all the self-encoder models are adjusted.
8. The method for extracting and selecting the multi-source heterogeneous data features of the power system according to claim 1, wherein the step S4 comprises:
defining data with homogeneous features as (x_i, y_i), i = 1, …, n, where x_i ∈ R^p is a p-dimensional feature vector and y_i is its label; X = (x_1, x_2, …, x_n) represents the input training data matrix and Y = (y_1, y_2, …, y_n) represents the label matrix; the p-dimensional feature vector is divided into K feature groups, with K_j representing the number of feature dimensions of the j-th group; β_l = (β_{l1}, β_{l2}, …, β_{lK}) represents the weight coefficient vector for the l-th class, and β_{lj} represents the weight sub-vector corresponding to the j-th group;
selecting the features of the l-th class by means of the objective function, and retaining the features whose weight values are not 0, the objective function being
J(T, b) = Σ_{i=1}^{n} L(y^(i), T^T x^(i) + b) + λ1 Σ_{j=1}^{K} ω_j ||β_{lj}||_2 + λ2 ||β_l||_1,
wherein L(·) is the loss function and the λ1 and λ2 terms are the regularization terms;
wherein λ1 and λ2 are regularization coefficients, the hyperparameter ω_j is the weight of the j-th feature group, y^(i) is the label of the i-th sample, T is the weight matrix, and b is the offset term.
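The objective of claim 8 combines a data-fit loss with a group-weighted L2 penalty and an L1 penalty (a sparse-group-lasso form). The sketch below evaluates it for a single class with a squared-error loss and a hand-set weight vector; the group layout, λ values, and the sqrt-of-group-size choice for ω_j are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

n, p = 30, 6
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]  # K = 3 groups
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

lam1, lam2 = 0.3, 0.1
omega = np.sqrt([len(g) for g in groups])   # assumed group weights omega_j

def objective(beta, b):
    # squared error stands in for the claim's generic loss L(y, T^T x + b)
    fit = 0.5 * np.sum((y - X @ beta - b) ** 2) / n
    group_pen = lam1 * sum(w * np.linalg.norm(beta[g]) for w, g in zip(omega, groups))
    l1_pen = lam2 * np.sum(np.abs(beta))
    return fit + group_pen + l1_pen

# Example weight vector in which the second group has been driven exactly to zero.
beta = np.array([0.8, -0.2, 0.0, 0.0, 0.5, 0.0])
J = objective(beta, b=0.0)

# S4 screening: retain only the features whose weight is non-zero.
selected = np.flatnonzero(beta != 0.0)
print(selected)  # [0 1 4]
```

The group penalty zeroes out whole feature groups while the L1 term zeroes individual features within the surviving groups, so the non-zero-weight screening yields the selected feature set.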
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211044436.1A CN115470844A (en) | 2022-08-30 | 2022-08-30 | Feature extraction and selection method for multi-source heterogeneous data of power system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115470844A true CN115470844A (en) | 2022-12-13 |
Family
ID=84368436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211044436.1A Pending CN115470844A (en) | 2022-08-30 | 2022-08-30 | Feature extraction and selection method for multi-source heterogeneous data of power system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115470844A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116258579A (en) * | 2023-04-28 | 2023-06-13 | 成都新希望金融信息有限公司 | Training method of user credit scoring model and user credit scoring method |
CN116258579B (en) * | 2023-04-28 | 2023-08-04 | 成都新希望金融信息有限公司 | Training method of user credit scoring model and user credit scoring method |
CN116303687A (en) * | 2023-05-12 | 2023-06-23 | 烟台黄金职业学院 | Intelligent management method and system for engineering cost data |
CN116502092A (en) * | 2023-06-26 | 2023-07-28 | 国网智能电网研究院有限公司 | Semantic alignment method, device, equipment and storage medium for multi-source heterogeneous data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tian et al. | An intrusion detection approach based on improved deep belief network | |
CN115470844A (en) | Feature extraction and selection method for multi-source heterogeneous data of power system | |
CN113905391B (en) | Integrated learning network traffic prediction method, system, equipment, terminal and medium | |
Dai et al. | Incremental learning using a grow-and-prune paradigm with efficient neural networks | |
CN112967088A (en) | Marketing activity prediction model structure and prediction method based on knowledge distillation | |
US20230267302A1 (en) | Large-Scale Architecture Search in Graph Neural Networks via Synthetic Data | |
CN115661550A (en) | Graph data class imbalance classification method and device based on generation countermeasure network | |
Guo et al. | PILAE: A non-gradient descent learning scheme for deep feedforward neural networks | |
CN111178986B (en) | User-commodity preference prediction method and system | |
Sokkhey et al. | Development and optimization of deep belief networks applied for academic performance prediction with larger datasets | |
Liu et al. | EACP: An effective automatic channel pruning for neural networks | |
CN116206158A (en) | Scene image classification method and system based on double hypergraph neural network | |
Leke et al. | A deep learning-cuckoo search method for missing data estimation in high-dimensional datasets | |
Bhadoria et al. | Bunch graph based dimensionality reduction using auto-encoder for character recognition | |
Liu et al. | Uncertainty propagation method for high-dimensional black-box problems via Bayesian deep neural network | |
Cheng et al. | Active broad learning with multi-objective evolution for data stream classification | |
Delgado et al. | Preliminary Results of Applying Transformers to Geoscience and Earth Science Data | |
Guo et al. | End-to-end variational graph clustering with local structural preservation | |
Zhang | [Retracted] Analysis of College Students’ Network Moral Behavior by the History of Ideological and Political Education under Deep Learning | |
Hemkiran et al. | Design of Automatic Credit Card Approval System Using Machine Learning | |
Lazebnik et al. | Knowledge-integrated autoencoder model | |
Wijayanto et al. | Predicting future potential flight routes via inductive graph representation learning | |
CN115098787B (en) | Article recommendation method based on cosine ranking loss and virtual edge map neural network | |
KR102579685B1 (en) | Method for constructing facial movement control parameters of digital human with control information | |
Chen et al. | LPR‐MLP: A Novel Health Prediction Model for Transmission Lines in Grid Sensor Networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||