CN117521897A - Method, device, medium and equipment for predicting load distribution of energy station - Google Patents

Publication number
CN117521897A
Authority
CN
China
Prior art keywords
target
characteristic
feature
data
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311499749.0A
Other languages
Chinese (zh)
Inventor
兰剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Anji Jiayu Big Data Technology Services Co ltd
Original Assignee
Zhejiang Anji Jiayu Big Data Technology Services Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Anji Jiayu Big Data Technology Services Co ltd
Priority to CN202311499749.0A
Publication of CN117521897A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Marketing (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Optimization (AREA)
  • Operations Research (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Algebra (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a method, a device, a medium and equipment for predicting the load distribution of an energy station. The method comprises the following steps: acquiring a plurality of historical load data and the historical time corresponding to each historical load data; acquiring the sample time feature corresponding to each historical time; performing feature calculation processing on each historical load data to obtain a target sample feature data set that corresponds to each historical load data and contains a plurality of target feature categories; predicting load data for each target time based on the target sample feature data set corresponding to each historical load data, the sample time feature of each historical time and the target time feature of each target time, to obtain predicted load data corresponding to each target time; and performing quantile prediction on each piece of predicted load data by means of quantile regression to obtain a target load value for each target quantile, so as to obtain a prediction result of the load distribution. The load distribution over a future period of time can thereby be predicted accurately.

Description

Method, device, medium and equipment for predicting load distribution of energy station
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method, an apparatus, a medium, and a device for predicting load distribution of an energy station.
Background
With the continuous development of economy, the energy production and consumption modes are greatly changed, and the energy industry is carrying new missions of improving energy efficiency, guaranteeing energy safety, promoting new energy consumption, promoting environmental protection and the like.
In order to meet the increasing demand of economy, the construction of various energy stations is increasing, and after the energy stations are used, the load condition of the energy stations is usually predicted in order to ensure the reliable and stable operation of the energy stations.
However, the existing load prediction method has the problem that the prediction result is inaccurate, and the load distribution situation of the energy stations in a future period of time cannot be accurately predicted.
Disclosure of Invention
In view of the above, the invention provides a method, a device, a medium and equipment for predicting the load distribution of an energy station, which mainly aims to solve the problem that the load distribution of the energy station in a future period of time cannot be accurately predicted at present.
In order to solve the above problems, the present application provides a method for predicting load distribution of an energy station, including:
Acquiring a plurality of historical load data and historical time corresponding to each historical load data;
acquiring sample time characteristics corresponding to each historical time based on each historical time;
performing characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
based on the target characteristic data set corresponding to each historical load data, the sample time characteristic of each historical time and the target time characteristic of each target time in the time period to be predicted, carrying out load data prediction on each target time to obtain predicted load data corresponding to each target time;
and carrying out quantile prediction on each piece of predicted load data by using a quantile regression prediction mode to obtain a target load value corresponding to each target quantile in a time period to be predicted so as to obtain a prediction result of load distribution.
Optionally, performing feature calculation processing on each historical load data to obtain a target sample feature data set that corresponds to each historical load data and contains a plurality of target feature categories specifically includes:
determining, based on preset feature types and a plurality of time ranges, the feature categories corresponding to the different time ranges under each feature type, so as to obtain a plurality of initial feature categories;
performing feature calculation processing on each historical load data based on each initial feature category to obtain initial sample feature data corresponding to each historical load data under each initial feature category;
screening, based on each initial sample feature data, a plurality of first feature categories from the initial feature categories by means of adversarial-validation feature screening;
screening, based on each initial sample feature data corresponding to each first feature category, the target feature categories from the first feature categories by means of feature-importance screening;
and taking each initial sample feature data corresponding to each target feature category as target sample feature data to construct the target sample feature data set corresponding to each historical load data.
Optionally, screening a plurality of first feature categories from the initial feature categories by means of adversarial-validation feature screening, based on each initial sample feature data, specifically includes:
dividing the initial sample feature data corresponding to each historical load data into a training set and a test set, and assigning corresponding labels to the initial sample features in the training set and the test set;
merging the training set and the test set, and training a classification model on the merged initial sample feature data and the labels corresponding to the initial sample feature data to obtain a current classification model;
calculating a current AUC value based on the current classification model, and comparing the current AUC value with a preset AUC threshold;
and, when the current AUC value is greater than the AUC threshold, calculating the feature importance of each initial feature category, deleting the initial sample features of the initial feature category with the highest feature importance, and retraining the classification model, until the current AUC value of the retrained classification model is smaller than the preset AUC threshold, and then taking each initial feature category that has not been deleted as a first feature category.
Optionally, screening the target feature categories from the first feature categories by means of feature-importance screening, based on each initial sample feature data corresponding to each first feature category, specifically includes:
training a tree model on the initial sample feature data corresponding to the same first feature category and the historical load data corresponding to that initial sample feature data, to obtain the original feature importance of each first feature category;
for the initial sample feature data corresponding to the same first feature category, shuffling the correspondence between the initial sample feature data and the historical load data to obtain historical load data and initial sample feature data whose correspondence has been shuffled;
training a tree model on the shuffled historical load data and initial sample feature data to obtain the current feature importance of each first feature category;
determining the target feature importance of each first feature category based on the original feature importance and the current feature importance of the same first feature category;
and screening the first feature categories based on the target feature importance of each first feature category to obtain a plurality of target feature categories.
Optionally, the feature types include: statistical features, lag features and difference features;
the statistical features include: variance, maximum, minimum and average;
the time features include: holiday features.
Optionally, the predicting load data for each target time based on the target feature data set corresponding to each historical load data, the sample time feature of each historical time, and the target time feature of each target time in the to-be-predicted time period, to obtain predicted load data corresponding to each target time specifically includes:
Training the initial point prediction model based on the target characteristic data set corresponding to each historical load data and the sample time characteristics of each historical time to obtain a target point prediction model;
and predicting and obtaining predicted load data corresponding to each target time by using the target point prediction model based on the target time characteristics of each target time.
Optionally, the predicting the load data by using a quantile regression prediction mode to obtain a load value corresponding to each target quantile in a time period to be predicted, so as to obtain a prediction result of load distribution, which specifically includes:
determining sample load values corresponding to a plurality of target quantiles based on each historical load data;
training the initial quantile regression prediction model based on each historical load data and the sample load value corresponding to each target quantile to obtain a target quantile regression prediction model;
and carrying out quantile regression prediction on each piece of predicted load data based on a target quantile regression prediction model to obtain a target load value corresponding to each target quantile so as to obtain a predicted result of load distribution.
In order to solve the above problems, the present application provides a prediction apparatus for load distribution of an energy station, including:
The historical data acquisition module is used for acquiring a plurality of historical load data and the historical time corresponding to each historical load data;
the first characteristic acquisition module is used for acquiring sample time characteristics corresponding to each historical time based on each historical time point;
the second characteristic acquisition module is used for carrying out characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
the first prediction module is used for predicting the load data of each target time based on a target characteristic data set corresponding to each historical load data, a sample time characteristic of each historical time and a target time characteristic of each target time in a time period to be predicted, and obtaining predicted load data corresponding to each target time;
and the second prediction module is used for carrying out quantile prediction on each piece of predicted load data by using a quantile regression prediction mode to obtain target load values corresponding to each target quantile in a time period to be predicted so as to obtain a prediction result of load distribution.
In order to solve the above-mentioned problems, the present application provides a storage medium storing a computer program which, when executed by a processor, implements the steps of the method for predicting energy station load distribution described in any one of the above.
In order to solve the above problems, the present application provides an electronic device, at least including a memory, and a processor, where the memory stores a computer program, and the processor implements the steps of the method for predicting energy station load distribution according to any one of the above when executing the computer program on the memory.
According to the method, device, medium and equipment for predicting the load distribution of an energy station, feature data are extracted from the historical load data; the load data for each time point in a future time period are then predicted from the feature data, and the load value for each quantile in that period is predicted from the predicted load data. This yields a prediction of the load distribution over the future time period, addresses the inaccuracy of prior-art load distribution predictions for energy stations, and provides a reasonable and accurate basis for subsequent policy declaration.
The foregoing is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the specification, and so that the above and other objects, features and advantages of the invention are more readily apparent, specific embodiments are set out below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a flowchart of a method for predicting energy station load distribution according to an embodiment of the present application;
FIG. 2 is a block diagram of a load distribution prediction apparatus for an energy station according to another embodiment of the present application;
FIG. 3 is a block diagram of an electronic device according to another embodiment of the present application.
Detailed Description
Various aspects and features of the present application are described herein with reference to the accompanying drawings.
It should be understood that various modifications may be made to the embodiments of the application herein. Therefore, the above description should not be taken as limiting, but merely as exemplification of the embodiments. Other modifications within the scope and spirit of this application will occur to those skilled in the art.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and, together with a general description of the application given above and the detailed description of the embodiments given below, serve to explain the principles of the application.
These and other characteristics of the present application will become apparent from the following description of a preferred form of embodiment, given as a non-limiting example, with reference to the accompanying drawings.
It is also to be understood that, although the present application has been described with reference to some specific examples, those skilled in the art can certainly realize many other equivalent forms of the present application.
The foregoing and other aspects, features, and advantages of the present application will become more apparent in light of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present application will be described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely exemplary of the application, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail to avoid obscuring the application with unnecessary or excessive detail. Therefore, specific structural and functional details disclosed herein are not intended to be limiting, but merely serve as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present application in virtually any appropriately detailed structure.
The specification may use the word "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments as per the application.
The embodiment of the application provides a method for predicting load distribution of an energy station, which can be specifically applied to electronic equipment such as a terminal, a server and the like, as shown in fig. 1, and the method in the example comprises the following steps:
step S101, acquiring a plurality of historical load data and historical time corresponding to each historical load data;
in the implementation process, the load data of the energy station at each time point within a historical time period can be collected, so as to obtain a plurality of historical load data and the historical time corresponding to each historical load data. In this embodiment, the energy station may specifically be a charging station. The length of the historical time period can be set according to actual needs, for example 7, 15, 20 or 30 days. That is, daily historical load data may be collected over 7 days, or over 15 days.
Step S102, acquiring sample time characteristics corresponding to each historical time based on each historical time;
in this step, the time features include holiday features. That is, a holiday category feature may be obtained for each day.
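One possible encoding of such a per-day holiday feature is sketched below. The holiday calendar `HOLIDAYS`, the function name `holiday_feature` and the 0/1 encoding are all assumptions made for illustration; the text does not say how the holiday category is represented.

```python
from datetime import date

# Hypothetical holiday calendar; a real system would load the official one.
HOLIDAYS = {date(2023, 10, 1), date(2023, 10, 2), date(2023, 10, 3)}

def holiday_feature(day: date) -> int:
    """Return 1 if `day` is in the holiday calendar, otherwise 0."""
    return 1 if day in HOLIDAYS else 0

historical_days = [date(2023, 9, 29), date(2023, 9, 30),
                   date(2023, 10, 1), date(2023, 10, 2)]
sample_time_features = [holiday_feature(d) for d in historical_days]
print(sample_time_features)  # [0, 0, 1, 1]
```

The same function can be applied to each target time in the period to be predicted to obtain the target time features.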
Step S103, performing characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
in the implementation process, the historical load data can be correspondingly calculated according to each initial characteristic type, so that a plurality of initial sample characteristic data corresponding to each initial characteristic type are obtained, namely, an initial sample characteristic set corresponding to each historical load data and containing a plurality of initial characteristic types is obtained. And then screening each initial characteristic type based on each initial sample characteristic data to obtain a plurality of target characteristic types, and taking a plurality of initial sample characteristic data corresponding to the target characteristic types as target sample characteristic data to obtain a target sample characteristic data set corresponding to each historical load data.
In this embodiment, when screening feature categories, adversarial validation may be used to screen out the feature categories whose distributions differ between the training set and the test set, so as to avoid overfitting. This may be combined with the Null Importance feature screening method, which retains the important feature categories and eliminates randomness and bias in feature selection.
Step S104, carrying out load data prediction on each target time based on a target characteristic data set corresponding to each historical load data, sample time characteristics of each historical time and target time characteristics of each target time in a time period to be predicted, and obtaining predicted load data corresponding to each target time;
in the specific implementation process, a pre-trained point prediction model may predict the load data from each target time feature, so as to obtain the predicted load data corresponding to each target time.
Step S105, performing quantile prediction on each piece of predicted load data by means of quantile regression to obtain a target load value corresponding to each target quantile in the time period to be predicted, so as to obtain a prediction result of the load distribution.
In the specific implementation process, a pre-trained quantile regression model performs quantile prediction on each piece of predicted load data, so as to obtain the load value corresponding to each target quantile. The target quantiles may be, for example, quartiles and the like.
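Quantile regression models are commonly trained by minimizing the pinball (quantile) loss. The sketch below shows that loss and checks, on made-up load values, which constant forecast minimizes it for a target quantile of 0.75. The text does not name the loss or the model, so the function `pinball_loss`, the data and the quantile are illustrative assumptions.

```python
def pinball_loss(actual, predicted, q):
    """Mean pinball (quantile) loss of `predicted` against `actual` at quantile q."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        err = y - y_hat
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * err if err >= 0 else (q - 1) * err
    return total / len(actual)

loads = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # made-up load values
q = 0.75
# Among constant forecasts drawn from the data, pick the one with the lowest loss.
best_constant = min(loads, key=lambda c: pinball_loss(loads, [c] * len(loads), q))
print(best_constant)  # 8
```

The minimizer lands near the 75th percentile of the data, which is why a model trained with this loss at several values of `q` describes a distribution rather than a single point.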
According to the prediction method for the load distribution of the energy station, the characteristic data are obtained through extraction of the historical load data, then the load data corresponding to each time point in the future time period can be accurately predicted according to the characteristic data, and the load value corresponding to each quantile in the future time period can be obtained according to accurate prediction of each load data.
Based on the foregoing embodiment, a further embodiment of the present application provides a method for predicting load distribution of an energy station. In this embodiment, before feature calculation processing is performed on each historical load data, each historical load data may be preprocessed, so that the preprocessed historical load data are more reasonable and accurate and lay a foundation for accurately calculating the sample feature data. The specific preprocessing process is as follows: performing anomaly detection on each historical load data with the interquartile range (IQR) detection method to determine a plurality of abnormal historical load data; and then adjusting each abnormal historical load data into a preset load data interval to obtain adjusted historical load data corresponding to each abnormal historical load data, so that the feature calculation processing can be carried out on the non-abnormal historical load data and the adjusted historical load data.
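A minimal sketch of this preprocessing follows. It assumes that the "preset load data interval" is the conventional IQR fence [Q1 - 1.5·IQR, Q3 + 1.5·IQR] and that abnormal values are clipped to the nearest bound; the multiplier 1.5, the function name `clip_outliers_iqr` and the sample data are assumptions, not values given in the text.

```python
import statistics

def clip_outliers_iqr(loads, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest bound."""
    q1, _, q3 = statistics.quantiles(loads, n=4)  # quartiles of the load data
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in loads]

daily_loads = [10, 12, 11, 13, 12, 11, 100]   # made-up data; 100 is the anomaly
cleaned = clip_outliers_iqr(daily_loads)
print(cleaned)
```

Clipping keeps the length of the series unchanged, so each adjusted value still lines up with its historical time.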
In this embodiment, when step S103 is executed, that is, when feature calculation processing is performed to obtain a target sample feature data set that corresponds to each historical load data and contains a plurality of target feature categories, the specific procedure is as follows:
Step S1031, based on the preset feature types and a plurality of time ranges, determining different feature types corresponding to different time ranges under each feature type to obtain a plurality of initial feature types;
in this step, the feature types include statistical features, lag features and difference features, and the statistical features include variance, maximum, minimum and average. For different time ranges, one feature type may correspond to several feature categories. For example, the maximum may give rise to the categories 7-day maximum, 5-day maximum, 3-day maximum, and so on. Similarly, the average may give rise to the 7-day average, 5-day average, 3-day average, and so on. By pre-selecting several different time ranges, several initial feature categories can thus be obtained for each feature type. For example, the statistical features may correspond to: 30-day variance, 7-day variance, 5-day variance, 3-day variance, 7-day maximum, 5-day maximum, 3-day maximum, 7-day minimum, 5-day minimum, 3-day minimum, 7-day average, 5-day average, 3-day average, and the like. The lag features may correspond to: 10-day lag value, 7-day lag value, 5-day lag value, 3-day lag value, and the like. The difference features may correspond to: 10-day difference value, 7-day difference value, 5-day difference value, 3-day difference value, and the like.
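The combination of feature types and time ranges can be sketched as a simple cross product. The names in `FEATURE_KINDS`, the chosen time ranges and the `kind_Nd` naming scheme are assumptions for illustration; `lag` and `diff` stand for the lag (hysteresis) and difference (differential) features named in the text.

```python
from itertools import product

# The four statistical features plus the lag and difference features.
FEATURE_KINDS = ["variance", "max", "min", "mean", "lag", "diff"]
TIME_RANGES_DAYS = [3, 5, 7]   # assumed selection of time ranges

initial_feature_categories = [
    f"{kind}_{days}d" for kind, days in product(FEATURE_KINDS, TIME_RANGES_DAYS)
]
print(len(initial_feature_categories))   # 18
print(initial_feature_categories[:3])    # ['variance_3d', 'variance_5d', 'variance_7d']
```

In practice each feature type could use its own list of time ranges, as the examples in the text do (for instance a 30-day variance but a 10-day lag).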
Step S1032, performing feature calculation processing on each historical load data based on each initial feature type to obtain initial sample feature data corresponding to each historical load data under each initial feature type;
in this step, after each initial feature type is obtained, feature calculation processing may be performed on each historical load data for each initial feature type, so as to obtain a set of initial sample feature data corresponding to each initial feature type, where each initial sample feature data in the set of initial sample feature data corresponds to each historical load data one by one.
For example, if there are 100 pieces of historical load data, a 7-day average calculation, 5-day variance calculation, 3-day variance calculation, 7-day lag value calculation, 5-day lag value calculation, 10-day difference value calculation, 2-day difference value calculation, and the like may each be performed on the 100 pieces of historical load data, so as to obtain several initial sample feature data corresponding to each piece of historical load data. That is, each of the 100 pieces of historical load data then has one initial sample feature value for each of these feature categories: 7-day average, 5-day variance, 3-day variance, 7-day lag value, 5-day lag value, 10-day difference value and 2-day difference value.
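The calculation can be sketched as follows, producing one value per historical load value for each feature category. The text does not define the window alignment, so this sketch assumes trailing windows that end at the current day, a lag of N days meaning the value N days earlier, and a difference of N days meaning the change since then; positions without enough history are left as `None`. All function names and data are illustrative.

```python
import statistics

def window_stat(loads, days, func):
    """Apply func to the trailing `days` values ending at each position."""
    return [func(loads[i - days + 1:i + 1]) if i >= days - 1 else None
            for i in range(len(loads))]

def lag(loads, days):
    """Value observed `days` positions earlier."""
    return [loads[i - days] if i >= days else None for i in range(len(loads))]

def diff(loads, days):
    """Change relative to the value `days` positions earlier."""
    return [loads[i] - loads[i - days] if i >= days else None
            for i in range(len(loads))]

daily_loads = [1, 2, 3, 4, 5, 6, 7, 8]   # made-up daily load values
features = {
    "mean_3d": window_stat(daily_loads, 3, statistics.mean),
    "variance_3d": window_stat(daily_loads, 3, statistics.pvariance),
    "max_3d": window_stat(daily_loads, 3, max),
    "lag_3d": lag(daily_loads, 3),
    "diff_2d": diff(daily_loads, 2),
}
print(features["mean_3d"])
```

Every list in `features` has the same length as `daily_loads`, which gives the one-to-one correspondence between feature data and historical load data described above.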
Step S1033, based on each initial sample feature data, screening the initial feature categories by means of adversarial validation to obtain a plurality of first feature categories;
In a specific implementation, adversarial validation can be used to screen out feature categories whose distributions are inconsistent between the training set and the test set, thereby avoiding overfitting. The specific screening process is as follows:
Step one, dividing the initial sample feature data corresponding to each historical load data to obtain a training set and a test set, and configuring corresponding labels for each initial sample feature in the training set and the test set;
In this step, specifically, label 1 may be configured for the training set, and label 0 may be configured for the test set.
Step two, carrying out data combination on the training set and the test set, and carrying out classification model training based on the combined initial sample characteristic data and labels corresponding to the initial characteristic data to obtain a current classification model;
step three, calculating to obtain a current AUC value based on the current classification model, and comparing the current AUC value with a preset AUC threshold;
In this step, the AUC value represents the area enclosed between the receiver operating characteristic curve (i.e., the ROC curve) and the coordinate axis. That is, an ROC curve can be drawn for the current classification model obtained by training, and the AUC value corresponding to the current classification model can then be calculated from the ROC curve and the coordinate axis.
Step four, when the current AUC value is greater than the AUC threshold, calculating the feature importance corresponding to each initial feature category, deleting the initial sample feature data corresponding to the initial feature category with the highest feature importance, and retraining the classification model; this is repeated until the current AUC value of the retrained classification model is smaller than the preset AUC threshold, and the initial feature categories that have not been deleted are taken as the first feature categories.
In this embodiment, the principle of screening the initial feature categories by adversarial validation is as follows. The initial sample feature data are divided into a training set and a test set, and a binary classifier is trained on the initial sample feature data of both sets. If the trained classifier can distinguish well between samples of the training set and samples of the test set, the distribution of some initial feature category differs too much between the test set and the training set. If the sample feature data of such a feature category were used in load distribution prediction, the model would tend to overfit to it during training and its generalization ability would suffer. The initial sample feature data corresponding to that feature category therefore needs to be removed.
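The screening loop of steps one to four can be sketched as follows. To stay dependency-free, this sketch makes a loud simplification that is not in the source: instead of training a classification model and reading its feature importance, it uses the AUC of each single feature as a stand-in for both the model's AUC and the feature importance. The function names, the sample data, and the threshold of 0.7 are all hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a label-1 sample outranks a label-0 sample."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def separability(column, labels):
    """Stand-in importance: how well this one feature separates train from test."""
    a = auc(column, labels)
    return max(a, 1.0 - a)


def adversarial_screen(train, test, auc_threshold=0.7):
    """train/test map feature category -> values; returns the categories kept."""
    n_train = len(next(iter(train.values())))
    n_test = len(next(iter(test.values())))
    labels = [1] * n_train + [0] * n_test          # label 1 = training set, 0 = test set
    kept = {name: train[name] + test[name] for name in train}  # merged data
    while kept:
        importance = {name: separability(col, labels) for name, col in kept.items()}
        worst = max(importance, key=importance.get)
        # Stand-in for the classifier's AUC: the best single-feature AUC.
        if importance[worst] <= auc_threshold:
            break
        del kept[worst]                             # drop the most separating category
    return list(kept)


train = {"average_7d": [1, 2, 3, 4, 5], "variance_30d": [1, 1, 1, 2, 2]}
test = {"average_7d": [2, 3, 4, 1, 5], "variance_30d": [9, 9, 8, 9, 10]}
first_feature_categories = adversarial_screen(train, test)
```

In this made-up data, `variance_30d` takes clearly different values in the two sets, so it is removed, while `average_7d` is distributed identically and is kept.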
Step S1034, screening by utilizing a characteristic screening mode of characteristic importance based on the characteristic data of each initial sample corresponding to each first characteristic category so as to screen and obtain a target characteristic category from each first characteristic category;
In this step, after the first feature categories are obtained, they can be screened with the Null Importance feature screening method to obtain a plurality of target feature categories corresponding to each feature type. The principle of this method is as follows: for each feature category, the correspondence between its initial sample feature data and the historical load data is shuffled, a model is trained on the shuffled data, and the importance of the feature category is evaluated from the resulting loss in model performance. This screening method evaluates the importance of each feature category more accurately and helps eliminate randomness and bias in feature selection. Specifically, the process of screening the target feature categories from the first feature categories is as follows:
step one, respectively training a tree model aiming at each initial sample characteristic data corresponding to the same first characteristic type and historical load data corresponding to each initial sample characteristic data to obtain the importance of the original characteristic corresponding to each first characteristic type;
Step two, for each initial sample characteristic data corresponding to the same first characteristic type, carrying out scrambling processing on the corresponding relation between each initial sample characteristic data and the historical load data to obtain each historical load data and each initial sample characteristic data with the scrambled corresponding relation;
In this step, shuffling the correspondence means the following. For example, suppose there are 10 pieces of historical load data, each of which has a value for the feature category "average load over the previous five days": the first piece of historical load data corresponds to an average of a, the second to an average of b, and the third to an average of c. After shuffling, the first piece of historical load data corresponds to the average c, the second to the average a, and the third to the average b, which yields the historical load data and initial sample feature data with a shuffled correspondence.
Step three, training a tree model based on the history load data and the initial sample characteristic data with disturbed corresponding relations to obtain the current characteristic importance corresponding to each first characteristic type;
Step four, determining the target feature importance of each first feature category based on the original feature importance and the current feature importance corresponding to the same first feature category;
In this step, the target feature importance $\mathrm{Score}_{\mathrm{percentile}}$ can be calculated using the following formula:
$$\mathrm{Score}_{\mathrm{percentile}} = \frac{\mathrm{Importance}_{\mathrm{real}}}{\mathrm{Percentile}(\mathrm{Importance}_{\mathrm{shuffle}},\, C)}$$
where $\mathrm{Score}_{\mathrm{percentile}}$ represents the target feature importance; $\mathrm{Importance}_{\mathrm{real}}$ represents the original feature importance; and $\mathrm{Percentile}(\mathrm{Importance}_{\mathrm{shuffle}},\, C)$ represents the current feature importance.
In this step, the principle of calculating the target feature importance is: the feature importance under the true correspondence (the true correspondence between the historical load data and the initial sample feature data) is divided by the percentile value of the feature importance under the shuffled correspondence to obtain the target feature importance.
Step five, screening each first characteristic type based on the target characteristic importance corresponding to each first characteristic type to obtain a plurality of target characteristic types corresponding to each characteristic type;
In this step, the first feature categories with the highest target feature importance are determined as the target feature categories. Specifically, the first feature categories can be sorted in descending order of target feature importance, and a preset number of top-ranked first feature categories are then determined as the target feature categories.
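Steps one to five of the Null Importance screening can be sketched as below. The source trains a tree model to obtain feature importance; this dependency-free sketch swaps in the absolute Pearson correlation between a feature and the load as a stand-in importance, which is an assumption and not the source's method. The percentile level `c=75`, the number of shuffles, the small `eps` guard against division by zero, and all names and data are hypothetical.

```python
import random


def abs_corr(x, y):
    """Stand-in for tree-model feature importance: |Pearson correlation|."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def percentile(values, c):
    """c-th percentile (0-100) with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * c / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def null_importance_score(feature, loads, c=75, runs=50, seed=0, eps=1e-9):
    """Importance under the true correspondence divided by a percentile of shuffled importances."""
    rng = random.Random(seed)
    real = abs_corr(feature, loads)                 # original feature importance
    shuffled_importances = []
    for _ in range(runs):
        shuffled = feature[:]
        rng.shuffle(shuffled)                       # break the feature/load correspondence
        shuffled_importances.append(abs_corr(shuffled, loads))
    return real / (percentile(shuffled_importances, c) + eps)


loads = [10, 12, 11, 15, 14, 13, 16, 18, 17, 20]
candidates = {
    "informative": [9, 11, 12, 14, 13, 14, 15, 17, 18, 19],
    "noise": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
}
scores = {name: null_importance_score(col, loads) for name, col in candidates.items()}
# Keep a preset number of top-ranked categories (here: 1).
target_feature_categories = sorted(scores, key=scores.get, reverse=True)[:1]
```

A feature that tracks the load keeps a high score because shuffling destroys its importance, whereas a feature unrelated to the load scores about the same before and after shuffling.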
Step S1035, taking each initial sample feature data corresponding to each target feature category as target sample feature data, so as to construct a target sample feature data set corresponding to each historical load data.
In this step, after the plurality of target feature categories corresponding to each feature type are determined, the initial sample feature data corresponding to each target feature category can be taken as target sample feature data. Each historical load data thus corresponds to target sample feature data of every target feature category, which gives the target sample feature data set corresponding to each historical load data. This lays the foundation for the subsequent prediction of load distribution based on the sample feature data sets.
Based on the foregoing embodiments, another embodiment of the present application provides a method for predicting load distribution of an energy station. In this embodiment, when load data is predicted for each target time to obtain the predicted load data corresponding to each target time, the prediction is made with a point prediction model trained in advance. That is, the initial point prediction model is trained based on the target feature data set corresponding to each historical load data and the sample time feature of each historical time to obtain a target point prediction model; then, based on the target time feature of each target time, the predicted load data corresponding to each target time is obtained by prediction with the target point prediction model. In this step, the point prediction model is constructed by directly fitting the features to the true values.
In this embodiment, when quantile prediction is performed to obtain the load value corresponding to each target quantile in the period to be predicted, the specific process is as follows: determining sample load values corresponding to a plurality of target quantiles based on each historical load data; training the initial quantile regression prediction model based on each historical load data and the sample load value corresponding to each target quantile to obtain a target quantile regression prediction model; and performing quantile regression prediction on each piece of predicted load data based on the target quantile regression prediction model to obtain the target load value corresponding to each target quantile, so as to obtain the prediction result of load distribution. That is, the prediction result of the point prediction model is taken as the input of the quantile regression model and the residual is taken as the fitting target, so that a probabilistic prediction model of the residual with respect to the features and the load value is constructed; this model is the quantile regression prediction model. In this embodiment, when the quantile regression model is trained, the model parameters are adjusted using a Bayesian optimization algorithm.
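The idea of taking the residual of the point prediction as the fitting target can be sketched as below. The source fits a quantile regression model of the residual on the features and the load value and tunes it with Bayesian optimization; this sketch does neither. As a stated stand-in, it uses the unconditional empirical quantiles of the residuals and also shows the pinball loss that a quantile regression model would typically minimize. All names and numbers are hypothetical.

```python
def pinball_loss(actual, predicted, q):
    """Average quantile (pinball) loss for quantile level q in (0, 1)."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        err = y - y_hat
        total += q * err if err >= 0 else (q - 1) * err
    return total / len(actual)


def empirical_quantile(values, q):
    """q-quantile (0-1) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Made-up history: true loads and the point model's output for the same times.
history_actual = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0, 16.0, 18.0]
history_point = [11.0, 11.0, 12.0, 14.0, 15.0, 13.0, 15.0, 17.0]
residuals = [a - p for a, p in zip(history_actual, history_point)]

future_point = [21.0, 22.5]          # point predictions for two target times
target_quantiles = [0.25, 0.5, 0.75]

# Stand-in for the quantile regression model: shift each point prediction
# by the empirical residual quantile.
forecast = {
    q: [p + empirical_quantile(residuals, q) for p in future_point]
    for q in target_quantiles
}
median_loss = pinball_loss(
    history_actual,
    [p + empirical_quantile(residuals, 0.5) for p in history_point],
    0.5,
)
```

A real quantile regression model would let the residual quantiles vary with the features rather than applying one constant offset per quantile, which is the main thing this simplification leaves out.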
After the target load values corresponding to the respective target quantiles are obtained by prediction, the load distribution can be determined. For example, if the target load value corresponding to the median (the 0.5 quantile) is predicted to be A, then in the period to be predicted 50% of the load values are smaller than the target load value A and 50% of the load values are larger than the target load value A. For another example, if the target load value corresponding to the first quartile Q1 is predicted to be B, the target load value corresponding to the second quartile Q2 is C, and the target load value corresponding to the third quartile Q3 is D, then in the period to be predicted 25% of the load values are smaller than the target load value B, 25% of the load values are in the interval [target load value B, target load value C), 25% of the load values are in the interval [target load value C, target load value D), and 25% of the load values are larger than the target load value D.
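The quartile interpretation above can be checked on known data with a short sketch. Here the quartiles B, C, D are computed from a made-up set of load values rather than predicted, purely to illustrate how the four 25% shares arise; the variable names are hypothetical.

```python
from statistics import quantiles

loads = [float(v) for v in range(1, 101)]   # made-up load values 1..100
b, c, d = quantiles(loads, n=4)             # Q1, Q2 (median), Q3

shares = [
    sum(v < b for v in loads) / len(loads),        # below Q1
    sum(b <= v < c for v in loads) / len(loads),   # in [Q1, Q2)
    sum(c <= v < d for v in loads) / len(loads),   # in [Q2, Q3)
    sum(v >= d for v in loads) / len(loads),       # at or above Q3
]
print(b, c, d, shares)
```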
Another embodiment of the present application provides a device for predicting load distribution of an energy station, as shown in fig. 2, including:
the historical data acquisition module 11 is used for acquiring a plurality of historical load data and the historical time corresponding to each historical load data;
a first feature obtaining module 12, configured to obtain a sample time feature corresponding to each historical time based on each historical time point;
a second feature obtaining module 13, configured to perform feature calculation processing on each historical load data, to obtain a target sample feature data set corresponding to each historical load data, where the target sample feature data set includes a plurality of target feature types;
a first prediction module 14, configured to predict load data for each target time based on a target feature data set corresponding to each historical load data, a sample time feature of each historical time, and a target time feature of each target time in a time period to be predicted, to obtain predicted load data corresponding to each target time;
the second prediction module 15 is configured to predict the quantile of each predicted load data by using a quantile regression prediction mode, so as to obtain a target load value corresponding to each target quantile in a time period to be predicted, so as to obtain a prediction result of load distribution.
In a specific implementation of this embodiment, the second feature acquisition module is specifically configured to perform the following:
based on the preset feature types and a plurality of time ranges, different feature types corresponding to different time ranges under each feature type are determined so as to obtain a plurality of initial feature types;
performing feature calculation processing on each historical load data based on each initial feature type to obtain initial sample feature data corresponding to each historical load data under each initial feature type;
based on each initial sample feature data, screening the initial feature types by means of adversarial validation to obtain a plurality of first feature types;
screening by utilizing a characteristic screening mode of characteristic importance based on the characteristic data of each initial sample corresponding to each first characteristic category so as to screen and obtain a target characteristic category from each first characteristic category;
and taking each initial sample characteristic data corresponding to each target characteristic type as target sample characteristic data to construct and obtain a target sample characteristic data set corresponding to each historical load data.
In a specific implementation process of this embodiment, the second feature acquisition module is specifically configured to: dividing initial sample characteristic data corresponding to each historical load data to obtain a training set and a testing set, and configuring corresponding labels for initial sample characteristics in the training set and the testing set;
Carrying out data combination on the training set and the test set, and carrying out classification model training based on the combined initial sample characteristic data and labels corresponding to the initial characteristic data to obtain a current classification model;
calculating a current AUC value based on the current classification model, and comparing the current AUC value with a preset AUC threshold;
and under the condition that the current AUC value is larger than the AUC threshold, calculating the feature importance corresponding to each initial feature type, deleting each initial sample feature corresponding to the initial feature type with the highest feature importance, and retraining the classification model until the current AUC value of the classification model obtained based on retraining is smaller than the preset AUC threshold, and taking each initial feature type which is not deleted as a first feature type.
In a specific implementation of this embodiment, the second feature acquisition module is specifically configured to:
respectively training a tree model aiming at each initial sample characteristic data corresponding to the same first characteristic category and historical load data corresponding to each initial sample characteristic data to obtain the importance of the original characteristic corresponding to each first characteristic category;
for each initial sample characteristic data corresponding to the same first characteristic type, carrying out scrambling processing on the corresponding relation between each initial sample characteristic data and the historical load data to obtain each historical load data and each initial sample characteristic data with the scrambled corresponding relation;
Based on the history load data and the initial sample feature data with disturbed corresponding relation, training a tree model to obtain the importance of the current feature corresponding to each first feature type;
determining target feature importance of each first feature class based on the original feature importance and the current feature importance corresponding to the same first feature class;
and screening each first characteristic category based on the target characteristic importance corresponding to each first characteristic category to obtain a plurality of target characteristic categories.
In a specific implementation process of this embodiment, the feature types include: statistical features, hysteresis features, and differential features;
the statistical features include: variance, maximum, minimum, average;
the time profile includes: holiday characteristics, holiday characteristics.
In a specific implementation process of this embodiment, the first prediction module is specifically configured to: training the initial point prediction model based on the target characteristic data set corresponding to each historical load data and the sample time characteristics of each historical time to obtain a target point prediction model; and predicting and obtaining predicted load data corresponding to each target time by using the target point prediction model based on the target time characteristics of each target time.
In a specific implementation process of this embodiment, the second prediction module is specifically configured to: determining sample load values corresponding to a plurality of target quantiles based on each historical load data; training the initial quantile regression prediction model based on each historical load data and the sample load value corresponding to each target quantile to obtain a target quantile regression prediction model; and carrying out quantile regression prediction on each piece of predicted load data based on a target quantile regression prediction model to obtain a target load value corresponding to each target quantile so as to obtain a predicted result of load distribution.
According to the prediction device for the load distribution of the energy station, the characteristic data is obtained through extraction of the historical load data, then the load data corresponding to each time point in the future time period can be accurately predicted according to the characteristic data, and then the load value corresponding to each quantile in the future time period can be accurately predicted according to each load data, so that the prediction of the load distribution condition in the future time period is realized, the problem that the load distribution prediction result of the energy station is inaccurate in the prior art is solved, and the guarantee is provided for the follow-up policy reporting based on the reasonable and accurate prediction result.
Another embodiment of the present application provides a storage medium storing a computer program which, when executed by a processor, performs the method steps of:
step one, acquiring a plurality of historical load data and historical time corresponding to each historical load data;
step two, acquiring sample time characteristics corresponding to each historical time based on each historical time;
step three, performing characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
fourth, load data prediction is carried out on each target time based on a target characteristic data set corresponding to each historical load data, sample time characteristics of each historical time and target time characteristics of each target time in a time period to be predicted, and predicted load data corresponding to each target time is obtained;
fifthly, carrying out quantile prediction on each piece of predicted load data by using a quantile regression prediction mode to obtain target load values corresponding to each target quantile in a time period to be predicted so as to obtain a prediction result of load distribution.
For the specific implementation of the above method steps, reference may be made to any of the above embodiments of the method for predicting energy station load distribution; details are not repeated here.
According to the storage medium, the characteristic data are obtained through extraction of the historical load data, then the load data corresponding to each time point in the future time period can be accurately obtained through prediction according to the characteristic data, and the load value corresponding to each quantile in the future time period can be obtained through accurate prediction according to each load data.
Another embodiment of the present application provides an electronic device, as shown in fig. 3, at least including a memory 1 and a processor 2, where the memory 1 stores a computer program, and the processor 2 implements the following method steps when executing the computer program on the memory 1:
step one, acquiring a plurality of historical load data and historical time corresponding to each historical load data;
step two, acquiring sample time characteristics corresponding to each historical time based on each historical time;
step three, performing characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
Fourth, load data prediction is carried out on each target time based on a target characteristic data set corresponding to each historical load data, sample time characteristics of each historical time and target time characteristics of each target time in a time period to be predicted, and predicted load data corresponding to each target time is obtained;
fifthly, carrying out quantile prediction on each piece of predicted load data by using a quantile regression prediction mode to obtain target load values corresponding to each target quantile in a time period to be predicted so as to obtain a prediction result of load distribution.
For the specific implementation of the above method steps, reference may be made to any of the above embodiments of the method for predicting energy station load distribution; details are not repeated here.
According to the electronic equipment, the characteristic data are obtained through extraction of the historical load data, then the load data corresponding to each time point in the future time period can be accurately obtained through prediction according to the characteristic data, and the load value corresponding to each quantile in the future time period can be obtained through accurate prediction according to each load data.
The above embodiments are only exemplary embodiments of the present application and are not intended to limit the present application, the scope of which is defined by the claims. Various modifications and equivalent arrangements may be made to the present application by those skilled in the art, which modifications and equivalents are also considered to be within the scope of the present application.

Claims (10)

1. A method for predicting energy station load distribution, comprising:
acquiring a plurality of historical load data and historical time corresponding to each historical load data;
acquiring sample time characteristics corresponding to each historical time based on each historical time;
performing characteristic calculation processing on each historical load data to obtain a target sample characteristic data set which corresponds to each historical load data and contains a plurality of target characteristic types;
based on the target characteristic data set corresponding to each historical load data, the sample time characteristic of each historical time and the target time characteristic of each target time in the time period to be predicted, carrying out load data prediction on each target time to obtain predicted load data corresponding to each target time;
and carrying out quantile prediction on each piece of predicted load data by using a quantile regression prediction mode to obtain a target load value corresponding to each target quantile in a time period to be predicted so as to obtain a prediction result of load distribution.
2. The method according to claim 1, wherein the performing feature computation on each historical load data to obtain a target sample feature data set corresponding to each historical load data, the target sample feature data set including a plurality of target feature types, specifically includes:
based on the preset feature types and a plurality of time ranges, different feature types corresponding to different time ranges under each feature type are determined so as to obtain a plurality of initial feature types;
performing feature calculation processing on each historical load data based on each initial feature type to obtain initial sample feature data corresponding to each historical load data under each initial feature type;
based on each initial sample feature data, screening the initial feature types by means of adversarial validation to obtain a plurality of first feature types;
screening by utilizing a characteristic screening mode of characteristic importance based on the characteristic data of each initial sample corresponding to each first characteristic category so as to screen and obtain a target characteristic category from each first characteristic category;
and taking each initial sample characteristic data corresponding to each target characteristic type as target sample characteristic data to construct and obtain a target sample characteristic data set corresponding to each historical load data.
3. The method of claim 2, wherein the screening, based on each initial sample feature data, of the initial feature types by means of adversarial validation to obtain a plurality of first feature types specifically comprises:
dividing initial sample characteristic data corresponding to each historical load data to obtain a training set and a testing set, and configuring corresponding labels for initial sample characteristics in the training set and the testing set;
carrying out data combination on the training set and the test set, and carrying out classification model training based on the combined initial sample characteristic data and labels corresponding to the initial characteristic data to obtain a current classification model;
calculating a current AUC value based on the current classification model, and comparing the current AUC value with a preset AUC threshold;
and under the condition that the current AUC value is larger than the AUC threshold, calculating the feature importance corresponding to each initial feature type, deleting each initial sample feature corresponding to the initial feature type with the highest feature importance, and retraining the classification model until the current AUC value of the classification model obtained based on retraining is smaller than the preset AUC threshold, and taking each initial feature type which is not deleted as a first feature type.
4. The method of claim 2, wherein the selecting the target feature class from the first feature classes based on the initial sample feature data corresponding to the first feature classes by using a feature selection method of feature importance, specifically comprises:
respectively training a tree model aiming at each initial sample characteristic data corresponding to the same first characteristic category and historical load data corresponding to each initial sample characteristic data to obtain the importance of the original characteristic corresponding to each first characteristic category;
for each initial sample characteristic data corresponding to the same first characteristic type, carrying out scrambling processing on the corresponding relation between each initial sample characteristic data and the historical load data to obtain each historical load data and each initial sample characteristic data with the scrambled corresponding relation;
based on the history load data and the initial sample feature data with disturbed corresponding relation, training a tree model to obtain the importance of the current feature corresponding to each first feature type;
determining target feature importance of each first feature class based on the original feature importance and the current feature importance corresponding to the same first feature class;
And screening each first characteristic category based on the target characteristic importance corresponding to each first characteristic category to obtain a plurality of target characteristic categories.
5. The method of claim 1, wherein the feature types comprise: statistical features, hysteresis features, and differential features;
the statistical features include: variance, maximum, minimum, average;
the time profile includes: holiday characteristics, holiday characteristics.
6. The method of claim 1, wherein the predicting load data for each target time based on the target feature data set corresponding to each historical load data, the sample time feature of each historical time, and the target time feature of each target time in the to-be-predicted time period, and obtaining the predicted load data corresponding to each target time, specifically comprises:
training the initial point prediction model based on the target characteristic data set corresponding to each historical load data and the sample time characteristics of each historical time to obtain a target point prediction model;
and predicting and obtaining predicted load data corresponding to each target time by using the target point prediction model based on the target time characteristics of each target time.
7. The method of claim 1, wherein the performing a quantile prediction on each of the predicted load data by using a quantile regression prediction method to obtain a load value corresponding to each target quantile in a period to be predicted, so as to obtain a predicted result of load distribution, specifically includes:
determining sample load values corresponding to a plurality of target quantiles based on each historical load data;
training the initial quantile regression prediction model based on each historical load data and the sample load value corresponding to each target quantile to obtain a target quantile regression prediction model;
and carrying out quantile regression prediction on each piece of predicted load data based on a target quantile regression prediction model to obtain a target load value corresponding to each target quantile so as to obtain a predicted result of load distribution.
8. A device for predicting energy station load distribution, comprising:
a historical data acquisition module, configured to acquire a plurality of historical load data and the historical time corresponding to each historical load data;
a first feature acquisition module, configured to acquire the sample time feature corresponding to each historical time based on each historical time;
a second feature acquisition module, configured to perform feature calculation on each historical load data to obtain a target sample feature data set that corresponds to each historical load data and contains a plurality of target feature types;
a first prediction module, configured to predict the load data of each target time based on the target feature data set corresponding to each historical load data, the sample time feature of each historical time, and the target time feature of each target time in the time period to be predicted, to obtain the predicted load data corresponding to each target time; and
a second prediction module, configured to perform quantile prediction on each predicted load data by using a quantile regression prediction method to obtain a target load value corresponding to each target quantile in the time period to be predicted, so as to obtain the prediction result of the load distribution.
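The module structure of claim 8 can be pictured as five components wired in sequence. The sketch below is one possible arrangement under stated assumptions: the class name `LoadDistributionPredictor`, the field names, and the callable signatures are hypothetical, and the lambdas in the usage part are placeholders rather than real models.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class LoadDistributionPredictor:
    """Wires the five modules together; every field is a hypothetical callable."""
    acquire_history: Callable[[], tuple]        # () -> (loads, times)
    time_features: Callable[[Sequence], list]   # times -> time features
    load_features: Callable[[Sequence], list]   # loads -> feature data set
    point_predict: Callable[[list, list, Sequence, list], list]
    quantile_predict: Callable[[Sequence, list], list]

    def run(self, target_times):
        loads, times = self.acquire_history()        # historical data acquisition
        sample_tf = self.time_features(times)        # first feature acquisition
        feats = self.load_features(loads)            # second feature acquisition
        target_tf = self.time_features(target_times)
        point = self.point_predict(feats, sample_tf, loads, target_tf)
        return self.quantile_predict(loads, point)   # second prediction module

# Placeholder modules, only to show how data flows between the stages.
predictor = LoadDistributionPredictor(
    acquire_history=lambda: ([10.0, 20.0, 30.0], [0, 1, 2]),
    time_features=lambda times: [t % 2 for t in times],
    load_features=lambda loads: [{"lag": x} for x in loads],
    point_predict=lambda feats, stf, loads, ttf: [sum(loads) / len(loads)] * len(ttf),
    quantile_predict=lambda loads, point: [
        {"low": p - 5.0, "mid": p, "high": p + 5.0} for p in point
    ],
)
result = predictor.run([3, 4])
```

Keeping each module behind a plain callable makes it possible to swap in a different point model or quantile model without touching the orchestration in `run`.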
9. A storage medium storing a computer program which, when executed by a processor, carries out the steps of the method of predicting energy station load distribution according to any one of claims 1 to 7.
10. An electronic device comprising at least a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program stored in the memory, implements the steps of the method for predicting energy station load distribution according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311499749.0A CN117521897A (en) 2023-11-10 2023-11-10 Method, device, medium and equipment for predicting load distribution of energy station

Publications (1)

Publication Number Publication Date
CN117521897A CN117521897A (en) 2024-02-06

Family

ID=89763827

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN202311499749.0A Pending CN117521897A (en) 2023-11-10 2023-11-10 Method, device, medium and equipment for predicting load distribution of energy station

Country Status (1)

Country Link
CN (1) CN117521897A (en)

Similar Documents

Publication Publication Date Title
CN112699913B (en) Method and device for diagnosing abnormal relationship of household transformer in transformer area
Wang et al. Data-driven mode identification and unsupervised fault detection for nonlinear multimode processes
CN111199016A (en) DTW-based improved K-means daily load curve clustering method
CN111291783A (en) Intelligent fault diagnosis method, system, terminal and storage medium for gas pressure regulating equipment
CN111625516A (en) Method and device for detecting data state, computer equipment and storage medium
CN113438114B (en) Method, device, equipment and storage medium for monitoring running state of Internet system
CN115270986A (en) Data anomaly detection method and device and computer equipment
CN110794360A (en) Method and system for predicting fault of intelligent electric energy meter based on machine learning
WO2023207557A1 (en) Method and apparatus for evaluating robustness of service prediction model, and computing device
CN115004652B (en) Business wind control processing method and device, electronic equipment and storage medium
CN115221233A (en) Transformer substation multi-class live detection data anomaly detection method based on deep learning
CN116307059A (en) Power distribution network region fault prediction model construction method and device and electronic equipment
CN114707834A (en) Alarm reminding method and device and storage medium
CN114781532A (en) Evaluation method and device of machine learning model, computer equipment and medium
CN109800782A (en) Electric network fault detection method and device based on fuzzy kNN algorithm
CN117521897A (en) Method, device, medium and equipment for predicting load distribution of energy station
CN111080489A (en) Load trend prediction method, system and equipment
CN115905715A (en) Internet data analysis method and platform based on big data and artificial intelligence
CN116151799A (en) BP neural network-based distribution line multi-working-condition fault rate rapid assessment method
CN115410250A (en) Array type human face beauty prediction method, equipment and storage medium
CN114118680A (en) Network security situation assessment method and system
Saarinen Adaptive real-time anomaly detection for multi-dimensional streaming data
CN114330090A (en) Defect detection method and device, computer equipment and storage medium
CN111754103A (en) Enterprise risk image method, device, computer equipment and readable storage medium
CN112053219A (en) OCSVM-based consumption financial fraud behavior detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination