CN116128296A - Risk prediction method, risk prediction device, electronic equipment and storage medium - Google Patents

Publication number
CN116128296A
Authority
CN
China
Prior art keywords
data
prediction
model
sample set
online rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310008226.5A
Other languages
Chinese (zh)
Inventor
陈凤超
邱泽坚
李祺威
苏俊妮
黄安平
张鑫
胡润锋
汪杰
段孟雍
何毅鹏
周立德
邓景柱
张锐
饶欢
刘沛林
萧嘉荣
黄达区
邵伟涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd, Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd filed Critical Guangdong Power Grid Co Ltd
Priority to CN202310008226.5A priority Critical patent/CN116128296A/en
Publication of CN116128296A publication Critical patent/CN116128296A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06Electricity, gas or water supply
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The invention discloses a risk prediction method, a risk prediction device, electronic equipment and a storage medium. The method comprises: acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data; predicting the input target online rate data of the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set; and determining a risk prediction grade of the second time period according to the online rate prediction data. The accuracy of predicting the risk prediction grade is thereby improved.

Description

Risk prediction method, risk prediction device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer application technologies, and in particular, to a risk prediction method, a risk prediction device, an electronic device, and a storage medium.
Background
After a large number of distributed energy sources are integrated into the power distribution network, remote control switches are used far more frequently in the operation of the power distribution network. If a remote control switch fails to execute a remote control command, the operation efficiency of the power distribution network is greatly affected, and the stable and safe operation of the power distribution network is put at risk.
Currently, in order to prevent safety problems caused by distribution network operation risk, the risk level of distribution network operation is generally predicted. In the prior art, a dispatching controller (remote switch) is often selected as the main research object for dynamic risk perception of distribution network operation, and a deep learning model is trained on the dispatching data of the remote switch to predict the risk level, but the accuracy of the prediction result is generally poor.
Disclosure of Invention
The invention provides a risk prediction method, a risk prediction device, electronic equipment and a storage medium, which are used for solving the technical problem of poor accuracy in predicting the risk prediction grade.
According to an aspect of the present invention, there is provided a risk prediction method, wherein the method includes:
acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data;
predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set;
and determining a risk prediction grade of the second time period according to the online rate prediction data.
According to another aspect of the present invention, there is provided a risk prediction apparatus, wherein the apparatus includes:
the data acquisition module is used for acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data;
the data prediction module is used for predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set;
and the grade prediction module is used for determining the risk prediction grade of the second time period according to the online rate prediction data.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the risk prediction method of any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a risk prediction method according to any one of the embodiments of the present invention.
According to the technical scheme, historical online rate data in a first time period are acquired and preprocessed to obtain target online rate data; the preprocessing yields digitized target online rate data that the model can readily recognize and predict from, which improves the efficiency and accuracy with which the model produces remote control prediction results. The input target online rate data of the first time period are predicted by a trained online rate prediction model, obtained by training a random forest model through a training sample set, to obtain online rate prediction data in a second time period; the online rate prediction data obtained in this way are highly accurate, so that the risk prediction grade can be predicted more accurately. The risk prediction grade of the second time period is then determined according to the online rate prediction data, so that the accuracy of predicting the risk prediction grade is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a risk prediction method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a risk prediction method according to a second embodiment of the present invention;
FIG. 3 is an overall flow chart of a risk prediction method provided in accordance with an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a risk prediction device according to a third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device implementing a risk prediction method according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a risk prediction method according to an embodiment of the present invention, where the method may be applied to a situation of risk prediction of a distribution network, and the method may be performed by a risk prediction device, where the risk prediction device may be implemented in a form of hardware and/or software, and the risk prediction device may be configured in a computer. As shown in fig. 1, the method includes:
S110, acquiring historical online rate data of a first time period, and preprocessing the historical online rate data to obtain target online rate data.
The first time period may be understood as a preset time period over which the historical online rate data are acquired. In the embodiment of the present invention, the first time period may be preset according to scene requirements, which is not specifically limited herein. Alternatively, the first time period may be the week preceding the period to be predicted.
The historical online rate data may be understood as online rate data from a past period. Alternatively, the historical online rate data may be any data that has an effect on predicting the online rate prediction data. In the embodiment of the present invention, the historical online rate data may be preset according to scene requirements, which is not specifically limited herein. By way of example, the historical online rate data may include controller scheduling information, scheduling controller failure information, scheduling-controller-to-terminal mapping information, terminal online rate data, terminal location information, and the like. The terminal online rate data may be obtained by dividing the online time of each terminal by the total time. For example, if the online time of a terminal is 90 h and the total time is 100 h, the online rate of the terminal is 90%.
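The online rate calculation described above can be illustrated with a minimal Python sketch. The function name `online_rate` and its parameters are hypothetical and are not taken from the patent; the sketch only restates the division of online time by total time.

```python
def online_rate(online_hours: float, total_hours: float) -> float:
    """Return the fraction of the observation window during which a terminal was online."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return online_hours / total_hours


# Example from the text: 90 h online out of 100 h in total.
rate = online_rate(90, 100)
print(f"{rate:.0%}")  # prints 90%
```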
The target online rate data may be understood as the data obtained by preprocessing the historical online rate data. In the embodiment of the invention, preprocessing the historical online rate data standardizes and digitizes the resulting target online rate data, so that the online rate prediction model can process the target online rate data more efficiently.
S120, predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set.
The online rate prediction model may be understood as a model for making predictions from the input target online rate data of the first time period.
The second time period may be understood as the time period covered by the online rate prediction data. Alternatively, the second time period may be the one-day period following the first time period.
The online rate prediction data may be understood as the data predicted by the trained online rate prediction model from the input target online rate data of the first time period. Alternatively, the online rate prediction data may be the predicted online rate of the terminal for the second time period. Illustratively, the online rate prediction data may be 80%, 90%, 95%, etc.
The training sample set may be understood as a sample set for training the random forest model. It is to be appreciated that the data in the training sample set may be data of the same data type as the target online rate data.
The random forest model may be understood as the classification model that is trained to serve as the online rate prediction model. It is to be understood that the random forest is an extended variant of the bootstrap aggregating (bagging) algorithm. The random forest algorithm builds a bagging ensemble with decision trees as base learners, and further introduces random attribute selection into the training of each decision tree. Specifically, when selecting a partition attribute, a conventional decision tree selects the optimal attribute from all candidate attributes of the current node (say there are d of them); in the random forest algorithm, for each node of a base decision tree, a subset containing k attributes is first randomly selected from the candidate attribute set of the node, and the optimal attribute is then selected from that subset for the split. Thus, the diversity of the base learners of a random forest comes not only from perturbation of the samples but also from perturbation of the attributes, which further enhances the generalization ability of the final ensemble. The random forest is characterized in that: each individual learner is a decision tree; the training samples are sampled; and the attributes are randomly sampled.
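The two sources of randomness described above (sample perturbation and attribute perturbation) can be sketched in plain Python. This is a deliberately simplified illustration, not the patent's implementation: each base learner is a depth-one regression tree (a stump), so the per-node choice of k attributes happens once per tree, and the names `fit_stump`, `fit_forest` and `predict` are hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows, targets, attrs):
    """Choose the best (attribute, threshold) split among `attrs` by squared error."""
    best = None
    for a in attrs:
        for t in sorted({r[a] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[a] <= t]
            right = [y for r, y in zip(rows, targets) if r[a] > t]
            if not left or not right:
                continue
            lm, rm = mean(left), mean(right)
            err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or err < best[0]:
                best = (err, a, t, lm, rm)
    if best is None:  # no valid split exists: fall back to a constant prediction
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_forest(rows, targets, n_trees=25, k=1, seed=0):
    """Train stumps on bootstrap samples, each restricted to k random attributes."""
    rng = random.Random(seed)
    d = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample perturbation (bootstrap)
        attrs = rng.sample(range(d), k)                 # attribute perturbation
        forest.append(
            fit_stump([rows[i] for i in idx], [targets[i] for i in idx], attrs)
        )
    return forest


def predict(forest, row):
    """Average the predictions of all stumps in the ensemble."""
    return mean(lm if row[a] <= t else rm for a, t, lm, rm in forest)
```

A full random forest grows deeper trees and redraws the k-attribute subset at every node; the stump version keeps the sketch short while preserving both kinds of perturbation.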
S130, determining a risk prediction grade of the second time period according to the online rate prediction data.
The risk prediction level may be understood as a predicted risk level of the distribution network operation in the second period according to the online rate prediction data. Optionally, the determining the risk prediction grade of the second period according to the online rate prediction data includes:
and determining the risk prediction grade of the second time period according to the online rate prediction data and a preset risk threshold.
The preset risk threshold may be understood as a threshold for determining the risk prediction grade corresponding to the online rate prediction data. In the embodiment of the present invention, the preset risk threshold may be preset according to scene requirements, which is not specifically limited herein. Alternatively, the preset risk thresholds may be 20%, 50% and 80%. Specifically, in the case that the online rate prediction data does not exceed 20%, the risk prediction grade is determined as risk-free; in the case that the online rate prediction data exceeds 20% and does not exceed 50%, the risk prediction grade is determined as low risk; in the case that the online rate prediction data exceeds 50% and does not exceed 80%, the risk prediction grade is determined as medium risk; and in the case that the online rate prediction data exceeds 80%, the risk prediction grade is determined as high risk.
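The threshold comparison above can be sketched as a small lookup. The thresholds and their ordering follow the text as written; the function name `risk_level` is hypothetical, and the label used for the third grade ("medium risk") is an assumption about the intended wording.

```python
from bisect import bisect_left

# Thresholds from the text: 20%, 50% and 80%.
THRESHOLDS = [0.20, 0.50, 0.80]
# Grade labels; "medium risk" is an assumed name for the third grade.
LEVELS = ["risk-free", "low risk", "medium risk", "high risk"]


def risk_level(predicted_rate: float) -> str:
    """Map predicted online rate data to a risk prediction grade."""
    # bisect_left counts the thresholds strictly below the rate, so a rate
    # equal to a threshold "does not exceed" it and stays in the lower grade.
    return LEVELS[bisect_left(THRESHOLDS, predicted_rate)]


for value in (0.20, 0.35, 0.80, 0.95):
    print(value, risk_level(value))
```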
According to the technical scheme, historical online rate data in a first time period are acquired and preprocessed to obtain target online rate data; the preprocessing yields digitized target online rate data that the model can readily recognize and predict from, which improves the efficiency and accuracy with which the model produces remote control prediction results. The input target online rate data of the first time period are predicted by a trained online rate prediction model, obtained by training a random forest model through a training sample set, to obtain online rate prediction data in a second time period; the online rate prediction data obtained in this way are highly accurate, so that the risk prediction grade can be predicted more accurately. The risk prediction grade of the second time period is then determined according to the online rate prediction data, so that the accuracy of predicting the risk prediction grade is improved.
Example two
FIG. 2 is a flowchart of a risk prediction method according to a second embodiment of the present invention. On the basis of the above embodiment, this embodiment adds steps that are performed before the target online rate data of the first time period is predicted by the trained online rate prediction model to obtain the online rate prediction data of the second time period. As shown in FIG. 2, the method includes:
S210, training the random forest model through the training sample set to obtain a preliminary prediction model.
The preliminary prediction model may be understood as the model obtained by training the random forest model through the training sample set.
Optionally, before training the random forest model through the training sample set to obtain a preliminary prediction model, the method includes:
acquiring an original online rate sample set, and performing data cleaning on the original online rate sample set to obtain a first sample set;
extracting preset fields from the first sample set, and determining corresponding terminal characteristic values to obtain a sample data set;
dividing the sample data set based on a preset proportion to obtain the training sample set and the test sample set.
The original online rate sample set may be understood as a set of original online rate sample data. It is appreciated that in embodiments of the present invention, the data of the original online rate sample set may be data of the same data type as the historical online rate data.
The first sample set may be understood as a sample set obtained by performing data cleaning on the original online rate sample set. Optionally, the performing data cleaning on the original online rate sample set to obtain a first sample set includes:
extracting the identifiers of the terminal and the scheduling controller in the original online rate sample set to obtain a terminal identifier and a scheduling controller identifier;
and carrying out mapping relation matching on the scheduling controller and the terminal based on the terminal identifier and the scheduling controller identifier, and cleaning data without a mapping relation to obtain the first sample set.
The terminal identifier may be understood as a mark for identifying the terminal. Alternatively, the terminal identifier may be a terminal number or a terminal ID, etc. The scheduling controller identifier may be understood as a mark for identifying the scheduling controller. Alternatively, the scheduling controller identifier may be a scheduling controller number or a scheduling controller ID, etc.
The preset field may be understood as a field in the first sample set that has an effect on training the random forest model. In the embodiment of the present invention, the preset field may be set according to scene requirements, which is not specifically limited herein. Optionally, the preset field may be controller scheduling information, terminal location information, terminal online rate data, or scheduling controller failure information. The preset field may be, for example, a city bureau, a county bureau, a communication mode, a communication protocol, a terminal type, an operator, and the like.
The terminal characteristic value can be understood as a corresponding characteristic value obtained by carrying out standardization and digitalization processing on the preset field.
The sample data set may be understood as a sample set obtained by preprocessing the original online rate sample set. It is understood that the data in the sample data set may be data of the same data type as the target online rate data. The sample data set includes the training sample set and the test sample set.
The preset ratio may be understood as a ratio for dividing the sample data set into the training sample set and the test sample set. In the embodiment of the present invention, the preset ratio may be preset according to the scene requirement, which is not specifically limited herein. Alternatively, the preset ratio may be 2:1.
The test sample set may be understood as a sample set for testing the preliminary prediction model.
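A minimal sketch of dividing a sample data set by a preset ratio such as 2:1 is given below. The function name `split_samples` is hypothetical, and shuffling before the split is an assumption of this sketch; the text only specifies the ratio.

```python
import random


def split_samples(samples, ratio=(2, 1), seed=0):
    """Split samples into (training, test) sets according to ratio, e.g. 2:1."""
    items = list(samples)
    # Assumption: samples are shuffled with a fixed seed before splitting.
    random.Random(seed).shuffle(items)
    cut = len(items) * ratio[0] // sum(ratio)
    return items[:cut], items[cut:]


train, test = split_samples(range(9))
print(len(train), len(test))  # prints 6 3
```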
Specifically, the original online rate sample set is obtained from the database, each terminal identifier and each scheduling controller identifier in the original online rate sample set are extracted, mapping relation matching is carried out between the terminals and the scheduling controllers, and data without a mapping relation is cleaned to obtain the first sample set; further, the preset fields in the first sample set that have an influence on training the random forest model are extracted, standardized and digitized to obtain the corresponding terminal characteristic values and the sample data set.
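The cleaning and digitizing steps in this paragraph can be sketched as follows. The record layout (keys such as `terminal_id`, `controller_id` and `operator`), the function names, and the integer-code encoding are all assumptions made for illustration; the patent does not specify a data format or an encoding scheme.

```python
def clean_unmapped(records, mapping):
    """Keep only records whose terminal is mapped to the recorded scheduling controller."""
    return [r for r in records if mapping.get(r["terminal_id"]) == r["controller_id"]]


def encode_fields(records, fields):
    """Replace each categorical field value with an integer code (order of first appearance)."""
    codes = {f: {} for f in fields}
    encoded = []
    for r in records:
        row = dict(r)
        for f in fields:
            row[f] = codes[f].setdefault(r[f], len(codes[f]))
        encoded.append(row)
    return encoded, codes


# Hypothetical records and controller-to-terminal mapping.
records = [
    {"terminal_id": "T1", "controller_id": "C1", "operator": "A"},
    {"terminal_id": "T2", "controller_id": "C9", "operator": "B"},  # no mapping relation
    {"terminal_id": "T3", "controller_id": "C2", "operator": "B"},
]
mapping = {"T1": "C1", "T2": "C2", "T3": "C2"}

first_sample_set = clean_unmapped(records, mapping)
sample_data, codes = encode_fields(first_sample_set, ["operator"])
print(sample_data)
```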
Optionally, the training the random forest model through the training sample set to obtain a preliminary prediction model includes:
determining the first time period and the second time period corresponding to the prediction period based on a preset prediction period;
for the training sample set, training the random forest model based on the prediction period, taking the data of the first time period of the prediction period as input data and the data of the second time period of the prediction period as a label, to obtain the preliminary prediction model.
The prediction period may be understood as the data period used for training the random forest model. In the embodiment of the present invention, the prediction period may be preset according to scene requirements, which is not specifically limited herein. Alternatively, the prediction period may be eight days. Further, the prediction period may be divided into the first time period and the second time period. Alternatively, the first time period may be the first seven days of the prediction period, and the second time period may be the eighth day of the prediction period.
Specifically, in the process of training the random forest model based on the training sample set, the terminal online rate data of the first time period of the prediction period and the related terminal characteristic values are used as input data, and the terminal online rate data of the second time period of the prediction period is used as the label, to adjust the model parameters of the random forest model and obtain the preliminary prediction model.
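The division of daily online rate data into input and label windows can be sketched as below, assuming an eight-day prediction period (seven input days, one label day) that slides forward one day at a time. The function name `make_windows` and the sample values are hypothetical.

```python
def make_windows(daily_rates, input_days=7, label_days=1):
    """Slide a prediction period over the series, yielding (input, label) pairs."""
    period = input_days + label_days
    return [
        (daily_rates[s:s + input_days], daily_rates[s + input_days:s + period])
        for s in range(len(daily_rates) - period + 1)
    ]


# Ten days of made-up online rates produce three overlapping eight-day periods.
rates = [0.91, 0.93, 0.90, 0.88, 0.95, 0.97, 0.92, 0.94, 0.96, 0.89]
for inputs, label in make_windows(rates):
    print(inputs, "->", label)
```

In the embodiment, the related terminal characteristic values would be appended to each input window before training; they are omitted here to keep the windowing step visible.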
S220, testing the preliminary prediction model through the test sample set, and taking the preliminary prediction model as the online rate prediction model under the condition that preset conditions are met.
The preset condition may be understood as a condition for judging whether the preliminary prediction model can be used as the online rate prediction model. In the embodiment of the present invention, the preset condition may be preset according to scene requirements, which is not specifically limited herein.
Optionally, the testing the preliminary prediction model through the test sample set, and taking the preliminary prediction model as the online rate prediction model when a preset condition is met, includes:
testing the preliminary prediction model through the test sample set to obtain a model evaluation index corresponding to the preliminary prediction model, wherein the model evaluation index comprises an accuracy rate, a mean absolute error value and a mean square error value;
and taking the preliminary prediction model as the online rate prediction model under the condition that the model evaluation index corresponding to the preliminary prediction model meets the preset condition.
The model evaluation index may be understood as an index for evaluating the preliminary prediction model. Optionally, the model evaluation index may include: an accuracy rate, a mean absolute error value, and a mean square error value. The accuracy rate can be understood as the probability that a prediction of the preliminary prediction model is accurate.
Specifically, the formula for calculating the mean absolute error value of the preliminary prediction model may be:
$$\mathrm{MAE}(X, h) = \frac{1}{m} \sum_{i=1}^{m} \left| h\left(x^{(i)}\right) - y^{(i)} \right|$$
where $\mathrm{MAE}(X, h)$ represents the mean absolute error value, $m$ represents the number of instances in the test sample set, $x^{(i)}$ represents the vector of all feature values of the $i$-th instance in the test sample set, $y^{(i)}$ represents its label, $X$ represents the matrix of all feature values for all instances in the test sample set, and $h$ represents the prediction function of the system.
Specifically, the formula for calculating the mean square error value of the preliminary prediction model may be:
$$\mathrm{RMSE}(X, h) = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( h\left(x^{(i)}\right) - y^{(i)} \right)^{2}}$$
where $\mathrm{RMSE}(X, h)$ represents the (root) mean square error value, $m$ represents the number of instances in the test sample set, $x^{(i)}$ represents the vector of all feature values of the $i$-th instance in the test sample set, $y^{(i)}$ represents its label, $X$ represents the matrix of all feature values for all instances in the test sample set, and $h$ represents the prediction function of the system.
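The two error measures can be computed as in the sketch below, assuming the standard definitions of mean absolute error and root mean square error (the symbol RMSE in the text suggests the root form). The function names and the sample values are hypothetical; `predictions` stands for the values h(x) on the test sample set.

```python
import math


def mae(predictions, labels):
    """Mean absolute error over the m test instances."""
    return sum(abs(p - y) for p, y in zip(predictions, labels)) / len(labels)


def rmse(predictions, labels):
    """Root mean square error over the m test instances."""
    return math.sqrt(
        sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)
    )


# Made-up predicted and actual online rates for three test instances.
predicted = [0.90, 0.80, 0.70]
actual = [1.00, 0.80, 0.50]
print(round(mae(predicted, actual), 4), round(rmse(predicted, actual), 4))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is one reason the two are reported together.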
In the embodiment of the invention, the sample data set may be divided by district and county, and a random forest model may be trained separately on each division, so as to obtain an online rate prediction model corresponding to each district and county.
S230, acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data.
S240, predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set.
S250, determining a risk prediction grade of the second time period according to the online rate prediction data.
According to the technical scheme, the random forest model is trained through the training sample set to obtain a preliminary prediction model; the preliminary prediction model is tested through the test sample set, and is taken as the online rate prediction model under the condition that a preset condition is met. This improves the accuracy of the online rate prediction model.
Fig. 3 is an overall flowchart of a risk prediction method according to an embodiment of the present invention, and as shown in fig. 3, the overall flow of the risk prediction method may be:
1. Input the original online rate sample set and clean out data that has no mapping relationship. An original online rate sample set of the scheduling terminals is extracted from the database, records are matched one-to-one according to the mapping relationship between scheduling controllers and terminals, and data without a mapping relationship is eliminated.
2. Extract the terminal characteristic values. The terminal characteristic values required for prediction are processed and extracted from preset fields in the data, and include: district and city office, district and county office, communication mode, communication protocol, terminal type, operator, etc.
3. Divide the data according to the prediction period. As required, the terminal online rate data is divided into a format in which 7 consecutive days are used to predict the following 1 day, with a sliding window of 1 day.
4. Divide the sample data set, train the model, and test the model's performance on the test sample set. The training sample set and the test sample set are divided in a 2:1 ratio, the training samples are fed into a random forest model separately for each district and county office, and each trained model is then tested on the corresponding test sample set.
5. Determine the risk prediction level. The risk level is evaluated according to the output online rate.
6. Supplement newly added data. Historical online rate data is continuously imported to train the model, realizing self-learning of risk knowledge.
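The division in step 3 can be illustrated with a minimal sketch. It assumes the layout described above (7 consecutive days as input, the following 1 day as the target, window advanced by 1 day); the function name `make_windows` and the sample values are hypothetical and are not taken from the embodiment.

```python
from typing import List, Tuple


def make_windows(
    daily_rates: List[float],
    input_days: int = 7,
    target_days: int = 1,
    stride: int = 1,
) -> List[Tuple[List[float], List[float]]]:
    """Split one terminal's daily online rates into (input, target) pairs."""
    windows = []
    # Last start index that still leaves room for a full input and target span.
    last_start = len(daily_rates) - input_days - target_days
    for start in range(0, last_start + 1, stride):
        split = start + input_days
        windows.append(
            (daily_rates[start:split], daily_rates[split:split + target_days])
        )
    return windows


# Hypothetical daily online rates (fraction of time online) for one terminal.
rates = [0.98, 0.97, 0.99, 0.95, 0.96, 0.94, 0.93, 0.90, 0.92, 0.91]
pairs = make_windows(rates)
```

With ten days of data this yields three overlapping windows; a series shorter than eight days yields none.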
In the present invention, after the original online rate data of the scheduling terminals is received, the terminal numbers in the data are extracted and matched against the scheduling controllers by mapping relationship. The processed data is then divided into a training sample set and a test sample set, the model is trained on the training sample set, and the risk level is assessed from the output online rate, so that online risk perception is predicted from the original online rate data.
In the present invention, starting from the original online rate data and combined with the mapping relationship between terminals and controllers, the characteristic values of the related terminals are extracted and processed, and the data is processed according to the required prediction period. The original online rate data and the terminal characteristic value data are then combined and input into machine learning algorithm models for training; the random forest algorithm, which gives the best effect, is selected for prediction, and the extraction and conjunctive relations of evidence factors are analyzed. Combining a theoretical physical model with a data-driven model helps clarify the mechanism by which distribution network scheduling risk arises, so that risk is better perceived and warned of early from the terminal side.
Example III
Fig. 4 is a schematic structural diagram of a risk prediction apparatus according to a third embodiment of the present invention. As shown in fig. 4, the apparatus includes: a data acquisition module 310, a data prediction module 320, and a rank prediction module 330.
The data acquisition module 310 is configured to acquire historical online rate data in a first period, and perform preprocessing on the historical online rate data to obtain target online rate data;
the data prediction module 320 is configured to predict, by using a trained online rate prediction model, the input target online rate data in the first time period to obtain online rate prediction data in a second time period, where the online rate prediction model is obtained by training a random forest model through a training sample set;
and the level prediction module 330 is configured to determine a risk prediction level of the second period according to the online rate prediction data.
According to the technical scheme of this embodiment, historical online rate data in the first time period is acquired and preprocessed to obtain digitized target online rate data, which the model can readily identify and predict from, improving the prediction efficiency and accuracy of the model on remote control prediction results; the input target online rate data in the first time period is predicted through a trained online rate prediction model to obtain online rate prediction data in a second time period, where the online rate prediction model is obtained by training a random forest model through a training sample set, yielding highly accurate online rate prediction data so that the risk prediction grade can be predicted more accurately; and the risk prediction grade of the second time period is determined according to the online rate prediction data, which improves the accuracy of the predicted risk prediction grade.
Optionally, the level prediction module 330 is configured to:
and determining the risk prediction grade of the second time period according to the online rate prediction data and a preset risk threshold.
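As a minimal sketch of comparing predicted online rates with preset risk thresholds: the threshold values, the level names, and the assumption that a lower online rate means a higher risk are all hypothetical and are not specified in the embodiment.

```python
from typing import Sequence, Tuple

# Hypothetical thresholds: (minimum predicted online rate, risk level),
# checked from the highest minimum to the lowest.
RISK_THRESHOLDS: Sequence[Tuple[float, str]] = (
    (0.95, "low"),
    (0.85, "medium"),
)
DEFAULT_LEVEL = "high"  # assigned when the rate is below every threshold


def risk_level(
    predicted_rate: float,
    thresholds: Sequence[Tuple[float, str]] = RISK_THRESHOLDS,
) -> str:
    """Map a predicted online rate for the second time period to a risk level."""
    for minimum, level in thresholds:
        if predicted_rate >= minimum:
            return level
    return DEFAULT_LEVEL
```

Keeping the thresholds in a table rather than in nested conditionals makes it straightforward to adjust the preset values per deployment.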
Optionally, the risk prediction device further includes: a model training module and a model test module.
The model training module is used for training the random forest model through the training sample set to obtain a preliminary prediction model, before the input target online rate data in the first time period is predicted through the trained online rate prediction model to obtain online rate prediction data in the second time period;
the model test module is used for testing the preliminary prediction model through the test sample set, and taking the preliminary prediction model as the online rate prediction model under the condition that the preset condition is met.
Optionally, the model test module is configured to:
testing the preliminary prediction model through the test sample set to obtain a model evaluation index corresponding to the preliminary prediction model, wherein the model evaluation index comprises an accuracy rate, an average absolute error value and a mean square error value;
and taking the preliminary prediction model as the online rate prediction model under the condition that the model evaluation index corresponding to the preliminary prediction model meets the preset condition.
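The evaluation described above can be sketched as follows. The embodiment names accuracy, mean absolute error, and mean square error but does not define accuracy for a continuous online rate or give acceptance limits, so the tolerance-based accuracy and every limit in `meets_preset_condition` are assumptions made for illustration.

```python
from typing import Dict, Sequence


def evaluate(
    y_true: Sequence[float], y_pred: Sequence[float], tolerance: float = 0.05
) -> Dict[str, float]:
    """Compute accuracy (share of predictions within `tolerance`), MAE and MSE."""
    n = len(y_true)
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return {
        "accuracy": sum(e <= tolerance for e in errors) / n,
        "mae": sum(errors) / n,
        "mse": sum(e * e for e in errors) / n,
    }


def meets_preset_condition(
    metrics: Dict[str, float],
    min_accuracy: float = 0.9,  # hypothetical limit
    max_mae: float = 0.05,      # hypothetical limit
    max_mse: float = 0.01,      # hypothetical limit
) -> bool:
    """Accept the preliminary model only if every index satisfies its limit."""
    return (
        metrics["accuracy"] >= min_accuracy
        and metrics["mae"] <= max_mae
        and metrics["mse"] <= max_mse
    )
```

A preliminary model whose metrics fail any limit would not be adopted as the online rate prediction model under this sketch.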
Optionally, the risk prediction device further includes: a sample acquisition module, a data cleaning module, and a sample dividing module.
The sample acquisition module is used for acquiring an original online rate sample set and performing data cleaning on the original online rate sample set to obtain a first sample set, before the random forest model is trained through the training sample set to obtain a preliminary prediction model;
the data cleaning module is used for determining a corresponding terminal characteristic value by carrying out preset field extraction on the first sample set so as to obtain a sample data set;
the sample dividing module is used for dividing the sample data set based on a preset proportion to obtain the training sample set and the test sample set.
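A minimal sketch of dividing a sample data set by a preset proportion is shown below. The 2:1 default follows the ratio given in the overall flow; the function name `split_samples`, the shuffling step, and the fixed seed are assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], train_parts: int = 2, test_parts: int = 1, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the samples and split them train_parts:test_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_samples(range(9))
```

The embodiment does not say whether the division is random or chronological. For time-ordered online rate data, a chronological split (dropping the shuffle) is a common alternative that keeps test samples later than training samples.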
Optionally, the data cleaning module is configured to:
extracting the identifiers of the terminal and the scheduling controller in the original online rate sample set to obtain a terminal identifier and a scheduling controller identifier;
and carrying out mapping relation matching on the scheduling controller and the terminal based on the terminal identification and the scheduling controller identification, and cleaning data without mapping relation to obtain a first sample set.
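The mapping-based cleaning can be sketched as a simple filter. The record layout, the field names `controller_id` and `terminal_id`, and the representation of the mapping relationship as a set of identifier pairs are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def clean_unmapped(
    records: Iterable[Dict[str, object]],
    mapping: Set[Tuple[str, str]],
) -> List[Dict[str, object]]:
    """Keep only records whose (controller, terminal) pair has a known mapping."""
    return [
        record
        for record in records
        if (record.get("controller_id"), record.get("terminal_id")) in mapping
    ]


# Hypothetical mapping relationship and raw online rate records.
known_pairs = {("C1", "T1"), ("C1", "T2")}
raw = [
    {"controller_id": "C1", "terminal_id": "T1", "online_rate": 0.98},
    {"controller_id": "C2", "terminal_id": "T9", "online_rate": 0.40},
    {"controller_id": "C1", "terminal_id": "T2", "online_rate": 0.91},
]
first_sample_set = clean_unmapped(raw, known_pairs)
```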
Optionally, the model training module is configured to:
determining the first time period and the second time period corresponding to the prediction period based on a preset prediction period;
for the training sample set, training the random forest model based on the prediction period, taking the data of the first time period of the preset prediction period as input data and the data of the second time period of the preset prediction period as the label, to obtain a preliminary prediction model.
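The training step can be sketched with scikit-learn's `RandomForestRegressor`. The embodiment does not name a library or hyperparameters, so the library choice, the synthetic data, `n_estimators=50`, and the chronological 2:1 split are assumptions. The sketch also omits the terminal characteristic values, which the overall flow combines with the online rate data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# First and second time periods of one prediction period (7 days in, 1 day out).
INPUT_DAYS, TARGET_DAYS = 7, 1

# Synthetic daily online rates standing in for one terminal's history.
rng = np.random.default_rng(0)
history = np.clip(0.95 + 0.03 * rng.standard_normal(60), 0.0, 1.0)

# Slide a window one day at a time: 7 days of input, the next day as the label.
n_windows = len(history) - INPUT_DAYS - TARGET_DAYS + 1
X = np.array([history[i:i + INPUT_DAYS] for i in range(n_windows)])
y = np.array([history[i + INPUT_DAYS] for i in range(n_windows)])

# Chronological 2:1 split so the test windows lie after the training windows.
cut = n_windows * 2 // 3
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X[:cut], y[:cut])

# Predicted online rate for the day following each test window.
predicted = model.predict(X[cut:])
```

Because a random forest regressor averages training labels, its predictions stay within the range of online rates seen during training.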
The risk prediction device provided by the embodiment of the invention can execute the risk prediction method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example IV
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the risk prediction method.
In some embodiments, the risk prediction method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the risk prediction method described above may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the risk prediction method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A risk prediction method, comprising:
acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data;
predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set;
and determining a risk prediction grade of the second time period according to the online rate prediction data.
2. The method of claim 1, wherein determining a risk prediction rating for the second time period based on the online rate prediction data comprises:
and determining the risk prediction grade of the second time period according to the online rate prediction data and a preset risk threshold.
3. The method of claim 1, comprising, prior to predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period:
training the random forest model through the training sample set to obtain a preliminary prediction model;
and testing the preliminary prediction model through the test sample set, and taking the preliminary prediction model as the online rate prediction model under the condition that a preset condition is met.
4. A method according to claim 3, wherein said testing the preliminary prediction model through the test sample set, and taking the preliminary prediction model as the online rate prediction model in the case that a preset condition is satisfied, comprises:
testing the preliminary prediction model through the test sample set to obtain a model evaluation index corresponding to the preliminary prediction model, wherein the model evaluation index comprises an accuracy rate, an average absolute error value and a mean square error value;
and taking the preliminary prediction model as the online rate prediction model under the condition that the model evaluation index corresponding to the preliminary prediction model meets the preset condition.
5. A method according to claim 3, comprising, prior to training the random forest model by the training sample set to obtain a preliminary predictive model:
acquiring an original online rate sample set, and performing data cleaning on the original online rate sample set to obtain a first sample set;
extracting preset fields from the first sample set, and determining corresponding terminal characteristic values to obtain a sample data set;
dividing the sample data set based on a preset proportion to obtain the training sample set and the test sample set.
6. The method of claim 5, wherein the performing data cleaning on the original online rate sample set to obtain a first sample set comprises:
extracting the identifiers of the terminal and the scheduling controller in the original online rate sample set to obtain a terminal identifier and a scheduling controller identifier;
and carrying out mapping relation matching on the scheduling controller and the terminal based on the terminal identification and the scheduling controller identification, and cleaning data without mapping relation to obtain a first sample set.
7. A method according to claim 3, wherein training the random forest model by the training sample set to obtain a preliminary prediction model comprises:
determining the first time period and the second time period corresponding to the prediction period based on a preset prediction period;
aiming at the training sample set, training the random forest model based on a prediction period by taking the data of the first time period of the preset period as input data and the data of the second time period of the preset period as a label to obtain a preliminary prediction model.
8. A risk prediction apparatus, comprising:
the data acquisition module is used for acquiring historical online rate data in a first time period, and preprocessing the historical online rate data to obtain target online rate data;
the data prediction module is used for predicting the input target online rate data in the first time period through a trained online rate prediction model to obtain online rate prediction data in a second time period, wherein the online rate prediction model is obtained by training a random forest model through a training sample set;
and the grade prediction module is used for determining the risk prediction grade of the second time period according to the online rate prediction data.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the risk prediction method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to perform the risk prediction method of any one of claims 1-7.
CN202310008226.5A 2023-01-04 2023-01-04 Risk prediction method, risk prediction device, electronic equipment and storage medium Pending CN116128296A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310008226.5A CN116128296A (en) 2023-01-04 2023-01-04 Risk prediction method, risk prediction device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310008226.5A CN116128296A (en) 2023-01-04 2023-01-04 Risk prediction method, risk prediction device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116128296A true CN116128296A (en) 2023-05-16

Family

ID=86300378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310008226.5A Pending CN116128296A (en) 2023-01-04 2023-01-04 Risk prediction method, risk prediction device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116128296A (en)

Similar Documents

Publication Publication Date Title
US20230186607A1 (en) Multi-task identification method, training method, electronic device, and storage medium
CN114841619A (en) State evaluation method and device for isolating switch, electronic equipment and medium
CN115346171A (en) Power transmission line monitoring method, device, equipment and storage medium
CN115794578A (en) Data management method, device, equipment and medium for power system
CN113904943B (en) Account detection method and device, electronic equipment and storage medium
CN116957539A (en) Cable state evaluation method, device, electronic equipment and storage medium
CN116937645A (en) Charging station cluster regulation potential evaluation method, device, equipment and medium
CN116755974A (en) Cloud computing platform operation and maintenance method and device, electronic equipment and storage medium
CN115563507A (en) Generation method, device and equipment for renewable energy power generation scene
CN116128296A (en) Risk prediction method, risk prediction device, electronic equipment and storage medium
CN115034927A (en) Data processing method and device, electronic equipment and storage medium
CN114996930A (en) Modeling method and device, electronic equipment and storage medium
CN113554062A (en) Training method, device and storage medium of multi-classification model
CN116186536A (en) Risk prediction method, risk prediction device, electronic equipment and storage medium
CN116186549B (en) Model training method, device, equipment and medium
CN116628167B (en) Response determination method and device, electronic equipment and storage medium
CN116627695B (en) Alarm event root cause recommendation method, device, equipment and storage medium
CN116431809A (en) Text labeling method, device and storage medium based on bank customer service scene
CN117194479A (en) Element parameter meaning determining method, device, equipment and storage medium
CN116017401A (en) Stay point determining method and device, electronic equipment and storage medium
CN117216633A (en) Parameter analysis model training and parameter analysis method, device, equipment and medium
CN115392399A (en) Method, device, equipment and medium for training and using process timeout prediction model
CN115564329A (en) Typical capacity scene determining method, device, equipment and storage medium
CN116822740A (en) Power distribution network operation and maintenance scheme determining method and device, electronic equipment and storage medium
CN117609862A (en) Power grid data anomaly level determination method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination