CN112614011A - Power distribution network material demand prediction method and device, storage medium and electronic equipment - Google Patents

Info

Publication number
CN112614011A
Authority
CN
China
Prior art keywords
data
distribution network
preprocessed
data set
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011419119.4A
Other languages
Chinese (zh)
Other versions
CN112614011B (en)
Inventor
陆斯悦
徐蕙
陈平
王艳松
李香龙
张禄
王培祎
盛慧慧
严嘉慧
马龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Beijing Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Beijing Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN202011419119.4A priority Critical patent/CN112614011B/en
Publication of CN112614011A publication Critical patent/CN112614011A/en
Application granted granted Critical
Publication of CN112614011B publication Critical patent/CN112614011B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Resources & Organizations (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Operations Research (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Supply And Distribution Of Alternating Current (AREA)

Abstract

The invention discloses a method and a device for predicting material demand of a power distribution network, a storage medium and electronic equipment. The method comprises the following steps: preprocessing the use recording parameters of historical distribution network materials of the target area, historical distribution network planning data and economic development data of the distribution network area to obtain a preprocessed data set of the distribution network materials of the target area; dividing the preprocessed data set to obtain preprocessed sub data sets; clustering the preprocessed sub data set to obtain a data cluster; inputting the data clustering cluster into a material estimation model to output estimated material usage required by operating a power distribution network construction material project in a target area; the material estimation model is a decision model for predicting the use amount of materials, which is obtained by carrying out multiple times of training by utilizing sample material data. The method solves the technical problem of large subjective judgment error caused by manual prediction of the traditional distribution network materials.

Description

Power distribution network material demand prediction method and device, storage medium and electronic equipment
Technical Field
The invention relates to the technical field of electric power material estimation, in particular to a method and a device for predicting material requirements of an electric power distribution network, a storage medium and electronic equipment.
Background
With the development of the Internet of Things, the construction and maintenance of the distribution network has become one of the important tasks of a power system. Distribution network projects are complex, and their implementation consumes a large amount of materials. By predicting distribution network material demand, the total demand for each kind of material in electric power construction projects over a future time period can be determined, so that materials can be purchased and allocated reasonably. This avoids the impact of material shortages on distribution network projects and, at the same time, reduces the risk of material backlog and the waste of funds caused by excessive purchasing.
The traditional method of determining distribution network material demand is manual: project reports are combined with on-site investigation. However, the prediction result is subjectively influenced by the decision maker, the error between the prediction result and the actual demand is often large, and the prediction efficiency is low.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for predicting the material demand of a power distribution network, a storage medium and electronic equipment, which at least solve the technical problem of large subjective judgment error caused by the traditional manual prediction of the material of the distribution network.
According to an aspect of the embodiment of the present invention, a method for predicting demand for materials of a power distribution network is provided, including: preprocessing the use recording parameters of historical distribution network materials of a target area, historical distribution network planning data and economic development data of the distribution network area to obtain a preprocessed data set of the distribution network materials of the target area; dividing the preprocessed data set to obtain preprocessed sub data sets; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested; clustering the preprocessed sub data set to obtain a data cluster; inputting the data clustering cluster into a material estimation model to output estimated material usage required by the target area to operate the power distribution network construction material project; the material estimation model is a decision model for predicting the use amount of materials, which is obtained by training sample material data for multiple times.
According to another aspect of the embodiments of the present invention, there is also provided a device for predicting demand for materials in a power distribution network, including: a preprocessing unit, configured to preprocess the usage record parameters of historical distribution network materials of a target area, historical distribution network planning data and economic development data of the distribution network area to obtain a preprocessed data set of the distribution network materials of the target area; a dividing unit, configured to divide the preprocessed data set to obtain preprocessed sub data sets, where the preprocessed sub data sets include attribute information of a power distribution network construction material project to be tested; a clustering unit, configured to cluster the preprocessed sub data sets to obtain data clusters; and an estimation unit, configured to input the data clusters into a material estimation model so as to output the estimated material usage required to operate the power distribution network construction material project in the target area; the material estimation model is a decision model for predicting material usage, obtained by training on sample material data multiple times.
According to another aspect of the embodiment of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above method for predicting the material demand of the power distribution network when running.
According to another aspect of the embodiment of the present invention, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the above method for predicting demand of materials in a power distribution network through the computer program.
In the embodiment of the invention, the usage record parameters of historical distribution network materials of a target area, historical distribution network planning data and economic development data of the distribution network area are preprocessed to obtain a preprocessed data set of the distribution network materials of the target area; the preprocessed data set is divided to obtain preprocessed sub data sets, which include attribute information of a power distribution network construction material project to be tested; the preprocessed sub data sets are clustered to obtain data clusters; and the data clusters are input into a material estimation model to output the estimated material usage required to operate the power distribution network construction material project in the target area. This reduces the error between the distribution network material prediction result and the actual demand, thereby achieving the technical effects of improving the accuracy of distribution network material prediction results, shortening prediction time and improving the working efficiency of material management departments, and solving the technical problem of large subjective judgment errors caused by traditional manual prediction of distribution network materials.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
Fig. 1 is a schematic diagram of an application environment of an alternative power distribution network material demand prediction method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of an alternative method for predicting demand for materials of a power distribution network according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of another alternative power distribution network material demand prediction method according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an alternative power distribution network material demand prediction apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an alternative electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
According to an aspect of the embodiment of the present invention, a method for predicting demand of materials for a power distribution network is provided. As an optional implementation manner, the method may be, but is not limited to being, applied in the environment shown in Fig. 1.
In fig. 1, the terminal device 104 is responsible for human-computer interaction with the user 102, and the terminal device 104 includes a memory 106, a processor 108 and a display 110; terminal device 104 may interact with server 114 via network 112. Server 114 includes database 116 and processing engine 118; the terminal device 104 may send the usage record parameters of the historical distribution network materials, the historical distribution network planning data, and the economic development data of the distribution network region to the server 114 through the network 112, and the server 114 outputs the estimated material usage amount required by the power distribution network construction material project operated in the target region, and sends the estimated material usage amount to the terminal device 104.
Optionally, in this embodiment, the terminal device 104 may be a terminal device configured with a target client, and may include but is not limited to at least one of the following: mobile phones (such as Android phones, iOS phones, etc.), notebook computers, tablet computers, palm computers, MID (Mobile Internet Devices), PAD, desktop computers, smart televisions, etc. The target client may be a video client, an instant messaging client, a browser client, an educational client, etc. The network 112 may include, but is not limited to, a wired network and a wireless network, where the wired network includes a local area network, a metropolitan area network, and a wide area network, and the wireless network includes Bluetooth, Wi-Fi, and other networks that enable wireless communication. The server 114 may be a single server, a server cluster composed of a plurality of servers, or a cloud server. The above is merely an example, and this is not limited in this embodiment.
The construction and maintenance of the distribution network is one of the important tasks of a power system. Distribution network projects are complex and consume a large amount of materials during implementation. The traditional method predicts distribution network material demand manually, but the error of the prediction result is large, so distribution network material demand prediction methods based on mathematical modeling have appeared in the related art. In these methods, historical data related to distribution network material demand is screened and preprocessed and then fed into a mathematical model for training, and the trained model is used for distribution network material demand prediction. However, these methods still have much room for improvement in practical applications. In terms of the prediction algorithm, a neural network can yield a prediction model with excellent performance, but it requires a large amount of preprocessed data and a long training time, which the data volume and application conditions of actual engineering may not allow; the Autoregressive Integrated Moving Average (ARIMA) model places high requirements on data quality and in essence can only fit a linear model, so it cannot capture well the nonlinear relationship between material demand and its influencing factors. In terms of data, these methods generally use only the material consumption data of the power grid company and do not consider other influencing factors, so their estimation accuracy for power distribution network material demand is low.
Based on the above technical problem, as an optional implementation manner, as shown in Fig. 2, the method for predicting demand for materials of a power distribution network includes:
S202, preprocessing the usage record parameters of historical distribution network materials of the target area, historical distribution network planning data and distribution network area economic development data to obtain a preprocessed data set of the distribution network materials of the target area;
S204, dividing the preprocessed data set to obtain preprocessed sub data sets; the preprocessed sub data sets include attribute information of a power distribution network construction material project to be tested;
S206, clustering the preprocessed sub data sets to obtain data clusters;
S208, inputting the data clusters into a material estimation model to output the estimated material usage required to operate the power distribution network construction material project in the target area; the material estimation model is a decision model for predicting material usage, obtained by training on sample material data multiple times.
In step S202, in practical application, the historical distribution network material usage record parameters of the power grid company in the target area and the historical power grid investment planning data may be used, without limitation, as model training data, and the main historical economic development indexes of the area covered by the distribution network are added as feature data. The usage of distribution network materials is directly related to power grid planning, and power grid planning is in turn influenced by local economic development and population. The power grid investment planning data, the economic development indexes and the population indexes are therefore important influencing factors for distribution network material demand prediction and are used as input data for it. Using the historical material usage records and the historical investment planning data as model training data gives the prediction a more complete data basis, and adding the main historical economic development indexes of the covered area as feature data makes the prediction result more accurate.
Here, the historical distribution network material usage record parameters, the historical power grid investment planning data, and the economic development data of the distribution network area may be in the form of a data table or an XML document, which is not limited herein. These three kinds of data are preprocessed to obtain a preprocessed data set of the distribution network materials of the target area. Obtaining the preprocessed data set includes sorting the historical distribution network material usage record parameters, the historical power grid investment planning data and the distribution network area economic development data by date to form a data summary table, which may be, but is not limited to, an Excel data table. Missing and abnormal data in the data summary table need to be checked and processed: missing data may be filled with the mean or with the median of the corresponding data, and data spikes in the data summary table are smoothed, which yields the preprocessed data set of the distribution network materials of the target area.
In step S204, in practical application, the preprocessed data set is divided to obtain preprocessed sub data sets, which include attribute information of a power distribution network construction material project to be tested. Here, the attributes of the power distribution network construction material project may be the project type, project name, project department, warehouse-out material type, warehouse-out material name, and the like of the power construction, and are not limited herein. For example, the preprocessed data set may be divided according to these attributes to obtain a plurality of Excel data tables whose headers are the project type, the project name, the project department, the warehouse-out material type, and the warehouse-out material name of the power construction.
In step S206, in practical application, the preprocessed sub data sets are clustered to obtain data clusters. The preprocessed sub data sets are, for example, the plurality of Excel data tables whose headers are the project type, the project name, the project department, the warehouse-out material type, and the warehouse-out material name of the power construction. The data items are clustered using the cosine distance between header items as the distance metric to form a plurality of clusters, and the data clusters with the highest similarity, for example in the attributes of the power distribution network construction material project, are obtained.
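The cosine-distance clustering in step S206 can be illustrated with a minimal sketch. The text does not name a specific clustering algorithm or say how header items are turned into vectors, so everything below is an assumption made for illustration: the numeric vectors, the greedy "leader" grouping strategy, the function names and the `max_distance` threshold are all hypothetical.

```python
import math

def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b); defined as 1.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm

def cluster_by_cosine(vectors, max_distance=0.2):
    """Greedy grouping (an assumed strategy, not taken from the text).

    Each vector joins the first cluster whose leader (first member) lies
    within max_distance; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine_distance(vectors[members[0]], vec) <= max_distance:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Hypothetical numeric encodings of four sub data sets.
vectors = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.0), (0.0, 0.95, 0.05)]
clusters = cluster_by_cosine(vectors)
```

In this toy input the first two vectors point in nearly the same direction, as do the last two, so two clusters are formed.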
In step S208, in practical application, the material estimation model may be, but is not limited to, an eXtreme Gradient Boosting (XGBoost) prediction model. The XGBoost prediction model trains quickly, has low data quality requirements and captures nonlinear factors well; it can significantly reduce the complexity of data preprocessing and has clear advantages in practical engineering. The XGBoost prediction model is a tree ensemble model: the K CART trees in the model each predict on the same data set, and their prediction results are summed to give the final predicted value. Namely:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$$
where $\hat{y}_i$ is the prediction result corresponding to the data sample $x_i$, and $f_k$ is the prediction model of the $k$-th tree.
During training of the XGBoost prediction model, each tree model uses an objective function of the same form as the index for evaluating model precision:
$$Obj = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
where $y_i$ is the actual value for sample $x_i$; $l(y_i, \hat{y}_i)$ is the loss function, representing how precisely the model fits the training set; and $\Omega(f_k)$ is the complexity function. Too high a model complexity will result in overfitting.
The squared loss function can be used as the loss function in this embodiment, as follows:
$$l(y_i, \hat{y}_i) = (y_i - \hat{y}_i)^2$$
For a single CART tree in the XGBoost prediction model, the complexity function is defined as:
$$\Omega(f_k) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^2$$
where $T$ is the total number of leaf nodes of the tree, $\lVert \omega \rVert^2$ is the squared L2 norm of the vector of leaf node weights, and $\gamma$ and $\lambda$ are adjustable penalty coefficients used to adjust the complexity of the XGBoost prediction model.
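To make the roles of the summed tree outputs, the squared loss and the complexity penalty concrete, the following is a small numeric sketch. It assumes the standard XGBoost form of the penalty, gamma * T + 0.5 * lambda * ||w||^2, and uses two hand-written decision stumps in place of trained CART trees; the stumps, sample values and function names are hypothetical and are not taken from the patent.

```python
def ensemble_predict(trees, x):
    """Final prediction: the sum of the outputs of all K trees for sample x."""
    return sum(tree(x) for tree in trees)

def complexity(leaf_weights, gamma, lam):
    """Penalty for one tree: gamma * T + 0.5 * lam * (squared L2 norm of weights)."""
    t = len(leaf_weights)  # T, the number of leaf nodes
    return gamma * t + 0.5 * lam * sum(w * w for w in leaf_weights)

def objective(y_true, y_pred, trees_leaf_weights, gamma, lam):
    """Squared loss over all samples plus the complexity of every tree."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    penalty = sum(complexity(w, gamma, lam) for w in trees_leaf_weights)
    return loss + penalty

# Two hypothetical stumps standing in for trained CART trees.
trees = [
    lambda x: 2.0 if x[0] < 5 else 4.0,
    lambda x: 0.5 if x[1] < 1 else -0.5,
]
leaf_weights = [[2.0, 4.0], [0.5, -0.5]]  # leaf values of the two stumps

samples = [(3, 0), (7, 2)]
targets = [2.0, 4.0]
predictions = [ensemble_predict(trees, x) for x in samples]
obj_value = objective(targets, predictions, leaf_weights, gamma=1.0, lam=2.0)
```

Raising `gamma` or `lam` increases the penalty for the same trees, which is how the two coefficients discourage overly complex models.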
Optionally, but without limitation, the preprocessed data set is used as the training set; all the divided preprocessed sub data sets include characteristic information such as complete time information, scheduling information for power material subclasses, and various regional economic development indexes, and the XGBoost prediction model is trained on them as the material demand prediction model. The model parameters are tuned by grid search, a method commonly used in machine learning, adjusting the parameters item by item until the optimal parameter combination for the electric power materials is determined.
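One way to read "adjusting parameters item by item" is a coordinate-wise grid search: each parameter is swept over its candidate values while the others stay at their current best. The sketch below follows that reading under stated assumptions. The `score` function is a toy stand-in for "train the model and measure validation error", and the candidate values are hypothetical; only the parameter names `max_depth` and `learning_rate` correspond to real XGBoost parameters.

```python
def tune_item_by_item(grid, start, score):
    """Sweep one parameter at a time, keeping the best value found so far.

    grid  : dict mapping parameter name to a list of candidate values
    start : dict with an initial value for every parameter
    score : callable(params) -> error to minimise (lower is better)
    """
    best = dict(start)
    best_score = score(best)
    for name, candidates in grid.items():
        for value in candidates:
            trial = {**best, name: value}
            trial_score = score(trial)
            if trial_score < best_score:
                best, best_score = trial, trial_score
    return best, best_score

def toy_score(params):
    # Hypothetical error surface with its minimum at max_depth=4, learning_rate=0.1.
    return (params["max_depth"] - 4) ** 2 + 100 * (params["learning_rate"] - 0.1) ** 2

grid = {"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}
start = {"max_depth": 2, "learning_rate": 0.3}
best_params, best_error = tune_item_by_item(grid, start, toy_score)
```

A full grid search would instead evaluate every combination; the item-by-item variant needs fewer training runs but can miss interactions between parameters.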
In addition, the performance of the XGBoost prediction model can be evaluated. When model training reaches the optimum or the set requirement, the model parameters with the best objective function value are taken as the final result; otherwise, the model parameters are adjusted and the model is retrained. After the XGBoost prediction model has been trained successfully, all parameters are stored in a server database as a backup, and the model can be called by a web terminal or other applications. After the XGBoost prediction model has been running for a set period, a rolling-updated training data set is read from the server database and the regression prediction model is retrained, so as to keep the XGBoost prediction model up to date.
Based on the historical demand for distribution network materials and the power grid planning data of past years, combined with the main economic development indexes of the distribution network coverage area in past years as features, the XGBoost ensemble learning algorithm is used for parallel training to establish a high-precision power distribution network material demand prediction model. The model assists the power grid material department in material purchasing and material scheduling, reduces the risk of material shortage and material backlog, and saves enterprise funds.
In the embodiment of the invention, the usage record parameters of historical distribution network materials of a target area, historical distribution network planning data and economic development data of the distribution network area are preprocessed to obtain a preprocessed data set of the distribution network materials of the target area; the preprocessed data set is divided to obtain preprocessed sub data sets, which include attribute information of a power distribution network construction material project to be tested; the preprocessed sub data sets are clustered to obtain data clusters; and the data clusters are input into a material estimation model to output the estimated material usage required to operate the power distribution network construction material project in the target area. This reduces the error between the distribution network material prediction result and the actual demand, thereby achieving the technical effects of improving the accuracy of distribution network material prediction results, shortening prediction time and improving the working efficiency of material management departments, and solving the technical problem of large subjective judgment errors caused by traditional manual prediction of distribution network materials.
In one embodiment, step S202 includes: sorting the obtained usage record parameters of historical distribution network materials, historical distribution network planning data and distribution network area economic development data by generation date, and summarizing them to obtain a distribution network material data summary table; and processing abnormal data in the distribution network material data summary table according to a preset data processing method to obtain a preprocessed data set. Here, the date granularity may be freely defined as days, months, years, etc. according to demand, and is not limited herein.
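The sorting and summarizing step can be sketched as follows, assuming each of the three data sources is a list of records keyed by a date. The column names (`cable_m`, `investment`, `gdp_index`), the sample values and the function name are hypothetical and serve only to show the merge.

```python
from collections import defaultdict
from datetime import date

def build_summary_table(*sources):
    """Merge several record lists into one date-ordered summary table.

    Each source is a list of dicts that all carry a "date" key; the other
    keys are data columns. Rows sharing a date are merged into one row, and
    a column that has no value for a date is left as None.
    """
    merged = defaultdict(dict)
    for source in sources:
        for row in sorted(source, key=lambda r: r["date"]):
            merged[row["date"]].update(
                {k: v for k, v in row.items() if k != "date"}
            )
    columns = sorted({k for fields in merged.values() for k in fields})
    return [
        {"date": d, **{c: merged[d].get(c) for c in columns}}
        for d in sorted(merged)
    ]

# Hypothetical monthly records from the three data sources.
usage = [{"date": date(2020, 2, 1), "cable_m": 120.0},
         {"date": date(2020, 1, 1), "cable_m": 90.0}]
planning = [{"date": date(2020, 1, 1), "investment": 5.0}]
economy = [{"date": date(2020, 1, 1), "gdp_index": 101.2},
           {"date": date(2020, 2, 1), "gdp_index": 101.9}]

summary = build_summary_table(usage, planning, economy)
```

The `None` entries left by the merge mark the missing data that the subsequent abnormal-data processing has to handle.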
In an embodiment, according to a preset data processing method, processing abnormal data in a distribution network material data summary table to obtain a preprocessed data set includes: under the condition that the abnormal data indicate missing data, filling an average value of non-missing data which is consistent with the data type of the missing data in an idle position where the missing data is located in a distribution network material data summary table to obtain a preprocessed data set; or under the condition that the abnormal data indicate missing data, filling the median of non-missing data which is consistent with the data type of the missing data in the idle position where the missing data is located in the distribution network material data summary table to obtain a preprocessed data set. By the means, the accuracy and the integrity of data acquisition can be improved.
In an embodiment, according to a preset data processing method, processing abnormal data in a distribution network material data summary table to obtain a pre-processing data set of distribution network materials includes: and under the condition that the abnormal data indicate missing data, performing zero filling processing in an idle position where the missing data are located in the distribution network material data summary table to obtain a preprocessed data set. By the means, the integrity of data acquisition can be improved.
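The three filling options described above (mean, median and zero filling) can be sketched as follows. The function name `fill_missing` and the representation of a missing cell as `None` are assumptions for illustration.

```python
from statistics import mean, median


def fill_missing(values, strategy="mean"):
    """Return a copy of values in which None entries are filled.

    strategy is "mean" or "median" (computed over the non-missing entries
    of the same column) or "zero". A column with no data at all falls back
    to zero filling.
    """
    present = [v for v in values if v is not None]
    if strategy == "zero" or not present:
        fill = 0
    elif strategy == "mean":
        fill = mean(present)
    elif strategy == "median":
        fill = median(present)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


column = [10, None, 14, 30, None]       # hypothetical column with two gaps
by_mean = fill_missing(column, "mean")      # gaps become 18
by_median = fill_missing(column, "median")  # gaps become 14
by_zero = fill_missing(column, "zero")      # gaps become 0
```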
In an embodiment, processing abnormal data in the distribution network material data summary table according to a preset data processing method to obtain a preprocessed data set includes: when spikes occur in the warehouse entry and exit record data in the distribution network material data summary table, smoothing the warehouse entry and exit record data at the positions of the spikes to obtain the preprocessed data set; the usage record parameters of the historical distribution network materials include the warehouse entry and exit record data. Spikes in the warehouse entry and exit record data typically appear as abnormal values several orders of magnitude above the normal level; such spike data is very unfavorable for model training and needs to be processed.
Optionally, smoothing the warehouse entry and exit record data in which spikes occur in the distribution network material data summary table includes: calculating the difference of the daily warehouse entry and exit records in the summary table to obtain the daily net warehouse-out quantity; calculating the average value P of the spike values of the daily net warehouse-out quantity in the current natural month and the average value N of the normal daily net warehouse-out quantity in the current natural month; when P/N ≤ 1.5, replacing each spike value in the daily net warehouse-out quantity of the current natural month with the average value P; when P/N > 1.5, calculating the average value P' of the spike values of the daily net warehouse-out quantity in the quarter containing the current natural month and the average value N' of the normal daily net warehouse-out quantity in that quarter, and, when P'/N' ≤ 1.5, replacing each spike value in the daily net warehouse-out quantity of the current natural month with the average value P'; and when P'/N' > 1.5, deleting each spike value in the daily net warehouse-out quantity of the current natural month. By this means, the accuracy and the integrity of the acquired data can be improved.
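The month-then-quarter decision described above can be sketched as follows. The embodiment does not state how a value is flagged as a spike, so this sketch assumes the caller supplies the indices of the spike values; the function names are hypothetical, and at least one normal, non-zero value is assumed in each period.

```python
from statistics import mean

RATIO_LIMIT = 1.5  # threshold on P/N and P'/N' stated in the text


def split(values, spike_idx):
    """Separate values into (spike values, normal values)."""
    spikes = [v for i, v in enumerate(values) if i in spike_idx]
    normal = [v for i, v in enumerate(values) if i not in spike_idx]
    return spikes, normal


def smooth_month(month, month_spikes, quarter, quarter_spikes):
    """Smooth flagged spikes in one month of daily net warehouse-out values.

    month_spikes and quarter_spikes are sets of indices flagged as spikes
    (the flagging rule itself is an assumption left to the caller).
    """
    if not month_spikes:
        return list(month)
    spikes, normal = split(month, month_spikes)
    p, n = mean(spikes), mean(normal)
    if p / n <= RATIO_LIMIT:
        fill = p                      # replace with monthly spike average P
    else:
        q_spikes, q_normal = split(quarter, quarter_spikes)
        p_q, n_q = mean(q_spikes), mean(q_normal)
        # replace with quarterly spike average P', or mark for deletion
        fill = p_q if p_q / n_q <= RATIO_LIMIT else None
    if fill is None:
        return [v for i, v in enumerate(month) if i not in month_spikes]
    return [fill if i in month_spikes else v for i, v in enumerate(month)]


# P = 15, N = 10, P/N = 1.5: both spikes are replaced by 15.
mild = smooth_month([10, 16, 10, 14, 10], {1, 3}, [], set())
# P/N = 50 and P'/N' = 25.65: the spike is deleted.
severe = smooth_month([10, 500, 10], {1}, [10, 500, 10, 10, 13, 10], {1, 4})
```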
In an embodiment, step S206 includes: converting the attribute information of the power distribution network construction material project to be tested in the preprocessed data set into vectors by one-hot encoding, where the attribute information comprises a project type, a project name and a project department; taking the vectorized attribute information as the data entry features in the preprocessed data set; obtaining a cosine distance matrix between every two data entry features; and clustering the data entries corresponding to the data entry features based on the cosine distance matrix to obtain the data clusters corresponding to the data entry features, where each data cluster comprises a set of the data entries corresponding to the data entry features.
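As a minimal sketch of the vectorization and distance computation in step S206, the following encodes three categorical attributes with one-hot encoding and builds the pairwise cosine distance matrix. The record fields and values are hypothetical, and cosine distance is taken here as one minus cosine similarity.

```python
import math


def one_hot_rows(records, fields):
    """One-hot encode the chosen categorical fields of every record."""
    vocab = sorted({(f, r[f]) for r in records for f in fields})
    index = {pair: i for i, pair in enumerate(vocab)}
    rows = []
    for r in records:
        vec = [0.0] * len(vocab)
        for f in fields:
            vec[index[(f, r[f])]] = 1.0
        rows.append(vec)
    return rows


def cosine_distance(u, v):
    """1 - cosine similarity, clamped at 0 to absorb rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return max(0.0, 1.0 - dot / norm)


def distance_matrix(rows):
    """Pairwise cosine distances between all encoded entries."""
    return [[cosine_distance(u, v) for v in rows] for u in rows]


records = [
    {"type": "new build", "name": "feeder A", "dept": "north"},
    {"type": "new build", "name": "feeder A", "dept": "north"},
    {"type": "overhaul", "name": "substation B", "dept": "south"},
    {"type": "new build", "name": "feeder C", "dept": "north"},
]
dist = distance_matrix(one_hot_rows(records, ["type", "name", "dept"]))
```

Entries that share all three attributes are at distance 0, entries that share none are at distance 1, and entries that share two of three are at distance 1/3.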
In an embodiment, before step S208, the method further includes: acquiring a plurality of historical sample usage amounts generated in a target time period, where the historical sample usage amounts comprise historical usage record parameters of distribution network materials, distribution network historical planning data and distribution network area economic development data; and training the initialized material estimation model with the plurality of historical sample usage amounts until a material estimation model whose training result reaches the convergence condition is obtained.
In one embodiment, step S208 includes: constructing a characteristic information table with data items as units according to the material types to be predicted and the corresponding time of the material types; dividing the characteristic information table to obtain a preprocessed subdata set; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested; clustering the preprocessed sub data set to obtain a data cluster; and inputting the data clustering cluster into a material estimation model to output the material usage corresponding to the characteristic information table.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Based on the above embodiment, in an application embodiment, as shown in fig. 3, the method for predicting demand for materials of a power distribution network may include the following steps:
Step S302, data access; importing the usage history of the distribution network materials (warehouse entry and exit records), the power grid planning data and the regional economic development indexes from the data sources into a server database as historical data. The distribution network material usage history and the power grid planning data are provided by the Enterprise Resource Planning (ERP) system of the power grid company, and the regional economic development indexes are collected from a third-party regional development data source.
Step S304, preprocessing data; (1) sorting the historical data items in the target area by date and then matching them one by one to form a data summary table $D_{ori}$. The date span is freely defined as daily, monthly, yearly, etc. according to the requirements. The actual warehouse-out quantity of the materials on each date is used as the quantity information of the needed materials, and the remaining items are used as characteristic information.
For the data missing and abnormal conditions existing in the table, the checking and processing are required, and the processing steps are as follows:
Firstly, for large-area continuous data loss in the table, the actual material transfer situation of the power grid company in the corresponding time period needs to be investigated to judge whether the data is genuinely lost or whether there is simply no record because no material was transferred in that period. For genuine data loss, the data should be recovered as far as possible and the missing part completed; if large-area data loss still exists after completion, the rows with missing data are deleted to avoid adverse effects on model training. For the case of no record, zero filling is performed.
Secondly, for small-range, scattered data loss in the table, common data filling methods (such as mean filling and median filling) can be used to process the missing data. Experience shows that the total warehouse-out quantity of materials of the power grid company has a certain regularity over time, so in practice the gaps can be filled according to the material transfers before and after them.
Thirdly, spikes may exist in the warehouse entry and exit record data in the table, typically in the form of abnormal values several orders of magnitude above the normal level; spike data is very unfavorable for model training and needs to be processed. The invention smooths the spike data in the warehouse entry and exit records as follows:
1. calculating the difference of the warehouse entry and exit records in the table to obtain the daily net warehouse-out quantity;
2. taking the month as the unit, calculating the average value $peak_{ave}$ of the peak values of the daily net warehouse-out quantity in the month and the average value $norm_{ave}$ of the normal daily net warehouse-out quantity in the month;
3. when $peak_{ave}/norm_{ave} \le 1.5$, filling each peak value in the daily net warehouse-out quantity of the month with $peak_{ave}$; when $peak_{ave}/norm_{ave} > 1.5$, recalculating the average value $peak_{ave}$ of the peak data of the daily net warehouse-out quantity in the quarter containing the month and the average value $norm_{ave}$ of the other, normal daily net warehouse-out quantities in that quarter, and making the judgment again.
4. If the peak data cannot be eliminated by steps 2 and 3, other common methods can be used for filling, or the data can be deleted directly, to reduce interference with model training. The processed data summary table does not overwrite the original and is stored separately as a new data summary table $D_{tre}$.
(3) The XGBoost algorithm takes the CART tree as its basic unit and does not require input data to be normalized or standardized, but non-numeric features such as text must be vectorized. The invention uses one-hot encoding to convert the non-numeric features in the data summary table $D_{tre}$, such as the item type and the item name, into vectors.
Step S306, in this embodiment, the data set is divided according to the following steps:
(1) using the vectorized keywords such as project type, project name and project department, together with the attribute information of the power distribution network construction material project, as the features of each data entry, and calculating the cosine distance matrix $L_{tre}$ between the data entries;
(2) clustering the data entries with the DBSCAN algorithm, using $L_{tre}$ as the distance metric, to form a plurality of clusters, and recording the cluster center coordinates $C_1, C_2, C_3, \dots$. Data entries within the same cluster are placed into data sets $X_1, X_2, X_3, \dots$, and the corresponding labels are added.
(3) Through the above steps, $D_{tre}$ is partitioned set by set to form $D_{tra}$, in the form $D_{tra}=\{X_1, X_2, X_3, \dots, X_m\}$, and the data entries in each data set have the most similar information such as item names and item types.
(4) Each data set is then keyed by material subclass, and each $X_i$ in $D_{tra}$ is further divided to form $X_{i1}, X_{i2}, \dots, X_{is}$ ($i = 1, \dots, m$). Finally, the original data set is divided into $D_{tra}=\{[X_{11}, X_{12}, \dots, X_{1s}], [X_{21}, X_{22}, \dots, X_{2s}], [X_{31}, X_{32}, \dots, X_{3s}], \dots, [X_{m1}, \dots, X_{ms}]\}$.
A regression prediction model $M_{ij}$ is trained separately on each data subset $X_{ij}$, which realizes parallel processing of the large data set and strengthens the prediction precision of the model for a specific material subclass. The data set $D_{tra}$ is stored in a server database and is updated periodically on a rolling basis.
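The two-level division of step S306 can be sketched as follows: a small density-based clustering over a precomputed distance matrix, followed by a split of each cluster by material subclass. The parameter values `eps` and `min_pts`, the toy distance matrix and the choice to drop noise points are assumptions for illustration; the embodiment does not specify them.

```python
def dbscan(dist, eps, min_pts):
    """Label points from a precomputed distance matrix; -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = -1            # provisional noise, may become border
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster   # noise reached from a core point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)
    return labels


def partition(entries, labels):
    """Group entries as {cluster label: {material subclass: [entries]}}."""
    groups = {}
    for entry, label in zip(entries, labels):
        if label == -1:
            continue                  # assumption: noise entries are dropped
        groups.setdefault(label, {}).setdefault(entry["subclass"], []).append(entry)
    return groups


# Toy cosine distance matrix: entries 0-1 are close, entries 2-3 are close.
dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
entries = [
    {"id": 1, "subclass": "cable"},
    {"id": 2, "subclass": "transformer"},
    {"id": 3, "subclass": "cable"},
    {"id": 4, "subclass": "cable"},
]
labels = dbscan(dist, eps=0.2, min_pts=2)
groups = partition(entries, labels)
```

Each inner list of `groups` plays the role of one subset on which a separate regression model would be trained.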
Step S308, training a model; in the invention, $D_{tra}$ is used as the training set. Each divided data subset $X_{ij}$ contains complete characteristic information such as time, material subclass scheduling information and the economic development indexes of the various areas, and the XGBoost algorithm is used to train the material demand prediction model. Model parameters are tuned with the grid search approach common in machine learning, adjusting the parameters item by item to finally determine the optimal parameter combination.
Step S310, judging whether the requirements are met; when the material demand prediction model reaches the optimal or set requirement, the model parameters with the optimal objective function value are taken as the final result; otherwise, the procedure returns to step S308 to adjust the parameters and retrain the material demand prediction model. After the material demand prediction model is trained successfully, all parameters are stored in a server database as a backup, and the model can be called by a web end or other applications. After a period of regular operation, the training data set $D_{tra}$, updated on a rolling basis, is read from the server database and the regression prediction model is retrained to keep the model up to date.
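The tune-and-check loop of steps S308 and S310 can be sketched as a plain grid search. This sketch searches the grid exhaustively rather than item by item, and `train_and_score` is a hypothetical stand-in for training a model and returning its objective value (lower is better); the parameter names and the toy score are illustrative only.

```python
from itertools import product


def grid_search(param_grid, train_and_score):
    """Try every parameter combination and keep the lowest score."""
    best_params, best_score = None, float("inf")
    names = sorted(param_grid)
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = train_and_score(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in objective with its minimum at max_depth=4, learning_rate=0.1."""
    return (params["max_depth"] - 4) ** 2 + 100 * (params["learning_rate"] - 0.1) ** 2


grid = {"max_depth": [2, 4, 6], "learning_rate": [0.05, 0.1, 0.3]}
best_params, best_score = grid_search(grid, toy_score)
```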
Step S312, distribution network material demand prediction; (1) constructing a complete characteristic information table, with data entries as units, according to the types of materials to be predicted and the corresponding times; (2) preprocessing the characteristic information table in the manner of step S304 to form $D_{tes}$, and calculating the cosine distance table $L_{tes}$ between each data entry in $D_{tes}$ and each cluster center $C_i$ in $D_{tra}$; (3) classifying the characteristic information according to the maximum cosine distance principle, and sending it to the corresponding material subclass prediction model $M_{ij}$ for prediction. For a specific material subclass $a$, the demand prediction result is the sum of the prediction results of the prediction models $[M_{1a}, M_{2a}, \dots, M_{ma}]$ corresponding to that material subclass.
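The routing and summation of step S312 can be sketched as follows. This sketch assumes that the classification rule means assigning each entry to its most similar cluster center, that is, the center with the largest cosine similarity and therefore the smallest value of one minus cosine similarity; that reading is an assumption. The constant-valued models stand in for the trained models $M_{ij}$ and are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def predict_demand(entries, centers, models):
    """Route each entry to its nearest cluster and sum demand per subclass.

    entries: list of (feature_vector, material_subclass)
    centers: list of cluster center vectors C_i
    models:  {(cluster_index, material_subclass): callable(features) -> float}
    """
    totals = {}
    for features, subclass in entries:
        i = min(range(len(centers)),
                key=lambda c: cosine_distance(features, centers[c]))
        prediction = models[(i, subclass)](features)
        totals[subclass] = totals.get(subclass, 0.0) + prediction
    return totals


centers = [[1.0, 0.0], [0.0, 1.0]]
models = {
    (0, "cable"): lambda x: 5.0,   # stand-in for M_1a
    (1, "cable"): lambda x: 7.0,   # stand-in for M_2a
}
entries = [([0.9, 0.1], "cable"), ([0.2, 0.8], "cable")]
totals = predict_demand(entries, centers, models)
```

Here the first entry is routed to cluster 0 and the second to cluster 1, so the demand for the subclass is the sum of both model outputs.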
In the embodiment of the invention, the use record parameters of historical distribution network materials of a target area, historical distribution network planning data and economic development data of the distribution network area are preprocessed to obtain a preprocessed data set of the distribution network materials of the target area; dividing the preprocessed data set to obtain preprocessed sub data sets; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested; clustering the preprocessed sub data set to obtain a data cluster; the data clustering is input into a material prediction model to output a prediction material usage mode required by the power distribution network construction material project operated in the target area, and the purpose of reducing errors between a distribution network material prediction result and the actual distribution network material prediction result is achieved, so that the technical effects of improving the accuracy of the distribution network material prediction result, shortening the prediction time and improving the working efficiency of a material management department are achieved, and the technical problem of large subjective judgment errors caused by traditional distribution network material manual prediction is solved.
According to another aspect of the embodiment of the invention, a power distribution network material demand prediction device for implementing the power distribution network material demand prediction method is further provided. As shown in fig. 4, the apparatus includes:
the preprocessing unit 402 is configured to preprocess usage recording parameters of historical distribution network materials of the target area, distribution network historical planning data, and distribution network area economic development data to obtain a preprocessing data set of distribution network materials of the target area;
a dividing unit 404, configured to divide the preprocessed data set to obtain preprocessed sub-data sets; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
a clustering unit 406, configured to cluster the preprocessed sub data sets to obtain data cluster clusters;
the estimation unit 408 is used for inputting the data clustering cluster into a material estimation model to output estimated material usage required by the power distribution network construction material project operated in the target area; the material estimation model is a decision model for predicting the use amount of materials, which is obtained by carrying out multiple times of training by utilizing sample material data.
In the embodiment of the invention, the historical distribution network material usage record parameters and the historical power grid investment planning data of the power grid company in the target area can be used as model training data, and the historical main economic development indexes of the area covered by the distribution network are added as characteristic data, so that the material prediction result is more accurate. The usage of distribution network materials is directly related to power grid planning, and power grid planning is in turn influenced by local economic development and population. The investment planning data of the power grid, the economic development indexes and the population indexes are therefore important influence factors for distribution network material demand prediction and are used as its input data. Using the historical material usage records and the historical investment planning data as model training data makes the data available for prediction more complete, and adding the historical main economic development indexes of the covered area as characteristic data makes the prediction result more accurate.
Here, the historical distribution network material usage record parameters, the historical investment planning data of the power grid and the economic development data of the distribution network area may be in the form of a data table or an XML document, which is not limited herein. These three kinds of data are preprocessed to obtain a preprocessed data set of the distribution network materials of the target area. Obtaining the preprocessed data set includes sorting the historical distribution network material usage record parameters, the historical investment planning data of the power grid and the economic development data of the distribution network area by date to form a data summary table, which may include but is not limited to an Excel data table. Missing and abnormal data in the data summary table need to be checked and processed: missing data can be filled at its position with the mean or the median, and spikes in the data summary table are smoothed, to obtain the preprocessed data set of the distribution network materials of the target area.
In the embodiment of the invention, the preprocessed data set is divided to obtain preprocessed sub data sets; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested. Here, the attributes of the power distribution network construction material project may be the project type, project name, project department, warehouse-out material type, warehouse-out material name and the like of the power construction, and are not limited herein. That is, for example, the preprocessed data set may be divided according to these attributes to obtain a plurality of Excel data tables whose headers are the project type, project name, project department, warehouse-out material type and warehouse-out material name of the power construction.
In the embodiment of the invention, the preprocessed sub data sets are clustered to obtain data clusters. For example, the header fields of the plurality of Excel data tables (project type, project name, project department, warehouse-out material type and warehouse-out material name of the power construction) are used as the distance metric, and the data entries are clustered to form a plurality of clusters, so that the data entries in each data cluster have the highest similarity in attributes of the power distribution network construction material project.
In the embodiment of the invention, the material estimation model may use, but is not limited to, an eXtreme Gradient Boosting (XGBoost) prediction model. The XGBoost prediction model trains quickly, has low data quality requirements and captures nonlinear factors well, so it can significantly reduce the complexity of data preprocessing and has obvious advantages in practical engineering. XGBoost is a tree ensemble model: the K CART trees in the model each predict the same data entry, and the prediction results are summed as the final prediction value. Namely:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i)$$

where $\hat{y}_i$ is the prediction result corresponding to the data entry $x_i$, and $f_k$ is the prediction model of the $k$-th tree.

In the XGBoost model training process, each tree model adopts an objective function of the same form as the model precision evaluation index:

$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$

where $n$ is the number of training entries; $l(y_i, \hat{y}_i)$ is the loss function, which expresses the fitting accuracy of the model to the training set; and $\Omega(f_k)$ is the complexity function. Too high a model complexity will result in overfitting.

The square loss function can be used as the loss function in this embodiment, as follows:

$$l(y_i, \hat{y}_i) = (y_i - \hat{y}_i)^2$$

For a single CART tree in the XGBoost model, the complexity function is defined as:

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^2$$

where $T$ is the total number of leaf nodes of the tree and $\lVert \omega \rVert^2$ is the squared L2 norm of the leaf node weight vector. $\gamma$ and $\lambda$ are adjustable penalty coefficients used to adjust the complexity of the XGBoost model.
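As a worked illustration of the quantities defined above, the following sketch evaluates the objective for two hand-written trees. It assumes the standard XGBoost forms: the ensemble prediction is the sum of the tree outputs, the loss is the squared error, and the complexity of one tree is $\gamma T + \frac{1}{2}\lambda\lVert\omega\rVert^2$. The trees, data and coefficient values are hypothetical.

```python
def ensemble_predict(trees, x):
    """Final prediction: the sum of the outputs of all K trees."""
    return sum(tree(x) for tree in trees)


def complexity(leaf_weights, gamma, lam):
    """gamma * T + 0.5 * lambda * ||w||^2 for one tree with T leaves."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(trees, leaf_weights_per_tree, data, gamma, lam):
    """Squared loss over the data plus the complexity of every tree."""
    loss = sum((y - ensemble_predict(trees, x)) ** 2 for x, y in data)
    penalty = sum(complexity(w, gamma, lam) for w in leaf_weights_per_tree)
    return loss + penalty


# Two depth-1 trees (stumps), each with two leaves.
trees = [
    lambda x: 1.0 if x < 5 else 3.0,
    lambda x: 0.5 if x < 2 else -0.5,
]
leaf_weights_per_tree = [[1.0, 3.0], [0.5, -0.5]]
data = [(1, 2.0), (6, 2.0)]  # (feature, true demand)

# Predictions are 1.5 and 2.5, so the loss is 0.25 + 0.25 = 0.5.
# Complexity is 12.0 for the first tree and 2.5 for the second.
value = objective(trees, leaf_weights_per_tree, data, gamma=1.0, lam=2.0)
```

With these numbers the objective is 0.5 + 14.5 = 15.0, which shows how the penalty terms can dominate the loss when the leaf weights are large.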
Optionally, but not limited to, the preprocessing data set is used as a training set, all the partitioned preprocessing data subsets include complete time, power material subclass scheduling information, various regional economic development indexes and other characteristic information, and the XGBoost algorithm is used for training the material demand prediction model. And adjusting model parameters by adopting a common grid searching mode of machine learning, adjusting parameters item by item, and finally determining the optimal parameter combination of the electric power materials.
In addition, the performance of the XGBoost model may also be judged. When the model training reaches the optimal or set requirement, taking the model parameter with the optimal objective function value as the final result; otherwise, the model parameters are adjusted and retrained. After the XGboost model is trained successfully, all parameters are stored in a server database to be used as backups, and the XGboost model can be called and used by a web terminal or other applications. After the XGboost model is operated for a period of time regularly, a training data set updated in a rolling mode is read from a server database, and the regression prediction model is retrained so as to guarantee the real-time performance of the XGboost model calculation data.
According to the historical demand condition of the distribution network materials and the planning data of the power grid in the past year, the main economic development indexes of the distribution network coverage area in the past year are combined to serve as characteristics, parallel training is carried out by adopting an XGboost integrated learning algorithm, a high-precision power distribution network material demand prediction model is established, the power grid material department is assisted to carry out material purchasing and material scheduling work, the risk of material shortage and material backlog is reduced, and enterprise funds are saved.
According to another aspect of the embodiment of the present invention, there is further provided an electronic device for implementing the method for predicting demand of materials for power distribution networks, where the electronic device may be the terminal device 104 or the server 114 shown in fig. 1. As shown in fig. 5, the electronic device comprises a memory 502 and a processor 504, the memory 502 having stored therein a computer program, the processor 504 being arranged to perform the steps of any of the above-described method embodiments by means of the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
s1, preprocessing the use record parameters of the historical distribution network materials of the target area, the historical distribution network planning data and the economic development data of the distribution network area to obtain a preprocessed data set of the distribution network materials of the target area;
s2, dividing the preprocessed data set to obtain preprocessed sub data sets; the preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
s3, clustering the preprocessed sub data sets to obtain data cluster clusters;
s4, inputting the data clustering cluster into a material estimation model to output estimated material usage required by the power distribution network construction material project operated in the target area; the material estimation model is a decision model for predicting the use amount of materials, which is obtained by carrying out multiple times of training by utilizing sample material data.
Optionally, those skilled in the art will understand that the structure shown in fig. 5 is only illustrative, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID) or a PAD. Fig. 5 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces, etc.) than shown in fig. 5, or have a different configuration from that shown in fig. 5.
The memory 502 may be used to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for predicting the material demand of the power distribution network in the embodiment of the present invention, and the processor 504 executes various functional applications and data processing by running the software programs and modules stored in the memory 502, that is, the method for predicting the material demand of the power distribution network is implemented. The memory 502 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 502 may further include memory located remotely from the processor 504, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 502 may be specifically, but not limited to, used for storing information such as attribute information of a virtual power distribution network construction material project. As an example, as shown in fig. 5, the memory 502 may include, but is not limited to, the preprocessing unit 402, the dividing unit 404, the clustering unit 406, and the predicting unit 408 of the power distribution network material demand predicting device. In addition, the device may further include, but is not limited to, other module units in the foregoing power distribution network material demand prediction apparatus, which is not described in this example again.
Optionally, the transmission device 506 is used for receiving or sending data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 506 includes a Network Interface Controller (NIC) that can be connected to a router and other network devices via a network cable to communicate with the internet or a local area network. In one example, the transmission device 506 is a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
In addition, the electronic device further includes: a display 508 for displaying the attribute information of the power distribution network construction material project; and a connection bus 510 for connecting the respective module parts in the above-described electronic apparatus.
In other embodiments, the terminal device or the server may be a node in a distributed system, where the distributed system may be a blockchain system, and the blockchain system may be a distributed system formed by connecting a plurality of nodes through a network communication. Nodes can form a Peer-To-Peer (P2P, Peer To Peer) network, and any type of computing device, such as a server, a terminal, and other electronic devices, can become a node in the blockchain system by joining the Peer-To-Peer network.
According to a further aspect of an embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Optionally, in the present embodiment, the above-mentioned computer-readable storage medium may be configured to store a computer program for executing the following steps:
S1, preprocessing the usage record parameters of historical distribution network materials of a target area, the historical distribution network planning data, and the distribution network regional economic development data to obtain a preprocessed data set of the distribution network materials of the target area;
S2, dividing the preprocessed data set to obtain preprocessed sub data sets, where each preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
S3, clustering the preprocessed sub data sets to obtain data clusters;
S4, inputting the data clusters into a material estimation model to output the estimated material usage required to run the power distribution network construction material project in the target area, where the material estimation model is a decision model for predicting material usage, obtained through multiple rounds of training on sample material data.
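As a rough illustration of the preprocessing in the first step above, the following Python sketch fills missing values with the mean, the median, or zero, and smooths peaks in a record series. The function names `fill_missing` and `smooth_spikes`, the neighbour-median smoothing rule, and the `window` and `threshold` parameters are assumptions made for illustration only; the text does not specify a particular smoothing method.

```python
from statistics import mean, median


def fill_missing(values, strategy="mean"):
    """Replace None entries with the mean, the median, or zero.

    The fill value is computed from the non-missing entries of the same
    column (hypothetical helper, not taken from the patent text).
    """
    if strategy not in {"mean", "median", "zero"}:
        raise ValueError(f"unknown strategy: {strategy}")
    present = [v for v in values if v is not None]
    if strategy == "zero" or not present:
        fill = 0.0
    elif strategy == "mean":
        fill = mean(present)
    else:
        fill = median(present)
    return [fill if v is None else v for v in values]


def smooth_spikes(values, window=1, threshold=3.0):
    """Replace a peak with the median of its neighbours.

    A value counts as a peak when it exceeds `threshold` times the median
    of up to `window` neighbours on each side (assumed rule). Neighbours
    are always read from the original series, not the smoothed one.
    """
    smoothed = list(values)
    for i, v in enumerate(values):
        neighbours = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        if not neighbours:
            continue
        reference = median(neighbours)
        if reference > 0 and v > threshold * reference:
            smoothed[i] = reference
    return smoothed
```

For instance, `fill_missing([1.0, None, 3.0])` fills the gap with the mean of the two known values, and `smooth_spikes([10, 12, 100, 11, 9])` replaces only the value 100, since it is the only entry far above its neighbours.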
Optionally, in this embodiment, a person skilled in the art will understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing one or more computer devices (which may be personal computers, servers, or network devices) to execute all or part of the steps of the methods according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function, and an actual implementation may divide them differently: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and these modifications and improvements should also be regarded as falling within the protection scope of the present invention.

Claims (11)

1. A method for predicting material demand of a power distribution network, characterized by comprising:
preprocessing usage record parameters of historical distribution network materials of a target area, historical distribution network planning data, and distribution network regional economic development data to obtain a preprocessed data set of the distribution network materials of the target area;
dividing the preprocessed data set to obtain preprocessed sub data sets; each preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
clustering the preprocessed sub data sets to obtain data clusters;
inputting the data clusters into a material estimation model to output the estimated material usage required to run the power distribution network construction material project in the target area; the material estimation model is a decision model for predicting material usage, obtained through multiple rounds of training on sample material data.
2. The method of claim 1, wherein preprocessing the usage record parameters of the historical distribution network materials of the target area, the historical distribution network planning data, and the distribution network regional economic development data to obtain the preprocessed data set of the distribution network materials of the target area comprises:
sorting the obtained usage record parameters of the historical distribution network materials, the historical distribution network planning data, and the distribution network regional economic development data by generation date, respectively, and aggregating them to obtain a distribution network material data summary table;
processing abnormal data in the distribution network material data summary table according to a preset data processing method to obtain the preprocessed data set.
3. The method of claim 2, wherein processing the abnormal data in the distribution network material data summary table according to the preset data processing method to obtain the preprocessed data set comprises:
in the case that the abnormal data indicates missing data, filling the empty position of the missing data in the distribution network material data summary table with the average value of the non-missing data of the same data type as the missing data, to obtain the preprocessed data set; or
in the case that the abnormal data indicates missing data, filling the empty position of the missing data in the distribution network material data summary table with the median of the non-missing data of the same data type as the missing data, to obtain the preprocessed data set.
4. The method of claim 2, wherein processing the abnormal data in the distribution network material data summary table according to the preset data processing method to obtain the preprocessed data set of the distribution network materials comprises:
in the case that the abnormal data indicates missing data, filling the empty position of the missing data in the distribution network material data summary table with zero, to obtain the preprocessed data set.
5. The method of claim 2, wherein processing the abnormal data in the distribution network material data summary table according to the preset data processing method to obtain the preprocessed data set comprises:
in the case that a peak occurs in the warehouse entry and exit record data in the distribution network material data summary table, smoothing the warehouse entry and exit record data at the position of the peak to obtain the preprocessed data set; the usage record parameters of the historical distribution network materials comprise the warehouse entry and exit record data.
6. The method of claim 1, wherein clustering the preprocessed sub data sets to obtain the data clusters comprises:
converting the attribute information of the power distribution network construction material project to be tested in the preprocessed data set into vectors by one-hot encoding; the attribute information of the power distribution network construction material project to be tested comprises a project type, a project name, and a project department;
taking the vectorized attribute information of the power distribution network construction material project to be tested as the data entry features in the preprocessed data set;
obtaining a cosine distance matrix between every two data entry features;
clustering, based on the cosine distance matrix, the data entries corresponding to the data entry features to obtain a plurality of data clusters corresponding to the data entry features; each data cluster comprises a set of the data entries corresponding to the data entry features.
7. The method of claim 1, wherein, before the preprocessed sub data set is input into the estimation model to output the estimated material usage corresponding to the preprocessed sub data set, the method further comprises:
obtaining a plurality of historical sample usage amounts generated in a target time period; the historical sample usage amounts comprise usage record parameters of historical distribution network materials, historical distribution network planning data, and distribution network regional economic development data;
training the initialized material estimation model with the plurality of historical sample usage amounts until a material estimation model whose training result satisfies the convergence condition is obtained.
8. The method of claim 1, wherein inputting the data clusters into the material estimation model to output the estimated material usage required to run the power distribution network construction material project in the target area comprises:
constructing a feature information table, with data entries as units, according to the material types to be predicted and the times corresponding to the material types;
dividing the feature information table to obtain preprocessed sub data sets; each preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
clustering the preprocessed sub data sets to obtain data clusters;
inputting the data clusters into the material estimation model to output the material usage corresponding to the feature information table.
9. A power distribution network material demand prediction device, characterized by comprising:
a preprocessing unit, configured to preprocess usage record parameters of historical distribution network materials of a target area, historical distribution network planning data, and distribution network regional economic development data to obtain a preprocessed data set of the distribution network materials of the target area;
a dividing unit, configured to divide the preprocessed data set to obtain preprocessed sub data sets; each preprocessed sub data set comprises attribute information of a power distribution network construction material project to be tested;
a clustering unit, configured to cluster the preprocessed sub data sets to obtain data clusters;
an estimation unit, configured to input the data clusters into a material estimation model to output the estimated material usage required to run the power distribution network construction material project in the target area; the material estimation model is a decision model for predicting material usage, obtained through multiple rounds of training on sample material data.
10. A computer-readable storage medium, comprising a stored program, wherein the program when executed performs the method of any one of claims 1 to 8.
11. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program and the processor is arranged to execute the method of any of claims 1 to 8 by means of the computer program.
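The clustering procedure recited in claim 6 (vectorizing the project attributes, computing pairwise cosine distances, and grouping entries by distance) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the encoding step is taken to be one-hot encoding, the grouping rule (connected components under a distance `threshold`) is a stand-in because the claim does not name a specific clustering algorithm, and all function names and field names are hypothetical.

```python
import math


def one_hot_encode(records, fields):
    """Concatenate one one-hot vector per field for every record."""
    vocab = {f: sorted({r[f] for r in records}) for f in fields}
    vectors = []
    for r in records:
        vec = []
        for f in fields:
            vec.extend(1.0 if r[f] == value else 0.0 for value in vocab[f])
        vectors.append(vec)
    return vectors


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def distance_matrix(vectors):
    """Pairwise cosine distances between all entry feature vectors."""
    return [[cosine_distance(a, b) for b in vectors] for a in vectors]


def cluster(matrix, threshold=0.5):
    """Label entries so that any two linked by a chain of distances
    no greater than `threshold` share a cluster (assumed grouping rule)."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] <= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical project entries with the three attributes named in claim 6.
records = [
    {"type": "new line", "name": "P1", "department": "ops"},
    {"type": "new line", "name": "P1", "department": "planning"},
    {"type": "retrofit", "name": "P2", "department": "repair"},
]
vectors = one_hot_encode(records, ["type", "name", "department"])
matrix = distance_matrix(vectors)
labels = cluster(matrix)
```

In this example the first two entries share two of three attributes, so their cosine distance is about 1/3 and they fall into the same cluster, while the third entry shares none and forms its own cluster.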
CN202011419119.4A 2020-12-07 2020-12-07 Power distribution network material demand prediction method and device, storage medium and electronic equipment Active CN112614011B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011419119.4A CN112614011B (en) 2020-12-07 2020-12-07 Power distribution network material demand prediction method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011419119.4A CN112614011B (en) 2020-12-07 2020-12-07 Power distribution network material demand prediction method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN112614011A true CN112614011A (en) 2021-04-06
CN112614011B CN112614011B (en) 2024-03-15

Family

ID=75229560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011419119.4A Active CN112614011B (en) 2020-12-07 2020-12-07 Power distribution network material demand prediction method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112614011B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819772A (en) * 2012-08-29 2012-12-12 广东电网公司 Method and device for predicating demand of goods and materials for power distribution network construction
CN102831489A (en) * 2012-08-29 2012-12-19 广东电网公司 Prediction method and device for material requirements for construction of power distribution network
CN111445009A (en) * 2020-03-25 2020-07-24 国家电网有限公司 Method for predicting material purchasing demand based on GRU network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
韩戟;何成浩;苏星;施成云;刘东映;: "一种基于SVM的电力行业物资需求预测方法", 电气技术, no. 12, pages 166 - 168 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361745A (en) * 2021-05-07 2021-09-07 云南电网有限责任公司曲靖供电局 Power distribution network material demand prediction method and system
CN113591993A (en) * 2021-08-02 2021-11-02 上海华能电子商务有限公司 Power material demand prediction method based on space-time clustering
CN115471168A (en) * 2021-12-14 2022-12-13 国网上海市电力公司 Automatic flow processing method and device, electronic equipment and computer readable medium
CN115796398A (en) * 2023-02-06 2023-03-14 佰聆数据股份有限公司 Intelligent demand analysis method, system, equipment and medium based on electric power materials
CN116502771A (en) * 2023-06-21 2023-07-28 国网浙江省电力有限公司宁波供电公司 Power distribution method and system based on electric power material prediction
CN116502771B (en) * 2023-06-21 2023-12-01 国网浙江省电力有限公司宁波供电公司 Power distribution method and system based on electric power material prediction

Also Published As

Publication number Publication date
CN112614011B (en) 2024-03-15

Similar Documents

Publication Publication Date Title
CN112614011B (en) Power distribution network material demand prediction method and device, storage medium and electronic equipment
CN110400022B (en) Cash consumption prediction method and device for self-service teller machine
CN110400021B (en) Bank branch cash usage prediction method and device
CN109685583B (en) Supply chain demand prediction method based on big data
CN107818344A (en) The method and system that user behavior is classified and predicted
CN111178624A (en) Method for predicting new product demand
CN108388955A (en) Customer service strategies formulating method, device based on random forest and logistic regression
CN114048436A (en) Construction method and construction device for forecasting enterprise financial data model
CN109948913A (en) A kind of multi-source feature power consumer composite portrait system based on double-deck xgboost algorithm
CN107590737A (en) Personal credit scores and credit line measuring method
CN109523299A (en) A kind of automatic Cost accounting method and system
CN109492863A (en) The automatic generation method and device of financial document
CN116776006B (en) Customer portrait construction method and system for enterprise financing
CN105447767A (en) Power consumer subdivision method based on combined matrix decomposition model
CN113177366A (en) Comprehensive energy system planning method and device and terminal equipment
CN113450141A (en) Intelligent prediction method and device based on electricity selling quantity characteristics of large-power customer groups
CN115034278A (en) Performance index abnormality detection method and device, electronic equipment and storage medium
CN116662860A (en) User portrait and classification method based on energy big data
CN114372835B (en) Comprehensive energy service potential customer identification method, system and computer equipment
CN115130924A (en) Microgrid power equipment asset evaluation method and system under source grid storage background
CN113779116B (en) Object ordering method, related equipment and medium
CN113537607B (en) Power failure prediction method
CN111768282B (en) Data analysis method, device, equipment and storage medium
CN115330201A (en) Power grid digital project pareto optimization method and system
CN114862243A (en) Data processing method and device for assistant decision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant