CN117132176B - Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening - Google Patents

Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening

Publication number
CN117132176B
Authority
CN
China
Prior art keywords
predictor
predictors
runoff
data
key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311371134.XA
Other languages
Chinese (zh)
Other versions
CN117132176A (en
Inventor
李梦杰
刘琨
梁犁丽
殷兆凯
张玮
吕振豫
王鹏翔
张璐
黄康迪
余意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gezhouba Electric Power Rest House
China Three Gorges Corp
Original Assignee
Beijing Gezhouba Electric Power Rest House
China Three Gorges Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gezhouba Electric Power Rest House, China Three Gorges Corp filed Critical Beijing Gezhouba Electric Power Rest House
Priority to CN202311371134.XA priority Critical patent/CN117132176B/en
Publication of CN117132176A publication Critical patent/CN117132176A/en
Application granted granted Critical
Publication of CN117132176B publication Critical patent/CN117132176B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Tourism & Hospitality (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of hydrological forecasting and discloses a method for constructing a runoff forecasting model based on predictor screening. The method establishes multiple hypothesis tests for the different predictors from second-order approximations of the conditional mutual information between each predictor's data and the runoff data, and screens the key predictors on the basis of the established hypothesis tests, the test statistics, a preset test statistic distribution, a preset significance level and a preset correction rule to obtain a target key predictor set. Because the key predictors obtained by this screening are more reliable, the runoff forecasting model finally constructed from the target key predictor set has higher forecasting accuracy.

Description

Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening
Technical Field
The invention relates to the technical field of hydrologic forecasting, in particular to a runoff forecasting model construction and runoff forecasting method based on forecasting factor screening.
Background
Data-driven medium- and long-term runoff forecasting methods generally take a number of global hydrological and meteorological factors, such as atmospheric circulation and sea temperature indices, as predictors, fit their relationship with runoff to construct a runoff forecasting model, and then use the constructed model to forecast runoff over the medium and long term. However, because the data used for medium- and long-term hydrological forecasting have a coarse time scale, few data samples are available, and an effective forecasting model cannot be established for high-dimensional predictors from so few samples. Screening the key predictors out of the candidate predictors, and thereby reducing the predictor dimension, is therefore important for improving the forecasting accuracy of the runoff forecasting model.
In the related art, the degree of correlation between each predictor and the runoff variation data is characterized by calculating their conditional mutual information; the conditional mutual information of each predictor is compared with a preset correlation threshold, and predictors whose conditional mutual information exceeds the threshold are taken as key predictors. However, the preset threshold is generally set from human experience, so screening on it may wrongly select some non-key predictors as key predictors. The screening result is therefore inaccurate, and the forecasting accuracy of a model constructed from it is low.
Disclosure of Invention
In view of this, the invention provides a runoff forecasting model construction method and a runoff forecasting method based on predictor screening, to solve the problem in the related art that the low reliability of the screened key predictors leads to low forecasting accuracy of the constructed model.
In a first aspect, the present invention provides a method for constructing a runoff forecasting model based on predictor screening, the method comprising: acquiring data for a plurality of different predictors and runoff data; calculating, for each predictor, a second-order approximation of the conditional mutual information between the predictor data and the runoff data; establishing a hypothesis test for each predictor based on these second-order approximations, and determining the test statistic of the hypothesis test corresponding to each predictor; screening the key predictors among the different predictors based on the hypothesis test corresponding to each predictor, the test statistic of each hypothesis test, a preset test statistic distribution, a preset significance level and a preset correction rule, to obtain a target key predictor set; and constructing a runoff forecasting model based on the target key predictor set and the runoff data.
According to the runoff forecasting model construction method based on predictor screening provided by the invention, second-order approximations of the conditional mutual information between the different predictor data and the runoff data are calculated; the hypothesis test for each predictor and the test statistic of each hypothesis test are determined from these second-order approximations; the key predictors are screened based on the test statistic and hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level and the preset correction rule, to obtain a target key predictor set; and a runoff forecasting model is constructed based on the target key predictor set and the runoff data. Because the method establishes multiple hypothesis tests for the different predictors and screens them against a statistically determined threshold, the screened key predictors are more reliable, and the runoff forecasting model finally constructed from the target key predictor set has higher forecasting accuracy.
In an alternative embodiment, the step of screening the key predictors included in different predictors to obtain a target key predictor set based on the hypothesis test corresponding to each predictor, the test statistic information of the hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level, and the preset correction rule includes: determining a first distribution threshold based on a preset correction rule, a preset significance level, and a number of different predictors; comparing the test statistic information corresponding to each predictor with a first distribution threshold value to obtain comparison results corresponding to different predictors; determining at least one key predictor in a plurality of different predictors based on the comparison results corresponding to the predictors, and incorporating the key predictor into a key predictor set; determining remaining predictors based on the key predictors; determining a second distribution threshold based on a preset correction rule, a preset significance level, and a remaining number of predictors; comparing the test statistic information of each residual predictor with a second distribution threshold value to obtain comparison results respectively corresponding to different residual predictors; determining at least one key predictor in the residual predictors based on the comparison results respectively corresponding to the residual predictors, and incorporating the key predictors into a key predictor set; and returning to the step of determining the residual predictors until no key predictors exist in the residual predictors, and obtaining a target key factor set.
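The iterative screening described in this embodiment can be sketched as a simple loop. This is a minimal illustration, not the patented implementation: `statistic_fn` and `threshold_fn` are hypothetical callbacks standing in for the test statistic of a predictor and for the distribution threshold derived from the correction rule, the significance level and the number of predictors still under test.

```python
def screen_predictors(statistic_fn, threshold_fn, n_predictors):
    """Repeat threshold comparisons until no remaining predictor passes.

    statistic_fn(j, selected) -> test statistic of predictor j (hypothetical callback)
    threshold_fn(m)           -> distribution threshold for m remaining predictors
    """
    selected = []                          # key predictor set
    remaining = list(range(n_predictors))  # predictors still to be screened
    while remaining:
        threshold = threshold_fn(len(remaining))
        passed = [j for j in remaining if statistic_fn(j, selected) > threshold]
        if not passed:
            break                          # no key predictor left among the rest
        selected.extend(passed)
        remaining = [j for j in remaining if j not in passed]
    return selected


# Toy inputs: fixed statistics and a threshold that shrinks with fewer hypotheses.
toy_stats = [12.0, 1.0, 7.5, 6.1]
key_set = screen_predictors(lambda j, sel: toy_stats[j],
                            lambda m: 5.0 + 0.5 * m,
                            len(toy_stats))
print(key_set)
```

In the toy run, predictor 3 fails the first threshold but passes the second, because fewer hypotheses remain and the corrected threshold is lower.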
According to the method provided by the alternative embodiment, the second distribution threshold value is determined through the preset correction rule, the preset significance level and the number of the residual different predictors, and the residual predictors are screened based on the second distribution threshold value until no key predictors exist in the residual predictors, so that a target key factor set is obtained, and the screening efficiency is improved while the reliability of the screening result of the predictors is improved.
In an alternative embodiment, the step of determining the first distribution threshold based on the preset correction rule, the preset significance level and the number of different predictors comprises: determining a target significance level based on the preset correction rule, the preset significance level and the number of different predictors; and determining the first distribution threshold based on the target significance level and the preset test statistic distribution.
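As a rough sketch of this two-stage threshold calculation, assume the correction rule is the Bonferroni correction and the test statistic distribution is chi-square, as the detailed description suggests. The degrees of freedom `df` are not stated in the text and are a free parameter here; the quantile uses the Wilson-Hilferty approximation so that only the standard library is needed, and it is less accurate for very small `df`.

```python
from statistics import NormalDist


def chi2_quantile(p, df):
    """Approximate chi-square quantile via the Wilson-Hilferty transformation."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


def distribution_threshold(alpha, n_predictors, df):
    """Bonferroni target level alpha / m, then the matching chi-square threshold."""
    target_alpha = alpha / n_predictors   # target significance level
    return chi2_quantile(1.0 - target_alpha, df)


first_threshold = distribution_threshold(alpha=0.05, n_predictors=20, df=4)
second_threshold = distribution_threshold(alpha=0.05, n_predictors=5, df=4)
print(first_threshold, second_threshold)
```

With fewer predictors under test the target significance level is larger, so the second threshold is lower than the first.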
In an alternative embodiment, the step of constructing the runoff forecasting model based on the set of target key forecasting factors and the runoff data includes: correlating each key predictor data in the target key predictor set with the runoff data to obtain a correlation data set; training the preset model based on the associated data set until the accuracy requirement of the preset model is met, and obtaining the runoff forecasting model.
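A minimal sketch of "train until the accuracy requirement is met" follows. The source does not name the preset model, so a linear model fitted by gradient descent is used purely as a placeholder; `tol`, `lr` and `max_epochs` are illustrative parameters, and the associated data set is represented as predictor rows paired with runoff targets.

```python
def train_until_accurate(rows, targets, tol=1e-6, lr=0.05, max_epochs=5000):
    """Fit a placeholder linear model; stop once mean squared error <= tol."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    mse = float("inf")
    for _ in range(max_epochs):
        errors = [sum(wi * xi for wi, xi in zip(w, r)) + b - t
                  for r, t in zip(rows, targets)]
        mse = sum(e * e for e in errors) / n
        if mse <= tol:                     # accuracy requirement met
            break
        for k in range(d):
            w[k] -= lr * 2.0 / n * sum(e * r[k] for e, r in zip(errors, rows))
        b -= lr * 2.0 / n * sum(errors)
    return w, b, mse


# Associated data set: each row of key predictor values is paired with a runoff value.
rows = [[0.0], [1.0], [2.0], [3.0]]
targets = [1.0, 3.0, 5.0, 7.0]
w, b, mse = train_until_accurate(rows, targets)
print(w, b, mse)
```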
In an alternative embodiment, the step of calculating the second-order approximations of the conditional mutual information between the different predictor data and the runoff data comprises: calculating a first frequency-estimated density function value for each predictor's data together with the runoff data; calculating a second frequency-estimated density function value for each predictor's data together with the data of each other predictor; calculating a third frequency-estimated density function value for each predictor's data, the other predictor's data and the runoff data jointly, where the other predictors are the predictors, among the plurality of different predictors, other than the current predictor; and determining the second-order approximation of the conditional mutual information between each predictor's data and the runoff data based on the first, second and third frequency-estimated density function values corresponding to that predictor.
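The three frequency estimates can be illustrated on discretized samples as relative cell frequencies. This is a sketch under the assumption that "frequency-estimated density" means the usual plug-in estimate (cell count divided by sample size); the helper name `freq_density` and the toy series are hypothetical.

```python
from collections import Counter


def freq_density(*columns):
    """Plug-in estimate of the joint probability of discretized columns."""
    n = len(columns[0])
    counts = Counter(zip(*columns))
    return {cell: c / n for cell, c in counts.items()}


x_j = [1, 1, 2, 2, 1, 2]   # predictor being screened
x_k = [1, 2, 1, 2, 2, 1]   # another predictor
y = [1, 1, 2, 2, 1, 2]     # runoff

p_first = freq_density(x_j, y)        # predictor with runoff
p_second = freq_density(x_j, x_k)     # predictor with another predictor
p_third = freq_density(x_j, x_k, y)   # predictor, other predictor and runoff
print(p_first)
```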
In an alternative embodiment, the step of determining at least one key predictor among a plurality of different predictors based on the comparison result corresponding to each predictor, and incorporating the key predictor into the set of key predictors includes: sequentially judging whether the test statistic information corresponding to different forecasting factors is larger than a first distribution threshold value or not; if the test statistic information of the predictors is larger than the first distribution threshold, the predictors are used as key predictors, and the key predictors are included in the key predictor set.
In a second aspect, the present invention provides a runoff forecasting method, the method comprising: acquiring data for the different target key predictors; and inputting the target key predictor data into a pre-constructed runoff forecasting model so that the runoff forecasting model outputs the corresponding runoff variation information, wherein the runoff forecasting model is constructed by the runoff forecasting model construction method based on predictor screening according to the first aspect or any of its corresponding embodiments.
In a third aspect, the present invention provides a runoff forecasting model construction device based on predictor screening, the device comprising: a first acquisition module for acquiring data for a plurality of different predictors and runoff data; a calculation module for calculating, for each predictor, a second-order approximation of the conditional mutual information between the predictor data and the runoff data; a first determining module for establishing a hypothesis test for each predictor based on these second-order approximations and determining the test statistic of the hypothesis test corresponding to each predictor; a screening module for screening the key predictors among the different predictors based on the hypothesis test corresponding to each predictor, the test statistic of each hypothesis test, a preset test statistic distribution, a preset significance level and a preset correction rule, to obtain a target key predictor set; and a construction module for constructing a runoff forecasting model based on the target key predictor set and the runoff data.
In an alternative embodiment, the screening module includes: a first determining sub-module for determining a first distribution threshold based on a preset correction rule, a preset significance level, and a number of different predictors; the first comparison sub-module is used for comparing the test statistic information corresponding to each predictor with a first distribution threshold value to obtain comparison results corresponding to different predictors; the second determining submodule is used for determining at least one key predictor in a plurality of different predictors based on the comparison results corresponding to the predictors and incorporating the key predictor into the key predictor set; a third determining sub-module for determining remaining predictors based on the key predictors; a fourth determining sub-module for determining a second distribution threshold based on a preset correction rule, a preset significance level, and a number of remaining predictors; the second comparison sub-module is used for comparing the test statistic information of each residual predictor with a second distribution threshold value to obtain comparison results respectively corresponding to different residual predictors; a fifth determining submodule, configured to determine at least one key predictor from the remaining predictors based on the comparison results corresponding to the remaining predictors, and incorporate the key predictor into the key predictor set; and a sixth determining submodule, configured to return to the step of determining the remaining predictors until no key predictors exist in the remaining predictors, thereby obtaining a target key factor set.
In an alternative embodiment, the first determining sub-module includes: a first determining unit for determining a target significance level based on the preset correction rule, the preset significance level and the number of different predictors; and a second determining unit for determining the first distribution threshold based on the target significance level and the preset test statistic distribution.
In an alternative embodiment, the building block comprises: the association sub-module is used for associating each key predictor data in the target key predictor set with the runoff data to obtain an association data set; and the training sub-module is used for training the preset model based on the associated data set until the accuracy requirement of the preset model is met, and obtaining the runoff forecasting model.
In an alternative embodiment, the calculation module includes: a first calculation sub-module for calculating a first frequency-estimated density function value for each predictor's data together with the runoff data; a second calculation sub-module for calculating a second frequency-estimated density function value for each predictor's data together with the data of each other predictor; a third calculation sub-module for calculating a third frequency-estimated density function value for each predictor's data, the other predictor's data and the runoff data jointly, where the other predictors are the predictors, among the plurality of different predictors, other than the current predictor; and a seventh determining sub-module for determining the second-order approximation of the conditional mutual information between each predictor's data and the runoff data based on the first, second and third frequency-estimated density function values corresponding to that predictor.
In a fourth aspect, an embodiment of the present invention provides a runoff forecasting apparatus, including: the second acquisition module is used for acquiring different key predictor data of the target; the second determining module is used for inputting different key predictor data of the targets into a pre-constructed runoff prediction model, so that the runoff prediction model outputs corresponding runoff change information, and the runoff prediction model is constructed by the runoff prediction model construction method based on predictor screening in the first aspect or any corresponding embodiment of the first aspect.
In a fifth aspect, the present invention provides a computer device comprising a memory and a processor that are communicatively connected, the memory storing computer instructions, and the processor executing the computer instructions so as to perform the method for constructing a runoff forecasting model based on predictor screening according to the first aspect or any of its corresponding embodiments, or the runoff forecasting method according to the second aspect.
In a sixth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions for causing a computer to execute the method for constructing a runoff forecasting model based on forecasting factor screening according to the first aspect or any one of the embodiments corresponding thereto, or to execute the method for forecasting runoff according to the second aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of a method for constructing a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for constructing a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention;
FIG. 3 is a flow chart of yet another method for constructing a runoff forecasting model based on forecasting factor screening, according to an embodiment of the present invention;
FIG. 4 is a flow chart of a runoff forecasting method according to an embodiment of the present invention;
FIG. 5 is a block diagram of a construction device of a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention;
FIG. 6 is a block diagram of a runoff forecasting device according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the related art, the conditional mutual information between each predictor and the runoff variation data is generally calculated and compared with a preset threshold, and predictors whose conditional mutual information exceeds the threshold are taken as key predictors. However, the preset threshold is generally set from human experience, so the reliability of the screened key predictors is low, and a runoff forecasting model constructed from them has low forecasting accuracy.
In view of this, an embodiment of the invention provides a method for constructing a runoff forecasting model based on predictor screening, which can be applied to a processor to construct the runoff forecasting model. In this method, the test statistic corresponding to each predictor is determined from the second-order approximation of the conditional mutual information between that predictor's data and the runoff data, and the hypotheses of whether each predictor is a key predictor are tested jointly, as a multiple hypothesis test, using the pre-established hypothesis tests, the preset test statistic distribution, the preset significance level and the preset correction rule, to obtain a target key predictor set. The screened key predictors are therefore more reliable, and the runoff forecasting model finally constructed from the target key predictor set has higher forecasting accuracy.
According to an embodiment of the present invention, there is provided an embodiment of a method for constructing a runoff forecasting model based on forecasting factor screening, it should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different from that herein.
In this embodiment, a method for constructing a runoff forecasting model based on forecasting factor screening is provided, which may be used in the above processor, and fig. 1 is a flowchart of a method for constructing a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention, as shown in fig. 1, where the flowchart includes the following steps:
Step S101, acquiring data for a plurality of different predictors and runoff data.
Illustratively, the predictors may include, but are not limited to, global hydrological and meteorological factors such as atmospheric circulation and sea temperature indices; the runoff data are runoff variation data. In this embodiment, to enable medium- and long-term hydrological forecasting, the first period covered by the predictor data differs from, and is earlier than, the second period covered by the runoff data; for example, the first period may be the first half of a year and the second period the second half of that year. The continuous raw runoff and raw predictor series may be discretized by the equal-width method to obtain the predictor data and the runoff data. For the runoff $Y$, the number of bins is specified as $n_Y$ and the range of $Y$ is divided into $n_Y$ intervals of equal width; if an original runoff value falls within the $k$-th interval, it is relabelled as $k$. For each predictor $X_j$, the number of intervals is specified as $n_X$ and the range of the predictor is divided into $n_X$ intervals of equal width; if a value of $X_j$ falls within the $l$-th interval, it is relabelled as $l$.
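The equal-width discretization can be sketched as follows. The function name and the sample runoff values are illustrative only; intervals are labelled from 1, and the series maximum is assigned to the last interval.

```python
def equal_width_bins(values, n_bins):
    """Relabel each value with the 1-based index of its equal-width interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate series: every value lands in the first interval.
        return [1] * len(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        k = int((v - lo) / width) + 1
        labels.append(min(k, n_bins))  # the maximum belongs to the last interval
    return labels


runoff = [120.0, 340.5, 95.2, 410.0, 260.3, 180.7]
runoff_labels = equal_width_bins(runoff, 3)
print(runoff_labels)
```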
Step S102, calculating, for each predictor, a second-order approximation of the conditional mutual information between the predictor data and the runoff data.
For example, the second-order approximation of the conditional mutual information between each predictor and the runoff can be calculated from the second-order mutual information between the predictor data and the runoff data. In this embodiment, given the index set $S$ of the key predictors already screened, the quantity computed for each predictor $X_j$ to be screened, with $j$ in the complement $S^c$ of $S$, is the second-order approximation of the conditional mutual information $I(X_j; Y \mid X_S)$ between $X_j$ and the runoff $Y$. The conditional mutual information is calculated as follows: [formula missing in source]
wherein the terms are the mutual information between the predictors indexed by $S$ and $X_j$, the mutual information between the predictors indexed by $S$ and $Y$, and the second-order mutual information, which accounts for the interaction between the variables. The second-order mutual information is calculated as follows: [formula missing in source]
wherein $S$ denotes the index set of the key predictors, $X_j$ with $j \in S^c$ denotes the data of a predictor to be screened, $Y$ denotes the runoff data, and the second-order mutual information is expressed through three first-order (pairwise) mutual information terms;
so that the mutual information between the predictors and the runoff is taken into account while the calculation is simplified, the second-order approximation is used as an approximate substitute for the conditional mutual information $I(X_j; Y \mid X_S)$, and can be calculated by the following formula: [formula missing in source]
The value of the conditional mutual information corresponding to each predictor can be calculated by the above formula.
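Since the displayed formulas are not available here, the sketch below uses one well-known second-order approximation from the feature-selection literature, the CIFE criterion, which replaces conditioning on the whole selected set with pairwise interaction terms: I(X_j;Y) plus the sum over selected X_s of I(X_j;X_s|Y) − I(X_j;X_s). Whether this matches the patent's exact expression is an assumption. All quantities are plug-in estimates on discretized data, and the function names are illustrative.

```python
from collections import Counter
from math import log


def entropy(*cols):
    """Plug-in joint entropy (natural log) of discretized columns."""
    n = len(cols[0])
    return -sum(c / n * log(c / n) for c in Counter(zip(*cols)).values())


def mi(a, b):
    """First-order mutual information I(A;B)."""
    return entropy(a) + entropy(b) - entropy(a, b)


def cmi(a, b, c):
    """Conditional mutual information I(A;B|C)."""
    return entropy(a, c) + entropy(b, c) - entropy(a, b, c) - entropy(c)


def second_order_score(x_j, y, selected):
    """CIFE-style approximation of I(X_j; Y | X_S) (assumed form)."""
    score = mi(x_j, y)
    for x_s in selected:
        score += cmi(x_j, x_s, y) - mi(x_j, x_s)   # pairwise interaction term
    return score


x = [1, 2, 1, 2]
y = [1, 2, 1, 2]
print(second_order_score(x, y, []))    # no predictor selected yet
print(second_order_score(x, y, [x]))   # x already selected: nothing new to add
```

When the candidate duplicates a predictor that is already selected, the interaction term cancels its mutual information with the runoff, which is the behaviour one expects from a conditional measure.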
Step S103, establishing a hypothesis test for each predictor based on the second-order approximations of the conditional mutual information between the different predictor data and the runoff data, and determining the test statistic of the hypothesis test corresponding to each predictor.
Illustratively, the hypothesis test and the test statistic for each predictor are determined from the second-order approximation of the conditional mutual information between that predictor's data and the runoff data. In this embodiment, the test statistic is expressed in terms of $\hat{I}_j$, the second-order approximation of the conditional mutual information corresponding to predictor $X_j$, and $n$, the number of samples of the predictor data. [formula missing in source]
In this embodiment, for a given index set $S$ of key predictors, the other predictors number $m$ in total. For all $m$ predictors $X_j$, $j \in S^c$, to be screened, the hypothesis tests are considered simultaneously, that is, as a multiple hypothesis test, with the corresponding hypotheses as follows:
Null hypothesis $H_{0j}$: given $X_S$, $X_j$ and $Y$ are independent;
Alternative hypothesis $H_{1j}$: given $X_S$, $X_j$ and $Y$ are not independent.
Step S104, screening the key predictors contained in different predictors based on hypothesis testing corresponding to each predictor, testing statistic information of the hypothesis testing corresponding to each predictor, preset testing statistic distribution, preset significance level and preset correction rules to obtain a target key predictor set.
Illustratively, the preset test statistic distribution is the distribution of the test statistic when the null hypothesis is true. In this embodiment of the present application, the preset test statistic distribution may be the chi-square distribution, the preset significance level may be determined according to actual requirements, and the preset correction rule may be the Bonferroni correction algorithm. In the embodiment of the application, if a null hypothesis is wrongly rejected, that is, the predictor is actually independent of the runoff but is judged to be not independent and is therefore wrongly judged to be a key predictor, the multiple hypothesis test is considered to have made a false rejection. Define the number of predictors wrongly judged to be key predictors among the predictors to be screened; the family-wise Type I error rate (FWER) is then the probability that this number is 1 or more. The multiple hypothesis test is performed with the FWER controlled, so as to obtain the key predictors among the predictors to be screened.
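The FWER and the reason the Bonferroni correction controls it can be written out as follows. The symbols ($m$ tests, $m_0$ true null hypotheses, $V$ false rejections, $p_j$ the p-value of test $j$, $\alpha$ the preset significance level) are notation assumed here for illustration.

```latex
\begin{aligned}
\mathrm{FWER} &= P(V \ge 1), \\
P(V \ge 1) &\le \sum_{j:\, H_{0j}\ \text{true}} P\!\left(p_j \le \tfrac{\alpha}{m}\right)
  \le m_0 \cdot \frac{\alpha}{m} \le \alpha .
\end{aligned}
```

The bound follows from the union bound alone, so it holds whether or not the individual tests are independent of one another.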
In the embodiments of the present application, the test statistic information described above is used as the test statistic for each test. When the null hypothesis is true, i.e. the predictor to be screened and the runoff are independent given the screened key predictors, the test statistic follows a chi-square distribution whose degrees of freedom are given by: [formula not reproduced]
wherein the degrees of freedom are determined by the number of elements in the set of screened key predictor indices, the number of discrete values of the predictor, and the number of discrete values of the runoff.
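As a point of reference only, the classical result for the plug-in estimate of conditional mutual information between discrete variables (natural logarithm, $n$ independent samples) is shown below. It is the textbook analogue of the test described above; the statistic and the degrees of freedom that this application uses for the second-order approximation may differ. The symbols $X$, $Y$, $Z$ and the alphabet sizes $|\mathcal{X}|$, $|\mathcal{Y}|$, $|\mathcal{Z}|$ are assumed notation.

```latex
2n\,\hat{I}(X; Y \mid Z) \;\xrightarrow{\;d\;}\; \chi^2_{(|\mathcal{X}|-1)(|\mathcal{Y}|-1)\,|\mathcal{Z}|}
\qquad \text{under } H_0 : X \perp Y \mid Z .
```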
Step S105, constructing a runoff forecasting model based on the target key forecasting factor set and the runoff data.
In an embodiment of the present application, a runoff prediction model is constructed based on data of each key predictor in the target key predictor set and the runoff data, and the obtained runoff prediction model can accurately perform medium-and-long-term runoff prediction.
According to the runoff forecasting model construction method based on predictor screening, multiple hypothesis tests are established for the different predictors based on the second-order approximations of the conditional mutual information between the different predictor data and the runoff data, and the key predictors among the different predictors are screened based on the established hypothesis tests, the test statistic information, the preset test statistic distribution, the preset significance level and the preset correction rule to obtain a target key predictor set. The screened key predictors are therefore more reliable, and the runoff forecasting model finally constructed based on the target key predictor set has higher forecasting accuracy.
In this embodiment, a method for constructing a runoff forecasting model based on forecasting factor screening is provided, which may be used in the above processor, and fig. 2 is a flowchart of a method for constructing a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention, as shown in fig. 2, where the flowchart includes the following steps:
Step S201, obtaining a plurality of different predictor data and runoff data. For details, refer to step S101 of the embodiment shown in fig. 1; they are not repeated here.
Step S202, respectively calculating second-order approximations of the conditional mutual information between the different predictor data and the runoff data. For details, refer to step S102 of the embodiment shown in fig. 1; they are not repeated here.
Step S203, based on the second-order approximations of the conditional mutual information between the different predictor data and the runoff data, establishing the hypothesis tests respectively corresponding to the different predictors, and determining the test statistic information of the hypothesis test corresponding to each predictor. For details, refer to step S103 of the embodiment shown in fig. 1; they are not repeated here.
Step S204, screening the key predictors contained in the different predictors based on the hypothesis test corresponding to each predictor, the test statistic information of the hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level and the preset correction rule to obtain a target key predictor set. For details, refer to step S104 of the embodiment shown in fig. 1; they are not repeated here.
Specifically, the step S204 includes:
Step S2041, determining a first distribution threshold based on the preset correction rule, the preset significance level, and the number of different predictors.
In some alternative embodiments, step S2041 includes:
Step a1, determining a target significance level based on the preset correction rule, the preset significance level and the number of the plurality of different predictors. Illustratively, the preset correction rule may be the Bonferroni correction algorithm, and the preset significance level may be determined according to actual requirements. In the examples of the present application, to effectively control the FWER of the multiple hypothesis test for a given significance level, the Bonferroni correction algorithm controls the significance level of each individual test to the given significance level divided by the number of tests, which is the target significance level.
Step a2, determining the first distribution threshold based on the target significance level and the preset test statistic distribution. Illustratively, the upper quantile of the chi-square distribution at the target significance level may be calculated, and this quantile value is the first distribution threshold.
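Steps a1 and a2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the function names are hypothetical, the degrees of freedom are passed in by the caller, and the chi-square upper quantile is obtained from the Wilson-Hilferty normal approximation so that only the Python standard library is needed. A library routine for the exact quantile could be substituted.

```python
from statistics import NormalDist


def chi2_upper_quantile(alpha: float, df: int) -> float:
    """Approximate upper-alpha quantile of chi-square(df) (Wilson-Hilferty)."""
    if not 0.0 < alpha < 1.0 or df <= 0:
        raise ValueError("alpha must lie in (0, 1) and df must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha)  # standard normal upper quantile
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


def bonferroni_threshold(alpha: float, n_tests: int, df: int) -> float:
    """Step a1: target level alpha / n_tests; step a2: chi-square threshold."""
    target_alpha = alpha / n_tests  # Bonferroni-corrected per-test level
    return chi2_upper_quantile(target_alpha, df)


# Example: 10 candidate predictors, preset significance level 0.05,
# 4 degrees of freedom (assumed value for illustration only).
first_threshold = bonferroni_threshold(0.05, 10, 4)
```

Because the per-test level shrinks as the number of tests grows, the threshold rises with the number of predictors still to be screened.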
Step S2042, comparing the test statistic information corresponding to each predictor with the first distribution threshold to obtain comparison results corresponding to the different predictors.
Illustratively, the test statistic information corresponding to each predictor is compared with the first distribution threshold to determine which of the two is larger.
Step S2043, determining at least one key predictor from a plurality of different predictors based on the comparison results corresponding to the predictors, and incorporating the key predictor into the key predictor set.
In some alternative embodiments, step S2043 includes:
Step b1, sequentially judging whether the test statistic information corresponding to each of the different predictors is larger than the first distribution threshold.
Step b2, if the test statistic information of a predictor is larger than the first distribution threshold, taking the predictor as a key predictor and incorporating it into the key predictor set.
Step S2044, determining the remaining predictors based on the key predictors. The remaining predictors may be derived, for example, based on a plurality of different predictors and the key predictors that have been determined.
Step S2045, determining a second distribution threshold based on the preset correction rule, the preset significance level, and the number of remaining predictors. Illustratively, the target significance level is updated based on the number of remaining predictors and the preset significance level, and the second distribution threshold is determined based on the updated target significance level.
Step S2046, comparing the test statistic information of each remaining predictor with the second distribution threshold to obtain comparison results respectively corresponding to the different remaining predictors.
Step S2047, determining at least one key predictor in the residual predictors based on the comparison results respectively corresponding to the residual predictors, and incorporating the key predictors into the key predictor set. Illustratively, whether the corresponding predictor is a key predictor is determined based on the comparison results respectively corresponding to the remaining predictors.
Step S2048, returning to the step of determining the remaining predictors until no key predictor exists among the remaining predictors, thereby obtaining the target key predictor set.
In particular, in the embodiment of the present application, if the test statistic of a predictor to be screened does not exceed the distribution threshold, the null hypothesis is accepted, i.e. the predictor and the runoff are independent given the screened key predictors, and the predictor cannot be regarded as a key predictor;
if the test statistic exceeds the distribution threshold, the null hypothesis is rejected, i.e. the predictor and the runoff are not independent given the screened key predictors, and the predictor can be used as a key predictor.
The number of rejected null hypotheses is recorded as K, i.e. the number of predictors determined to be key predictors by the multiple hypothesis test is K, and the index set of these K predictors is recorded.
If K is zero, the screening of key predictors stops, and the current set of key predictor indices is taken as the final set of key predictors;
if K is greater than zero, the index set of the K predictors is merged into the set of key predictor indices, and the set of predictors to be screened is updated accordingly. The screening of key predictors is then repeated, and the predictors corresponding to the finally obtained set of key predictor indices are the key predictors required for the runoff forecast.
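The iterative screening of steps S2041 to S2048 can be sketched as the loop below. It is an illustrative outline only: `statistic` and `threshold` are hypothetical callbacks supplied by the caller (the first returns the test statistic of a candidate given the indices already selected, the second returns the distribution threshold for the current number of remaining predictors), and their signatures are assumptions rather than part of this application.

```python
from typing import Callable, List, Sequence


def screen_key_predictors(
    n_predictors: int,
    statistic: Callable[[int, Sequence[int]], float],
    threshold: Callable[[int, Sequence[int]], float],
) -> List[int]:
    """Repeat the multiple hypothesis test until no null hypothesis is rejected."""
    selected: List[int] = []                  # indices of key predictors so far
    remaining = list(range(n_predictors))     # indices still to be screened
    while remaining:
        cutoff = threshold(len(remaining), selected)
        # Reject the null hypothesis when the statistic exceeds the threshold.
        rejected = [j for j in remaining if statistic(j, selected) > cutoff]
        if not rejected:                      # K = 0: stop screening
            break
        selected.extend(rejected)             # K > 0: merge into the key set
        remaining = [j for j in remaining if j not in rejected]
    return selected


# Toy example with fixed statistics and a threshold that grows with the
# number of remaining predictors (values are made up for illustration).
scores = [10.0, 1.0, 7.0, 0.5]
key_indices = screen_key_predictors(
    4,
    statistic=lambda j, selected: scores[j],
    threshold=lambda n_remaining, selected: 2.0 * n_remaining,
)
```

In the toy run, predictor 0 passes in the first round, predictor 2 passes once the threshold has dropped in the second round, and the third round rejects nothing, so the loop ends.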
Step S205, constructing a runoff forecasting model based on the target key predictor set and the runoff data. For details, refer to step S105 of the embodiment shown in fig. 1; they are not repeated here.
In this embodiment, a method for constructing a runoff forecasting model based on forecasting factor screening is provided, which may be used in the above processor, and fig. 3 is a flowchart of a method for constructing a runoff forecasting model based on forecasting factor screening according to an embodiment of the present invention, as shown in fig. 3, where the flowchart includes the following steps:
Step S301, obtaining a plurality of different predictor data and runoff data. For details, refer to step S101 of the embodiment shown in fig. 1; they are not repeated here.
Step S302, respectively calculating second-order approximations of the conditional mutual information between the different predictor data and the runoff data. For details, refer to step S102 of the embodiment shown in fig. 1; they are not repeated here.
Specifically, the step S302 includes:
step S3021, calculating a first frequency estimated density function value corresponding to each of the predictor data and the runoff data.
Step S3022, calculating second frequency estimated density function values corresponding to each of the predictor data and the other predictor data, respectively.
Step S3023, calculating a third frequency estimated density function value among each predictor data, the other predictor data, and the runoff data, the other predictors being the predictors other than the current predictor among the plurality of different predictors.
Step S3024, determining the second-order approximations of the conditional mutual information between the different predictor data and the runoff data based on the first frequency estimated density function, the second frequency estimated density function and the third frequency estimated density function respectively corresponding to the different predictors.
Illustratively, in the embodiment of the application, the first-order and second-order mutual information can be calculated based on frequency estimated density function values; the calculation process is as follows: [formula not reproduced]
wherein the first frequency estimated density function value is that between one predictor and another predictor, the second frequency estimated density function value is that between a predictor and the runoff, and the third frequency estimated density function value is that among a predictor, another predictor and the runoff; the other variables have the same meanings as above;
wherein the calculation formula of the frequency estimated density function is as follows: [formula not reproduced]
wherein the number of samples appears in the formula; for the meanings of the other variables, refer to the corresponding description above, which is not repeated here.
For discrete-valued predictor data and runoff data, the frequency estimated density functions are calculated first, the first-order mutual information terms are then calculated from them, and the second-order approximation of the conditional mutual information is finally calculated.
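The calculation chain for discrete-valued data (frequency estimates, then first-order mutual information, then a second-order quantity) can be sketched as follows. This is a minimal sketch under assumptions: the function names are hypothetical, natural logarithms are used, and the second-order approximation is taken to be the first-order term plus pairwise interaction-information terms, which is one common form and is not confirmed to be the formula of this application.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Plug-in mutual information from relative frequencies (in nats)."""
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    # (c/n) * log( (c/n) / ((px/n) * (py/n)) ) simplifies to the term below.
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in joint.items())


def interaction_information(xj, xk, y) -> float:
    """II(xj, xk, y) = I((xj, xk); y) - I(xj; y) - I(xk; y)."""
    pair = list(zip(xj, xk))  # treat the pair as one discrete variable
    return mutual_information(pair, y) - mutual_information(xj, y) - mutual_information(xk, y)


def second_order_cmi(xj, selected, y) -> float:
    """Assumed truncation: I(xj; y) plus pairwise interactions with selected predictors."""
    return mutual_information(xj, y) + sum(interaction_information(xj, xk, y) for xk in selected)


# XOR example: neither predictor alone informs y, but the pair determines it.
xj = [0, 0, 1, 1]
xk = [0, 1, 0, 1]
y = [0, 1, 1, 0]
approx = second_order_cmi(xj, [xk], y)
```

The XOR example shows why the interaction term matters: the first-order mutual information of `xj` with `y` is zero, yet once `xk` is in the selected set the approximation recovers the full dependence.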
Step S303, based on the second-order approximations of the conditional mutual information between the different predictor data and the runoff data, establishing the hypothesis tests respectively corresponding to the different predictors, and determining the test statistic information of the hypothesis test corresponding to each predictor. For details, refer to step S103 of the embodiment shown in fig. 1; they are not repeated here.
Step S304, screening the key predictors contained in the different predictors based on the hypothesis test corresponding to each predictor, the test statistic information of the hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level and the preset correction rule to obtain a target key predictor set. For details, refer to step S104 of the embodiment shown in fig. 1; they are not repeated here.
Step S305, constructing a runoff forecasting model based on the target key predictor set and the runoff data. For details, refer to step S105 of the embodiment shown in fig. 1; they are not repeated here.
Specifically, the step S305 includes:
Step S3051, associating each key predictor data in the target key predictor set with the runoff data to obtain an associated data set.
Step S3052, training a preset model based on the associated data set until the accuracy requirement of the preset model is met, thereby obtaining the runoff forecasting model. Illustratively, in embodiments of the present application, the preset model may include, but is not limited to, a machine learning model.
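Steps S3051 and S3052 can be illustrated with the sketch below. Everything specific in it is an assumption: the data are keyed by hypothetical period labels, the predictor name `sst_index` is invented, the "preset model" is stood in for by a linear model trained with gradient descent, and the accuracy requirement is taken to be a mean squared error tolerance. The application itself leaves the model type open.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[List[float], float]  # (key predictor values, runoff value)


def associate(predictors: Dict[str, Dict[str, float]],
              runoff: Dict[str, float],
              keys: Sequence[str]) -> List[Row]:
    """Join the key predictor series with runoff on shared period labels."""
    periods = sorted(p for p in runoff if all(p in predictors[k] for k in keys))
    return [([predictors[k][p] for k in keys], runoff[p]) for p in periods]


def train_until_accurate(rows: List[Row], lr: float = 0.05,
                         tol: float = 1e-6, max_epochs: int = 20000):
    """Gradient descent on a linear model until MSE < tol or epochs run out."""
    w = [0.0] * len(rows[0][0])
    b = 0.0
    mse = float("inf")
    for _ in range(max_epochs):
        errors = [b + sum(wi * xi for wi, xi in zip(w, x)) - y for x, y in rows]
        mse = sum(e * e for e in errors) / len(rows)
        if mse < tol:  # accuracy requirement met
            break
        b -= lr * 2.0 * sum(errors) / len(rows)
        w = [wi - lr * 2.0 * sum(e * x[i] for e, (x, _) in zip(errors, rows)) / len(rows)
             for i, wi in enumerate(w)]
    return w, b, mse  # mse is the value measured before the last update


predictors = {"sst_index": {"t1": 0.0, "t2": 0.5, "t3": 1.0, "t4": 1.5, "t5": 2.0}}
runoff = {"t1": 1.0, "t2": 2.0, "t3": 3.0, "t4": 4.0}
rows = associate(predictors, runoff, ["sst_index"])
weights, bias, mse = train_until_accurate(rows)
```

Period `t5` has predictor data but no runoff observation, so the association step drops it and the model is trained on the four complete rows.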
In this embodiment, a runoff forecasting method is provided, which may be used in the above processor. Fig. 4 is a flowchart of a runoff forecasting method according to an embodiment of the present invention; as shown in fig. 4, the flow includes the following steps:
Step S401, acquiring target data of the different key predictors. For example, in the embodiment of the present application, the target data may be the data of the key predictors in a target period, and the target period may be any historical period.
Step S402, inputting the target data of the different key predictors into a pre-constructed runoff forecasting model, so that the runoff forecasting model outputs corresponding runoff change information, wherein the runoff forecasting model is constructed by the runoff forecasting model construction method based on predictor screening in the above embodiment. Illustratively, the key predictor data of the target period is input into the runoff forecasting model, so that the model outputs the runoff change information of the period to be forecast.
The embodiment also provides a runoff forecasting model construction device based on predictor screening, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
The embodiment provides a runoff forecasting model construction device based on forecasting factor screening, as shown in fig. 5, including:
a first obtaining module 501, configured to obtain a plurality of different predictor data and runoff data;
a calculating module 502, configured to respectively calculate second-order approximations of the conditional mutual information between the different predictor data and the runoff data;
a first determining module 503, configured to establish hypothesis tests respectively corresponding to the different predictors based on the second-order approximations of the conditional mutual information between the different predictor data and the runoff data, and to determine test statistic information of the hypothesis test corresponding to each predictor;
the screening module 504 is configured to screen the key predictors included in the different predictors based on the hypothesis test corresponding to each predictor, the test statistic information of the hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level, and the preset correction rule, to obtain a target key predictor set;
a construction module 505, configured to construct a runoff forecasting model based on the target key predictor set and the runoff data.
In some alternative embodiments, the screening module 504 includes:
A first determining sub-module for determining a first distribution threshold based on a preset correction rule, a preset significance level, and a number of different predictors;
the first comparison sub-module is used for comparing the test statistic information corresponding to each predictor with a first distribution threshold value to obtain comparison results corresponding to different predictors;
the second determining submodule is used for determining at least one key predictor in a plurality of different predictors based on the comparison results corresponding to the predictors and incorporating the key predictor into the key predictor set;
a third determining sub-module for determining remaining predictors based on the key predictors;
a fourth determining sub-module for determining a second distribution threshold based on a preset correction rule, a preset significance level, and a number of remaining predictors;
the second comparison sub-module is used for comparing the test statistic information of each residual predictor with a second distribution threshold value to obtain comparison results respectively corresponding to different residual predictors;
a fifth determining submodule, configured to determine at least one key predictor from the remaining predictors based on the comparison results corresponding to the remaining predictors, and incorporate the key predictor into the key predictor set;
a sixth determining sub-module, configured to return to the step of determining the remaining predictors until no key predictor exists among the remaining predictors, thereby obtaining the target key predictor set.
In some alternative embodiments, the first determination submodule includes:
a first determining unit configured to determine a target significance level based on a preset correction rule, a preset significance level, and a number of different predictors;
a second determining unit, configured to determine the first distribution threshold based on the target significance level and the preset test statistic distribution.
In some alternative embodiments, the build module 505 includes:
the association sub-module is used for associating each key predictor data in the target key predictor set with the runoff data to obtain an association data set;
and the training sub-module is used for training the preset model based on the associated data set until the accuracy requirement of the preset model is met, and obtaining the runoff forecasting model.
In some alternative embodiments, the computing module includes:
the first calculating sub-module is used for calculating a first frequency estimated density function value corresponding to each predictor data and each runoff data;
The second calculation sub-module is used for calculating a second frequency estimated density function value corresponding to each piece of forecasting factor data and other pieces of forecasting factor data respectively;
a third calculation sub-module for calculating a third frequency estimated density function value among each of the predictor data, other predictor data, and runoff data, the other predictors being used for characterizing predictors other than the current predictor among the plurality of different predictors;
a seventh determining sub-module, configured to determine the second-order approximations of the conditional mutual information between the different predictor data and the runoff data based on the first frequency estimated density function, the second frequency estimated density function and the third frequency estimated density function respectively corresponding to the different predictors.
In some alternative embodiments, the second determination submodule includes:
a judging unit, configured to sequentially judge whether the test statistic information corresponding to each of the different predictors is larger than the first distribution threshold;
and the determining unit is used for taking the predictor as a key predictor and incorporating the key predictor into the key predictor set if the test statistic information of the predictor is larger than the first distribution threshold value.
The present embodiment provides a runoff forecasting apparatus, as shown in fig. 6, including:
a second obtaining module 601, configured to obtain different key predictor data of the target;
a second determining module 602, configured to input the target data of the different key predictors into a pre-constructed runoff forecasting model, so that the runoff forecasting model outputs corresponding runoff change information, wherein the runoff forecasting model is constructed by the runoff forecasting model construction method based on predictor screening in the above embodiment.
Further functional descriptions of the above respective modules and units are the same as those of the above corresponding embodiments, and are not repeated here.
The runoff forecasting model construction device or the runoff forecasting device based on forecasting factor screening in the present embodiment is presented in the form of functional units, where the units refer to ASIC (Application Specific Integrated Circuit ) circuits, processors and memories executing one or more software or fixed programs, and/or other devices that can provide the above functions.
The embodiment of the invention also provides a computer device which is provided with the runoff forecasting model construction device based on the forecasting factor screening shown in the figure 5 or the runoff forecasting device shown in the figure 6.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in fig. 7, the computer device includes: one or more processors 10, memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the computer device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface. In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is illustrated in fig. 7.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip, among others. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a general-purpose array logic, or any combination thereof.
Wherein the memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to perform a method for implementing the embodiments described above.
The memory 20 may include a storage program area that may store an operating system, at least one application program required for functions, and a storage data area; the storage data area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, memory 20 may optionally include memory located remotely from processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device also includes a communication interface 30 for the computer device to communicate with other devices or communication networks.
The embodiments of the present invention also provide a computer-readable storage medium. The methods according to the embodiments of the present invention described above may be implemented in hardware or firmware, or as computer code that can be recorded on a storage medium, or as computer code originally stored in a remote storage medium or a non-transitory machine-readable storage medium, downloaded through a network and stored in a local storage medium, so that the methods described herein can be processed by such software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk or the like; further, the storage medium may also comprise a combination of memories of the kinds described above. It will be appreciated that a computer, processor, microprocessor controller or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor or hardware, implements the methods illustrated by the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (13)

1. The runoff forecasting model construction method based on forecasting factor screening is characterized by comprising the following steps:
acquiring a plurality of different predictor data and runoff data;
respectively calculating second-order approximations of the conditional mutual information between different predictor data and the runoff data;
establishing hypothesis tests respectively corresponding to different predictors based on the second-order approximations of the conditional mutual information between different predictor data and the runoff data, and determining test statistic information of the hypothesis test corresponding to each predictor;
screening key predictors contained in different predictors based on hypothesis tests corresponding to the predictors, test statistic information of the hypothesis tests corresponding to the predictors, preset test statistic distribution, preset significance level and preset correction rules to obtain a target key predictor set;
constructing a runoff forecasting model based on the target key forecasting factor set and the runoff data;
the step of respectively calculating second-order approximations of the conditional mutual information between different predictor data and the runoff data comprises:
the second-order approximation of the conditional mutual information between the predictor data and the runoff data is determined by the following formula: [formula not reproduced]
wherein the quantities in the formula are: the set of key predictor indices already screened; the predictor data corresponding to the set of key predictor indices; the predictor data to be screened, taken from the complement of that set; the runoff data; the second-order approximation of the conditional mutual information between the predictor data to be screened and the runoff data, with the indices ranging over the predictors; the second-order mutual information that accounts for the interaction between two of these variables; and the first-order mutual information between two of these variables;
the mutual information quantities in the above formula are determined by the following formula: [formula not reproduced]
wherein the first frequency estimated density function value is that between one predictor and another predictor, the second frequency estimated density function value is that between a predictor and the runoff, the third frequency estimated density function value is that among a predictor, another predictor and the runoff, and the remaining term is the frequency estimated density function value of a single predictor;
the step of respectively calculating second-order approximations of the conditional mutual information between different predictor data and the runoff data further comprises:
Calculating a first frequency estimated density function value corresponding to each piece of forecasting factor data and the runoff data;
calculating a second frequency estimated density function value corresponding to each piece of predictor data and other pieces of predictor data respectively;
calculating third frequency estimated density function values among each predictor data, other predictor data and the runoff data, wherein the other predictors are predictors except the current predictor in the plurality of predictors;
and determining second-order approximation values of the conditional mutual information between the different predictor data and the runoff data based on the first frequency estimated density function, the second frequency estimated density function and the third frequency estimated density function which are respectively corresponding to the different predictors.
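The steps above can be illustrated with a minimal Python sketch. It assumes equal-width binning for the frequency estimates and, because the formula itself is not reproduced in this text, it substitutes a common low-order approximation (the average of I(candidate; runoff | s) over the already screened predictors s). The function names `discretize`, `freq`, `mutual_info`, `cond_mutual_info` and `second_order_cmi` are hypothetical and are not taken from the patent.

```python
from collections import Counter
from math import log

def discretize(values, bins=4):
    """Map each value to an equal-width bin index (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant series: put everything in bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def freq(*columns):
    """Frequency estimate of the joint probability of the given discrete columns."""
    n = len(columns[0])
    return {key: count / n for key, count in Counter(zip(*columns)).items()}

def mutual_info(x, y):
    """First-order mutual information I(X;Y) from frequency estimates."""
    px, py, pxy = freq(x), freq(y), freq(x, y)
    return sum(p * log(p / (px[(a,)] * py[(b,)])) for (a, b), p in pxy.items())

def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = sum p(x,y,z) * log[p(z) p(x,y,z) / (p(x,z) p(y,z))]."""
    pz, pxz, pyz, pxyz = freq(z), freq(x, z), freq(y, z), freq(x, y, z)
    return sum(p * log(pz[(c,)] * p / (pxz[(a, c)] * pyz[(b, c)]))
               for (a, b, c), p in pxyz.items())

def second_order_cmi(candidate, runoff, selected):
    """Assumed approximation, not the patent's formula: average the pairwise
    conditional terms over the selected predictors; with nothing selected yet,
    fall back to the first-order mutual information."""
    if not selected:
        return mutual_info(candidate, runoff)
    return sum(cond_mutual_info(candidate, runoff, s) for s in selected) / len(selected)
```

Continuous predictor and runoff series would first be passed through `discretize`; every estimate then only needs the pairwise and three-way frequency tables, which is what keeps a second-order approximation tractable when many predictors have already been screened.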
2. The method according to claim 1, wherein the step of screening the key predictors included in the different predictors to obtain the target set of key predictors based on the hypothesis test corresponding to each predictor, the test statistic information of the hypothesis test corresponding to each predictor, the preset test statistic distribution, the preset significance level, and the preset correction rule, comprises:
determining a first distribution threshold based on a preset correction rule, a preset significance level, and a number of different predictors;
comparing the test statistic information corresponding to each predictor with the first distribution threshold value to obtain comparison results corresponding to different predictors;
determining at least one key predictor in a plurality of different predictors based on the comparison results corresponding to the predictors, and incorporating the key predictor into a key predictor set;
determining remaining predictors based on the key predictors;
determining a second distribution threshold based on a preset correction rule, a preset significance level, and a remaining number of predictors;
comparing the test statistic information of each remaining predictor with the second distribution threshold to obtain comparison results respectively corresponding to the different remaining predictors;
determining at least one key predictor in the remaining predictors based on the comparison results corresponding to the remaining predictors respectively, and incorporating the key predictor into the key predictor set;
and returning to the step of determining the remaining predictors until no key predictor exists in the remaining predictors, thereby obtaining the target key predictor set.
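The iterative screening described in this claim can be sketched as follows. The sketch assumes a Bonferroni-style correction rule and a standard normal test statistic distribution, neither of which is fixed by the claim; `screen_predictors` and the `statistic` callback are hypothetical names introduced for illustration.

```python
from statistics import NormalDist

def screen_predictors(statistic, names, alpha=0.05):
    """Stepwise screening.

    statistic(name, selected) returns the test statistic of one predictor
    given the key predictors selected so far. Returns the key predictor names
    in the order in which they were accepted.
    """
    selected, remaining = [], list(names)
    while remaining:
        # Assumed correction rule: divide alpha by the number of predictors
        # still under test, then take the upper quantile of the distribution.
        threshold = NormalDist().inv_cdf(1 - alpha / len(remaining))
        passed = [n for n in remaining if statistic(n, selected) > threshold]
        if not passed:
            break  # no key predictor left among the remaining predictors
        selected.extend(passed)
        remaining = [n for n in remaining if n not in passed]
    return selected

# Toy statistics that ignore the selected set, purely for illustration.
toy = {"rain": 5.0, "temp": 3.0, "snow": 2.0, "wind": 0.3}
key_set = screen_predictors(lambda name, selected: toy[name], list(toy))
```

In this toy run the threshold loosens as the number of remaining predictors falls, so `snow` fails the first round (four predictors under test) but is accepted in the second round (two predictors under test), while `wind` is never accepted.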
3. The method of claim 1, wherein determining the first distribution threshold based on the preset correction rule, the preset significance level, and the number of the plurality of different predictors comprises:
determining a target significance level based on a preset correction rule, a preset significance level, and a number of different predictors;
determining a first distribution threshold based on the target significance level and the preset test statistic distribution.
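As an illustration of the two steps in this claim, the sketch below derives a target significance level from a correction rule and then reads the threshold off a distribution. The Bonferroni and Šidák rules and the standard normal distribution are assumptions chosen for the example; the claim leaves the preset correction rule and the preset test statistic distribution open. `target_level` and `distribution_threshold` are hypothetical names.

```python
from statistics import NormalDist

def target_level(alpha, m, rule="bonferroni"):
    """Corrected per-test significance level for m simultaneous tests."""
    if rule == "bonferroni":
        return alpha / m
    if rule == "sidak":
        return 1 - (1 - alpha) ** (1 / m)
    raise ValueError(f"unknown correction rule: {rule}")

def distribution_threshold(alpha, m, rule="bonferroni", dist=NormalDist()):
    """Upper quantile of the assumed test statistic distribution at the target level."""
    return dist.inv_cdf(1 - target_level(alpha, m, rule))

# With alpha = 0.05 and 10 predictors, Bonferroni gives a target level of 0.005.
level_10 = target_level(0.05, 10)
threshold_10 = distribution_threshold(0.05, 10)
```

The threshold grows with the number of predictors under test, which is why a threshold recomputed for a smaller set of remaining predictors is less strict than the first one.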
4. The method of claim 1, wherein the step of constructing a runoff forecasting model based on the set of target key forecasting factors and the runoff data comprises:
associating each piece of key predictor data in the target key predictor set with the runoff data to obtain an associated data set;
training the preset model based on the associated data set until the accuracy requirement of the preset model is met, and obtaining the runoff forecasting model.
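A minimal sketch of these two steps follows, assuming a linear model trained by gradient descent as the "preset model" and a root-mean-square error bound as the "accuracy requirement". Both are stand-ins chosen for illustration, since the claim does not name a model or a metric; `associate` and `train_until_accurate` are hypothetical names.

```python
def associate(predictors, runoff):
    """Pair the key predictor values of each time step with that step's runoff."""
    names = sorted(predictors)
    return [([predictors[n][t] for n in names], runoff[t]) for t in range(len(runoff))]

def train_until_accurate(dataset, rmse_target=0.05, lr=0.1, max_epochs=20000):
    """Fit y = w.x + b by batch gradient descent until RMSE <= rmse_target.

    Returns (w, b, rmse), where rmse is the value measured at the start of the
    final epoch. Training also stops after max_epochs as a safety limit.
    """
    n_features = len(dataset[0][0])
    w, b, rmse = [0.0] * n_features, 0.0, float("inf")
    for _ in range(max_epochs):
        errors = [sum(wi * xi for wi, xi in zip(w, x)) + b - y for x, y in dataset]
        rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
        if rmse <= rmse_target:
            break  # assumed accuracy requirement is met
        for j in range(n_features):
            grad = sum(e * x[j] for e, (x, _) in zip(errors, dataset)) / len(dataset)
            w[j] -= lr * grad
        b -= lr * sum(errors) / len(errors)
    return w, b, rmse

# Synthetic example: runoff = 2 * rain + 1 * temp + 0.5 (made-up data).
key_predictors = {"rain": [0.0, 0.5, 1.0, 1.5, 2.0], "temp": [1.0, 0.0, 1.0, 0.0, 1.0]}
runoff = [2 * r + t + 0.5 for r, t in zip(key_predictors["rain"], key_predictors["temp"])]
weights, bias, final_rmse = train_until_accurate(associate(key_predictors, runoff), rmse_target=0.01)
```

Any regression model could replace the linear fit here; the point of the sketch is only the loop structure, in which training continues until the chosen accuracy measure satisfies its bound.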
5. The method according to claim 2, wherein the step of determining at least one key predictor among a plurality of different predictors based on the comparison result for each predictor and incorporating the key predictor into the set of key predictors comprises:
sequentially judging whether the test statistic information corresponding to each of the different predictors is greater than the first distribution threshold;
and if the test statistic information of a predictor is greater than the first distribution threshold, taking that predictor as a key predictor and incorporating it into the key predictor set.
6. A method of runoff forecasting, the method comprising:
acquiring different key predictor data of a target;
inputting the different key predictor data of the target into a pre-constructed runoff forecasting model, so that the runoff forecasting model outputs corresponding runoff change information, wherein the runoff forecasting model is constructed by the runoff forecasting model construction method based on predictor screening according to any one of claims 1 to 5.
7. A runoff forecasting model construction device based on forecasting factor screening, characterized in that the device comprises:
the first acquisition module is used for acquiring a plurality of different predictor data and runoff data;
the calculation module is used for respectively calculating second-order approximation values of the conditional mutual information between different predictor data and the runoff data;
the second-order approximation of the conditional mutual information between the predictor data and the runoff data is determined by the following formula:
wherein the symbols of the formula denote, in order: the set of indices of the key predictors already screened; any predictor data in that set of key predictor indices; the complement of that set; the predictor data to be screened, taken from the complement; the runoff data; and the second-order approximation of the conditional mutual information between the predictor data to be screened and the runoff data; the formula combines a second-order mutual information term, which takes the mutual information of a pair of these variables into account, with a first-order mutual information term between a pair of them;
the mutual information term is determined by the following formula:
wherein the formula uses a first frequency estimated density function value between one predictor and another predictor, a second frequency estimated density function value between a predictor and the runoff, a third frequency estimated density function value among a predictor, another predictor and the runoff, and the frequency estimated density function value of a single predictor;
the first determining module is used for establishing hypothesis tests respectively corresponding to different predictors based on second-order approximation values of the conditional mutual information between different predictor data and the runoff data, and determining test statistics of the hypothesis tests corresponding to each predictor;
the screening module is used for screening the key predictors contained in different predictors based on hypothesis tests corresponding to the predictors, test statistic information of the hypothesis tests corresponding to the predictors, preset test statistic distribution, preset significance level and preset correction rules to obtain a target key predictor set;
the construction module is used for constructing a runoff forecasting model based on the target key forecasting factor set and the runoff data;
the computing module includes:
a first calculation sub-module for calculating a first frequency estimated density function value corresponding between each predictor data and the runoff data;
the second calculation sub-module is used for calculating a second frequency estimated density function value corresponding to each piece of predictor data and other pieces of predictor data respectively;
a third calculation sub-module for calculating a third frequency estimated density function value among each predictor data, other predictor data, and the runoff data, the other predictors being used for characterizing predictors other than the current predictor among a plurality of different predictors;
and the seventh determining submodule is used for determining second-order approximation values of the conditional mutual information between the different predictor data and the runoff data based on the first frequency estimated density function, the second frequency estimated density function and the third frequency estimated density function which are respectively corresponding to the different predictors.
8. The apparatus of claim 7, wherein the screening module comprises:
a first determining sub-module for determining a first distribution threshold based on a preset correction rule, a preset significance level, and a number of different predictors;
the first comparison sub-module is used for comparing the test statistic information corresponding to each predictor with the first distribution threshold value to obtain comparison results corresponding to different predictors;
the second determining submodule is used for determining at least one key predictor in a plurality of different predictors based on the comparison results corresponding to the predictors and incorporating the key predictor into a key predictor set;
a third determining sub-module for determining remaining predictors based on the key predictors;
a fourth determining sub-module for determining a second distribution threshold based on a preset correction rule, a preset significance level, and a number of remaining predictors;
the second comparison sub-module is used for comparing the test statistic information of each remaining predictor with the second distribution threshold to obtain comparison results respectively corresponding to the different remaining predictors;
a fifth determining submodule, configured to determine at least one key predictor from the remaining predictors based on the comparison results corresponding to the remaining predictors, and incorporate the key predictor into the key predictor set;
and a sixth determining submodule, configured to return to the step of determining the remaining predictors until no key predictor exists in the remaining predictors, thereby obtaining the target key predictor set.
9. The apparatus of claim 8, wherein the first determination submodule comprises:
a first determining unit configured to determine a target significance level based on a preset correction rule, a preset significance level, and a number of different predictors;
and the second determining unit is used for determining a first distribution threshold based on the target significance level and the preset test statistic distribution.
10. The apparatus of claim 7, wherein the build module comprises:
the association sub-module is used for associating each key predictor data in the target key predictor set with the runoff data to obtain an association data set;
and the training sub-module is used for training the preset model based on the associated data set until the accuracy requirement of the preset model is met, so as to obtain the runoff forecasting model.
11. A runoff forecasting device, the device comprising:
the second acquisition module is used for acquiring different key predictor data of the target;
the second determining module is used for inputting the different key predictor data of the target into a pre-constructed runoff forecasting model, so that the runoff forecasting model outputs corresponding runoff change information, wherein the runoff forecasting model is constructed by the runoff forecasting model construction method based on predictor screening according to any one of claims 1 to 5.
12. A computer device, comprising:
a memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the method of constructing a runoff forecasting model based on forecasting factor screening of any one of claims 1 to 5, or to perform the method of forecasting runoff of claim 6.
13. A computer-readable storage medium having stored thereon computer instructions for causing a computer to execute the method for constructing a runoff forecasting model based on forecasting factor screening according to any one of claims 1 to 5.
CN202311371134.XA 2023-10-23 2023-10-23 Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening Active CN117132176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311371134.XA CN117132176B (en) 2023-10-23 2023-10-23 Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening

Publications (2)

Publication Number Publication Date
CN117132176A CN117132176A (en) 2023-11-28
CN117132176B CN117132176B (en) 2024-01-26

Family

ID=88852949

Country Status (1)

Country Link
CN (1) CN117132176B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463993A (en) * 2017-08-04 2017-12-12 贺志尧 Medium-and Long-Term Runoff Forecasting method based on mutual information core principle component analysis Elman networks
CN110245773A (en) * 2019-03-26 2019-09-17 国家气象中心 A kind of method that multi-source fact space-time predictor extracted and be included in interpretation of scheme application
CN113255986A (en) * 2021-05-20 2021-08-13 大连理工大学 Multi-step daily runoff forecasting method based on meteorological information and deep learning algorithm

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230289381A1 (en) * 2020-02-21 2023-09-14 Brian MCCARSON Deriving leading indicators of economic activity using predictive analytics applied to agricultural, mining, construction, and environmental attributes to predict ecological trends and economic outcomes

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
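Predictor selection for neural network runoff forecasting models based on Copula entropy; 陈璐; 叶磊; 卢韦伟; 周建中; 郭生练; 肖舸; 陈健国; Journal of Hydroelectric Engineering (06); full text *
Soft-sensor variable selection based on mutual information; 杨慧中; 章军; 陶洪峰; Control Engineering of China (04); full text *
Medium- and long-term runoff forecasting for the Yangtze River with factors selected by the partial mutual information method; 麦紫君; 曾小凡; 周建中; 叶磊; 何奇芳; Yangtze River (03); full text *
Study on selecting hydrological predictor sets based on joint mutual information; 纪昌明; 俞洪杰; 阎晓冉; 李荣波; 王丽萍; Journal of Hydroelectric Engineering (08); full text *
Forecasting small hydropower generation capacity in data-scarce regions coupled with partial mutual information; 刘本希; 廖胜利; 冯仲恺; 程春田; 李秀峰; 蔡华祥; Automation of Electric Power Systems (19); full text *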

Also Published As

Publication number Publication date
CN117132176A (en) 2023-11-28

Similar Documents

Publication Publication Date Title
CN111950738B (en) Machine learning model optimization effect evaluation method, device, terminal and storage medium
CN111797320A (en) Data processing method, device, equipment and storage medium
CN117078048A (en) Digital twinning-based intelligent city resource management method and system
CN112801434A (en) Method, device, equipment and storage medium for monitoring performance index health degree
CN112632179A (en) Model construction method and device, storage medium and equipment
CN112131274A (en) Method, device and equipment for detecting time series abnormal points and readable storage medium
CN107315671B (en) Application state monitoring method, device and equipment
CN112365156B (en) Data processing method, data processing device, terminal and storage medium
CN114330090A (en) Defect detection method and device, computer equipment and storage medium
CN117132176B (en) Runoff forecasting model construction and runoff forecasting method based on forecasting factor screening
CN117235608A (en) Risk detection method, risk detection device, electronic equipment and storage medium
CN116340883A (en) Power distribution network data resource fusion method, device, equipment and storage medium
CN117132177B (en) Runoff forecasting model construction and runoff forecasting method based on multiple hypothesis test
CN115481694B (en) Data enhancement method, device and equipment for training sample set and storage medium
CN116822366A (en) Construction of runoff pollution load calculation model and runoff pollution load calculation method
CN117114523B (en) Runoff forecasting model construction and runoff forecasting method based on condition mutual information
CN116011677A (en) Time sequence data prediction method and device, electronic equipment and storage medium
CN112528500B (en) Evaluation method and evaluation equipment for scene graph construction model
CN115146997A (en) Evaluation method and device based on power data, electronic equipment and storage medium
CN114679335A (en) Network security risk assessment training and assessment method and equipment for power monitoring system
CN111861798A (en) Residential electricity data missing value interpolation method based on neighbor algorithm
CN116843203B (en) Service access processing method, device, equipment, medium and product
CN112799913B (en) Method and device for detecting abnormal operation of container
CN116432776A (en) Training method, device, equipment and storage medium of target model
CN117195104A (en) Resource classification method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant