CN112258251A - Grey correlation-based integrated learning prediction method and system for electric vehicle battery replacement demand - Google Patents

Info

Publication number
CN112258251A
CN112258251A CN202011294838.8A CN202011294838A CN112258251A CN 112258251 A CN112258251 A CN 112258251A CN 202011294838 A CN202011294838 A CN 202011294838A CN 112258251 A CN112258251 A CN 112258251A
Authority
CN
China
Prior art keywords
training
training set
sample
day
base
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011294838.8A
Other languages
Chinese (zh)
Other versions
CN112258251B (en)
Inventor
张玉利
于浩洁
梁熙栋
张倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202011294838.8A
Publication of CN112258251A
Application granted
Publication of CN112258251B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/60 Other road transportation technologies with climate change mitigation effect
    • Y02T10/70 Energy storage systems for electromobility, e.g. batteries
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/60 Other road transportation technologies with climate change mitigation effect
    • Y02T10/7072 Electromobility specific charging systems or methods for batteries, ultracapacitors, supercapacitors or double-layer capacitors

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an integrated learning prediction method and system for an electric vehicle battery replacement requirement based on grey correlation, which comprises the following steps: constructing a data set and preprocessing the data set, and dividing the preprocessed data set into a training set and a testing set; selecting k base learners, and training and predicting samples of a training set by each base learner in a cross validation mode; for each input sample in the test set, selecting the best similar day training set through grey correlation analysis; establishing a prediction deviation minimization optimization model according to the prediction result of each base learner in the training set of the optimal similar days, and adopting an L1 norm with a weight coefficient as a regular term; and solving the obtained weight coefficient of each base learner based on the optimization model to obtain an integrated predictor, and obtaining an integrated learning prediction result based on the integrated predictor. The method can effectively reduce the prediction deviation, has better prediction effect on data with high random fluctuation, and can be more suitable for the data set acquired in practice.

Description

Grey correlation-based integrated learning prediction method and system for electric vehicle battery replacement demand
Technical Field
The invention relates to the technical field of machine learning, in particular to an integrated learning prediction method and system for an electric vehicle battery replacement requirement based on grey correlation.
Background
The automobile is an everyday mode of travel, but traditional fuel vehicles cause serious environmental problems, such as air and water pollution and global warming. Electric vehicles reduce the use of traditional fossil energy and thereby the emission of pollutants, which helps protect the environment. The battery replacement mode of electric vehicles shortens charging time and improves user convenience. For example, in 2017 Beijing Automotive Group Co., Ltd. (BAIC) announced its "Optimus Prime Plan", which aims to promote the integrated development of new energy and electric vehicles through a battery replacement model. Under this plan, BAIC intends to build 3,000 photovoltaic-storage battery replacement stations by the end of 2022.
Although the battery replacement mode has many advantages over the charging mode, it is currently far less widespread. The main reason is that infrastructure such as battery replacement stations is still underdeveloped in China, so electric vehicle users often cannot find a station in time to replace their batteries. In addition, poor management and operation of batteries at replacement stations has become a barrier to the development of the battery replacement mode. Because station operators lack knowledge of how customer numbers or battery demand will change in the near future, batteries are often in short supply or queued for charging, so vehicle batteries cannot be replaced in time, which greatly lowers user satisfaction, especially for time-sensitive users.
To improve the service level and the battery charging efficiency, the operator of a battery replacement station needs to predict the battery replacement demand of electric vehicles accurately. There are three main prediction methods: Monte Carlo-based simulation analysis, time series analysis, and machine learning. Machine learning achieves high prediction accuracy and is widely applied in various fields.
A single type of machine learning method (commonly referred to as a base learner) carries a bias inherent in its design, and its prediction accuracy is low on data sets to which it is not suited. To overcome the shortcomings of a single predictor, integrated (ensemble) prediction methods have gradually emerged, in which an integrated predictor is constructed by combining several base learners to improve prediction accuracy. Current integrated prediction methods include voting, bagging, boosting, stacking, and the like. Data sets obtained in practice, such as electric vehicle battery replacement demand, are not uniformly distributed and show large volatility and strong uncertainty.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an integrated learning prediction method and system for the battery replacement requirement of the electric vehicle based on grey correlation, which adopt an integrated prediction method similar to a stacking method to improve the generalization of a model and the accuracy of a prediction result.
The invention discloses an integrated learning prediction method for an electric vehicle battery replacement requirement based on grey correlation, which comprises the following steps of:
constructing a data set and preprocessing the data set, and dividing the preprocessed data set into a training set and a testing set;
selecting k base learners, and training and predicting samples of a training set by each base learner in a cross validation mode;
for each input sample in the test set, selecting the best similar day training set through grey correlation analysis;
establishing a prediction deviation minimization optimization model according to the prediction result of each base learner in the training set of the optimal similar days, and adopting an L1 norm with a weight coefficient as a regular term;
solving the obtained weight coefficient of each base learner based on the optimization model to obtain an integrated predictor, and obtaining an integrated learning prediction result based on the integrated predictor; wherein the output of the integrated predictor is a linear weighted combination of the outputs of the basis learners.
As a further refinement of the invention, the data set is $T = \{(x_1, Y_1), \ldots, (x_n, Y_n)\}$, with
$x_i = (x_i^{(1)}, \ldots, x_i^{(m)})$,
where $T$ is the data set; $x_i$ is the $i$-th sample, $i = 1, \ldots, n$, and $n$ is the number of samples; $x_i^{(j)}$ is the $j$-th feature of sample $i$, $j = 1, \ldots, m$, and $m$ is the number of features; $Y_i$ is the label of sample $i$, namely the electric vehicle battery replacement demand, $i = 1, \ldots, n$.
As a further refinement of the invention, the features include:
$x^{(1)}$: day of the week, coded from 1 to 7;
$x^{(2)}$: whether the day falls on a weekend, 1 if so and 0 otherwise;
$x^{(3)}$: weather, divided into sunny, cloudy, and rainy or snowy, coded as 1, 2 and 3 respectively;
$x^{(4)}$: the highest air temperature of the day;
$x^{(5)}$: the lowest air temperature of the day;
$x^{(6)}$: the battery replacement demand of all electric vehicles on the same day of the previous week;
$x^{(7)}$: the battery replacement demand of all electric vehicles on the day before the predicted day;
$x^{(8)}$: the driving mileage of all electric vehicles on the day before the predicted day;
$x^{(9)}$ to $x^{(13)}$: the proportions of all electric vehicles whose remaining battery capacity (SOC) at the end of driving on the day before the predicted day falls in the intervals [0, 20%], [20%, 40%], [40%, 60%], [60%, 80%] and [80%, 100%], respectively.
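The feature list above can be sketched as a small encoding routine. This is only an illustration under stated assumptions: the function and argument names, the Monday = 1 weekday convention, the weather category strings and the half-open SOC bands are not given in the source, and the sample values are made up.

```python
from datetime import date

# Assumed category names; the source only gives the codes 1, 2 and 3.
WEATHER_CODES = {"sunny": 1, "cloudy": 2, "rain_or_snow": 3}

def soc_shares(soc_values):
    """Share of vehicles whose end-of-day SOC falls in each 20% band.

    Bands are treated as half-open, [0, 0.2), ..., [0.8, 1.0]. This is an
    assumption: the text lists closed intervals that overlap at the edges.
    """
    counts = [0] * 5
    for soc in soc_values:
        band = min(int(soc * 5), 4)  # soc == 1.0 goes in the last band
        counts[band] += 1
    total = len(soc_values)
    return [c / total for c in counts]

def build_features(day, weather, t_max, t_min,
                   demand_last_week, demand_prev_day,
                   mileage_prev_day, soc_prev_day):
    """Return the 13-element feature vector x(1)..x(13) for one day."""
    weekday = day.isoweekday()            # assumed coding: 1 = Monday ... 7 = Sunday
    is_weekend = 1 if weekday >= 6 else 0
    return [weekday, is_weekend, WEATHER_CODES[weather], t_max, t_min,
            demand_last_week, demand_prev_day, mileage_prev_day,
            *soc_shares(soc_prev_day)]

# Made-up record, for illustration only.
x = build_features(date(2020, 11, 14), "cloudy", 12.0, 3.0,
                   410, 395, 52000.0, [0.1, 0.25, 0.5, 0.9, 1.0])
```

The label $Y$ (the day's battery replacement demand) is kept separately from this vector.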
As a further development of the invention, the preprocessing of the data set comprises: standardizing the data, and then reducing dimensionality by PCA (principal component analysis);
the training set comprises a training subset and a validation set, wherein the training subset is 70% of the total data volume, and the validation set and the test set are each 15% of the total data volume;
the base learners include K-nearest neighbors (KNN), support vector regression (SVR), gradient boosting regression trees (GBRT), random forests (RF), and ridge regression (RR).
As a further improvement of the present invention, the training and predicting samples of the training set by each base learner in a cross-validation manner includes:
six-fold cross-validation is adopted;
the training set is divided evenly into 6 parts, namely T1, T2, T3, T4, T5 and T6; 5 parts are used as a training subset to train a base learner, and the remaining part is used as a validation set on which the base learner predicts;
after repeated training and prediction, the prediction result of each base learner on the training set is obtained; wherein $f_r(\cdot)$ is the $r$-th base learner, $r = 1, 2, 3, 4, 5$, and the predictions $f_r(x_i)$ of the 5 base learners on the training set are obtained through cross-validation;
all base learners are then trained on the entire training subset.
As a further refinement of the present invention, grey correlation analysis is used to select, from the training set, the $R$ days $(i_1, i_2, \ldots, i_R)$ most relevant to the $i$-th sample as its similar-day training set $T_i$:
Figure BDA0002785076530000031
In the formula,
Figure BDA0002785076530000032
is the prediction result of the $k$-th base learner on the $R$-th similar day of sample $i$.
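One way to picture the similar-day training set is as a table with one row per selected day, holding the $k$ base-learner predictions for that day followed by the day's actual demand. The sketch below assumes this row layout and a hypothetical helper name `similar_day_set`; the exact layout in the formula image is not recoverable from the text, and the numbers are made up.

```python
def similar_day_set(selected, oof_preds, y):
    """Build T_i from the indices of the selected similar days.

    selected  : indices (i_1, ..., i_R) of the similar days in the training set
    oof_preds : k lists, one per base learner, of cross-validated predictions
    y         : actual demand for every training day
    """
    return [(*[preds[d] for preds in oof_preds], y[d]) for d in selected]

# Two base learners, four training days (made-up numbers).
oof_preds = [[10, 12, 9, 11],
             [11, 13, 8, 10]]
y = [10, 14, 9, 12]
T_i = similar_day_set([3, 1], oof_preds, y)
```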
As a further improvement of the present invention, the grey correlation analysis comprises:
first, the grey correlation coefficient $\xi_{0i}$ is calculated, and then the grey correlation degree $\gamma_{0i}$; for an input test set sample $x_0$ and a sample $x_i$ in the training set, the grey correlation coefficient is calculated by the formula:
Figure BDA0002785076530000033
where $\xi_{0i}(c)$ is the grey correlation coefficient between the test set sample $x_0$ and the training set sample $x_i$ on the $c$-th feature,
Figure BDA0002785076530000041
Figure BDA0002785076530000042
after the grey correlation coefficients between the input test set sample and all training set samples have been calculated for each feature, the grey correlation degree between the input test set sample and each training set sample is calculated by the formula
Figure BDA0002785076530000043
that is, the average of the grey correlation coefficients over the features is taken; the larger the grey correlation degree, the higher the correlation; the $R$ most relevant samples are then selected from the training set as the similar-day training set according to the calculated grey correlation degrees.
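Since the formula images are not reproduced in this text, the following is offered only as the commonly used form of grey relational analysis, which may differ in detail from the patented formulas. The distinguishing coefficient $\rho$ (often set to 0.5) and the symbol $\Delta_{0i}(c)$ are assumptions that do not appear in the surrounding text.

```latex
\Delta_{0i}(c) = \bigl| x_0^{(c)} - x_i^{(c)} \bigr|, \qquad
\xi_{0i}(c) = \frac{\min_i \min_c \Delta_{0i}(c) + \rho \, \max_i \max_c \Delta_{0i}(c)}
                   {\Delta_{0i}(c) + \rho \, \max_i \max_c \Delta_{0i}(c)}, \qquad
\gamma_{0i} = \frac{1}{m} \sum_{c=1}^{m} \xi_{0i}(c)
```

Under this form, a training sample identical to $x_0$ on every feature has $\xi_{0i}(c) = 1$ for all $c$ whenever the smallest deviation is zero, and therefore $\gamma_{0i} = 1$, the largest possible degree.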
As a further improvement of the invention, the optimization model is as follows:
Figure BDA0002785076530000044
this model is equivalent to the following linear program:
Figure BDA0002785076530000045
s.t.α≥0
Figure BDA0002785076530000046
Figure BDA0002785076530000047
Figure BDA0002785076530000048
in the formula, $E[\cdot]$ denotes the expectation of the expression in brackets, which in the discrete case is the average; $Y$ is a random variable, namely the actual value of a sample in the training set, and $Y_i$ is the actual value of the $i$-th sample; $f = (f_1, \ldots, f_k)$ are the predicted values of the $k$ base learners on the sample corresponding to $Y$; $\|\alpha\|_1 = |\alpha_1| + \ldots + |\alpha_k|$; $w$ is the weight coefficient; $k$ is the number of base learners; $z$ and $v_i$ are introduced intermediate variables. Solving the optimization model yields the decision variable $\alpha = (\alpha_1, \ldots, \alpha_k)$.
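The formula images are not available here, so the following is a plausible reading rather than the patented model. It assumes the prediction deviation is measured by the absolute error, which is what makes a linear-programming reformulation with the variables $v_i$ and $z$ possible, and that the expectation is taken as the average over the $R$ samples of the similar-day training set.

```latex
\min_{\alpha \ge 0} \; E\bigl[\, | Y - \alpha^{\top} f | \,\bigr] + w \, \|\alpha\|_1
\quad\Longleftrightarrow\quad
\begin{aligned}
\min_{\alpha, v, z} \quad & \frac{1}{R} \sum_{i=1}^{R} v_i + w z \\
\text{s.t.} \quad & \alpha \ge 0, \\
& v_i \ge Y_i - \sum_{r=1}^{k} \alpha_r f_r(x_i), \quad i = 1, \ldots, R, \\
& v_i \ge \sum_{r=1}^{k} \alpha_r f_r(x_i) - Y_i, \quad i = 1, \ldots, R, \\
& z \ge \sum_{r=1}^{k} \alpha_r
\end{aligned}
```

At an optimum each $v_i$ equals the absolute error on sample $i$ and, because $\alpha \ge 0$, $z$ equals $\|\alpha\|_1$, so both problems attain the same value.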
As a further improvement of the present invention, for a given test set sample $x_i$, the final prediction result is $F(x_i) = \alpha_1 f_1(x_i) + \ldots + \alpha_k f_k(x_i)$.
The invention also discloses a prediction system for implementing the integrated learning prediction method, which comprises:
the construction module is used for constructing a data set, preprocessing the data set and dividing the preprocessed data set into a training set and a test set;
the training module is used for selecting k base learners, and each base learner is used for training and predicting samples of a training set in a cross validation mode;
the analysis module is used for selecting the best similar day training set of each input sample in the test set through grey correlation analysis;
the establishing module is used for establishing a prediction deviation minimization optimization model according to the prediction result of each base learner in the training set of the optimal similar day, with an L1 norm with a weight coefficient used as the regular term;
the prediction module is used for solving the obtained weight coefficient of each base learner based on the optimization model to obtain an integrated predictor and obtaining an integrated learning prediction result based on the integrated predictor; wherein the output of the integrated predictor is a linear weighted combination of the outputs of the basis learners.
Compared with the prior art, the invention has the beneficial effects that:
the invention considers the prediction accuracy and generalization of the integrated predictor, can effectively reduce the prediction deviation compared with the base learner with the best prediction effect, has better prediction effect on data with high random fluctuation, can better adapt to the data set acquired in practice, can be applied to other data sets, and has stronger practicability.
Drawings
Fig. 1 is a flowchart of an integrated learning prediction method for an electric vehicle battery replacement demand based on gray correlation according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
The invention provides an integrated learning prediction method and system for an electric vehicle battery replacement demand based on grey correlation, belonging to the technical field of machine learning; the method involves a two-layered structure, namely a plurality of base learners and an integrated predictor, which is a weighted combination of the plurality of base learners. In order to improve the prediction accuracy, the optimal similar day training set of each input prediction sample is selected based on gray correlation analysis, and then the most relevant training set is provided for solving the weight of the integrated predictor so as to improve the prediction accuracy of the model. In order to improve the generalization of the integrated predictor, the invention establishes an optimization model with a weighted L1 norm regularization term, the optimization model is equivalent to a linear programming problem, and the weight coefficients are solved through the optimization model. The invention considers the prediction accuracy and generalization of the integrated predictor, can effectively reduce the prediction deviation compared with the base learner with the best prediction effect, has better prediction effect on data with high random fluctuation, can better adapt to the data set acquired in practice, can be applied to other data sets, and has stronger practicability.
The invention is described in further detail below with reference to the attached drawing figures:
as shown in fig. 1, the invention provides an integrated learning prediction method for an electric vehicle battery replacement demand based on gray correlation, which includes:
step 1, selecting features, constructing a data set, preprocessing the data set, and dividing the preprocessed data set into a training set and a test set; wherein,
for the collected electric vehicle record data, the features include:
$x^{(1)}$: day of the week, coded from 1 to 7;
$x^{(2)}$: whether the day falls on a weekend, 1 if so and 0 otherwise;
$x^{(3)}$: weather, divided into sunny, cloudy, and rainy or snowy, coded as 1, 2 and 3 respectively;
$x^{(4)}$: the highest air temperature of the day;
$x^{(5)}$: the lowest air temperature of the day;
$x^{(6)}$: the battery replacement demand of all electric vehicles on the same day of the previous week;
$x^{(7)}$: the battery replacement demand of all electric vehicles on the day before the predicted day;
$x^{(8)}$: the driving mileage of all electric vehicles on the day before the predicted day;
$x^{(9)}$ to $x^{(13)}$: the proportions of all electric vehicles whose remaining battery capacity (SOC) at the end of driving on the day before the predicted day falls in the intervals [0, 20%], [20%, 40%], [40%, 60%], [60%, 80%] and [80%, 100%], respectively;
$Y$ is the battery replacement demand of all electric vehicles on the corresponding date, namely the label of the data set. Assuming there are $n$ samples, each with $m$ features, the final data set is $T = \{(x_1, Y_1), \ldots, (x_n, Y_n)\}$, with
$x_i = (x_i^{(1)}, \ldots, x_i^{(m)})$,
where $T$ is the data set; $x_i$ is the $i$-th sample, $i = 1, \ldots, n$, and $n$ is the number of samples; $x_i^{(j)}$ is the $j$-th feature of sample $i$, $j = 1, \ldots, m$, and $m$ is the number of features; $Y_i$ is the label of sample $i$, namely the electric vehicle battery replacement demand, $i = 1, \ldots, n$;
the preprocessing of the data set comprises: first, the formula
Figure BDA0002785076530000063
is used to standardize the data; PCA is then applied for dimensionality reduction, reducing the standardized data set from 13 features to 12.
The training set comprises a training subset and a validation set; the training subset is 70% of the total data volume, and the validation set and the test set are each 15% of the total data volume.
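The standardization and the 70/15/15 split can be sketched as follows. Several points are assumptions: the standardization formula is taken to be the z-score (the formula image is not available), the split is taken in row order, the statistics are fitted on the training subset only, and the function names are hypothetical. The PCA step is left out because it would need a linear algebra library.

```python
from statistics import mean, pstdev

def split_70_15_15(rows):
    """Split rows, in their given order, into 70% / 15% / remainder."""
    n = len(rows)
    n_train = n * 70 // 100
    n_val = n * 15 // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def fit_standardizer(train_rows):
    """Column means and population standard deviations of the training subset."""
    cols = list(zip(*train_rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return mus, sigmas

def standardize(rows, mus, sigmas):
    """Apply the assumed z-score (value - mean) / std to every column."""
    return [[(v - mu) / s for v, mu, s in zip(row, mus, sigmas)] for row in rows]

# Made-up data: 20 days, 2 features.
rows = [[float(i), 2.0 * i] for i in range(20)]
train, val, test = split_70_15_15(rows)
mus, sigmas = fit_standardizer(train)
train_std = standardize(train, mus, sigmas)
val_std = standardize(val, mus, sigmas)   # reuse the training statistics
```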
Further, to facilitate data extraction, dates (year-month-day) may be added to the dataset, but not entered as features into the model.
Step 2, selecting k base learners, and training and predicting samples of a training set by each base learner in a cross validation mode; wherein,
the number of base learners is 5, comprising K-nearest neighbors (KNN), support vector regression (SVR), gradient boosting regression trees (GBRT), random forests (RF) and ridge regression (RR);
six-fold cross-validation is adopted; letting $f_r(\cdot)$ be the $r$-th base learner ($r = 1, 2, 3, 4, 5$), the predictions $f_r(x_i)$ of the 5 base learners on the training set are obtained through cross-validation.
Specifically, the method comprises the following steps:
six-fold cross-validation is adopted; the training set is divided evenly into 6 parts, namely T1, T2, T3, T4, T5 and T6; 5 parts are used as a training subset to train a base learner, and the remaining part is used as a validation set on which the base learner predicts; after repeated training and prediction, the prediction result of each base learner on the training set is obtained and recorded; all base learners are then trained on the entire training subset.
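The six-fold procedure above amounts to producing, for every training sample, a prediction from a learner that did not see that sample. A minimal sketch follows, assuming contiguous folds and a learner object with `fit` and `predict` methods; `MeanLearner` is a toy stand-in introduced here for illustration and is not one of the five base learners.

```python
def out_of_fold_predictions(X, y, make_learner, n_folds=6):
    """Predict every training sample with a learner trained on the other folds."""
    n = len(X)
    bounds = [n * f // n_folds for f in range(n_folds + 1)]  # fold boundaries
    oof = [None] * n
    for f in range(n_folds):
        lo, hi = bounds[f], bounds[f + 1]
        learner = make_learner()
        learner.fit(X[:lo] + X[hi:], y[:lo] + y[hi:])   # train on 5 parts
        oof[lo:hi] = learner.predict(X[lo:hi])          # predict the held-out part
    return oof

class MeanLearner:
    """Toy stand-in for a base learner: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)

# Made-up data: 12 samples, so each of the 6 folds holds 2 samples.
X = [[float(i)] for i in range(12)]
y = [float(i) for i in range(12)]
oof = out_of_fold_predictions(X, y, MeanLearner)
```

Running this once per base learner gives the predictions $f_r(x_i)$ used later to fit the weights.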
Since the final integrated prediction method combines the base learners, increasing the prediction accuracy of the base learners necessarily increases the accuracy of the integrated prediction method. Therefore, during training and prediction, all base learner hyperparameters are kept at the default values of the sklearn package in Python.
Step 3, selecting the best similar day training set of each input sample in the test set through grey correlation analysis;
wherein,
grey correlation analysis is used to select, from the training set, the $R$ days $(i_1, i_2, \ldots, i_R)$ most relevant to the $i$-th sample as its similar-day training set $T_i$:
Figure BDA0002785076530000071
In the formula,
Figure BDA0002785076530000072
is the prediction result of the $k$-th base learner on the $R$-th similar day of sample $i$;
the grey correlation analysis includes:
first, the grey correlation coefficient $\xi_{0i}$ is calculated, and then the grey correlation degree $\gamma_{0i}$; for an input test set sample $x_0$ and a sample $x_i$ in the training set, the grey correlation coefficient is calculated by the formula:
Figure BDA0002785076530000081
where $\xi_{0i}(c)$ is the grey correlation coefficient between the test set sample $x_0$ and the training set sample $x_i$ on the $c$-th feature,
Figure BDA0002785076530000082
Figure BDA0002785076530000083
after the grey correlation coefficients between the input test set sample and all training set samples have been calculated for each feature, the grey correlation degree between the input test set sample and each training set sample is calculated by the formula
Figure BDA0002785076530000084
that is, the average of the grey correlation coefficients over the features is taken; the larger the grey correlation degree, the higher the correlation; the $R$ most relevant samples are then selected from the training set as the similar-day training set according to the calculated grey correlation degrees; where $R$ is typically 75% of the total number of training set samples.
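The selection of similar days can be sketched in a few lines. Because the formula images are not available, the coefficient below uses the commonly seen grey relational form with a distinguishing coefficient `rho` of 0.5; both that form and the function names are assumptions, and the numbers are made up.

```python
def grey_relational_degrees(x0, train_X, rho=0.5):
    """Grey correlation degree between x0 and every training sample.

    Assumed coefficient: (d_min + rho * d_max) / (d + rho * d_max), where d is
    the absolute feature difference and d_min, d_max are taken over all
    samples and features. The degree is the mean coefficient over features.
    """
    deltas = [[abs(a - b) for a, b in zip(x0, xi)] for xi in train_X]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:                      # every training sample equals x0
        return [1.0] * len(train_X)
    degrees = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees

def most_similar_days(x0, train_X, R, rho=0.5):
    """Indices of the R training samples with the highest degree."""
    degrees = grey_relational_degrees(x0, train_X, rho)
    order = sorted(range(len(train_X)), key=lambda i: degrees[i], reverse=True)
    return order[:R]

# Made-up standardized samples; R = 3 is 75% of the 4 training samples.
x0 = [0.0, 0.0]
train_X = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.0], [2.0, 2.0]]
degrees = grey_relational_degrees(x0, train_X)
similar = most_similar_days(x0, train_X, R=3)
```

The features should already be on a common scale (as after standardization) so that no single feature dominates the differences.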
Step 4, establishing a prediction deviation minimization optimization model according to the prediction result of each base learner in the training set of the optimal similar days, and adopting an L1 norm with a weight coefficient as a regular term;
wherein, the optimization model is as follows:
Figure BDA0002785076530000085
this model is equivalent to the following linear program:
Figure BDA0002785076530000086
s.t.α≥0
Figure BDA0002785076530000087
Figure BDA0002785076530000088
Figure BDA0002785076530000089
in the formula, $E[\cdot]$ denotes the expectation of the expression in brackets, which in the discrete case is the average; $Y$ is a random variable, namely the actual value of a sample in the training set, and $Y_i$ is the actual value of the $i$-th sample; $f = (f_1, \ldots, f_k)$ are the predicted values of the $k$ base learners on the sample corresponding to $Y$; $\|\alpha\|_1 = |\alpha_1| + \ldots + |\alpha_k|$; $w$ is the weight coefficient; $k$ is the number of base learners; $z$ and $v_i$ are introduced intermediate variables. Solving the optimization model yields the decision variable $\alpha = (\alpha_1, \ldots, \alpha_k)$.
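To make the roles of $v_i$ and $z$ concrete, the sketch below evaluates an assumed form of the objective, the mean absolute deviation plus $w \|\alpha\|_1$, and the value of the corresponding linear program at the tightest feasible $v$ and $z$ for a fixed $\alpha \ge 0$. The absolute-error form is an assumption made because the formula images are not available; the function names and numbers are illustrative, and the code checks the equivalence for one $\alpha$ rather than solving the program.

```python
def ensemble_objective(alpha, preds, y, w):
    """Assumed objective: mean |Y_i - sum_r alpha_r f_r(x_i)| + w * ||alpha||_1."""
    residuals = [yi - sum(a * p for a, p in zip(alpha, row))
                 for row, yi in zip(preds, y)]
    return (sum(abs(r) for r in residuals) / len(y)
            + w * sum(abs(a) for a in alpha))

def lp_value(alpha, preds, y, w):
    """Linear-program value at the tightest feasible v and z for alpha >= 0."""
    if any(a < 0 for a in alpha):
        raise ValueError("alpha must be non-negative")
    # Smallest v_i allowed by v_i >= residual_i and v_i >= -residual_i.
    v = [abs(yi - sum(a * p for a, p in zip(alpha, row)))
         for row, yi in zip(preds, y)]
    z = sum(alpha)                      # smallest z allowed by z >= sum(alpha)
    return sum(v) / len(v) + w * z

# Made-up similar-day rows: predictions of 2 base learners, then actual values.
preds = [(10.0, 12.0), (20.0, 18.0), (30.0, 33.0)]
y = [11.0, 19.0, 31.0]
alpha, w = [0.5, 0.5], 0.1
obj = ensemble_objective(alpha, preds, y, w)
lp = lp_value(alpha, preds, y, w)
```

In practice the program would be handed to a linear programming solver, which searches over $\alpha$, $v$ and $z$ jointly.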
The best values of w and R can be searched for on the training set by cross-validation: the training set is divided evenly into 6 parts, five of which are used for training while the remaining part is used as a validation set for prediction; each part yields one evaluation index, so 6 evaluation indexes are obtained, and their average is taken as the prediction performance of the parameter at a given value; by iterating over candidate values, the value with the best evaluation index is adopted as the parameter value for prediction. The evaluation indexes in this example are preferably the mean absolute error (MAE) and the symmetric mean absolute percentage error (SMAPE).
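The parameter search described above can be sketched as follows. The SMAPE variant shown, which divides by the mean of the absolute actual and forecast values, is one of several definitions in use and is an assumption here, as are the function names; `score_fold` stands for whatever routine fits the ensemble on five parts and scores it on the sixth, and the toy version below only imitates its shape.

```python
def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric MAPE in percent (undefined when a pair is both zero)."""
    terms = [abs(a - f) / ((abs(a) + abs(f)) / 2)
             for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

def pick_by_cross_validation(candidates, score_fold, n_folds=6):
    """Return the candidate whose error, averaged over the folds, is lowest."""
    def mean_score(c):
        return sum(score_fold(c, f) for f in range(n_folds)) / n_folds
    return min(candidates, key=mean_score)

# Made-up demand values and forecasts.
err_mae = mae([100.0, 200.0], [110.0, 190.0])
err_smape = smape([100.0, 200.0], [110.0, 190.0])

# Toy fold score for illustration only: lowest error at w = 0.1.
best_w = pick_by_cross_validation([0.01, 0.1, 1.0],
                                  lambda w, fold: abs(w - 0.1) + 0.01 * fold)
```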
Step 5, solving the optimization model to obtain the weight coefficient of each base learner and thereby the integrated predictor, and obtaining the integrated learning prediction result from the integrated predictor; wherein the output of the integrated predictor is a linear weighted combination of the outputs of the base learners, i.e. for a given test set sample $x_i$, the final prediction result is $F(x_i) = \alpha_1 f_1(x_i) + \ldots + \alpha_k f_k(x_i)$.
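The final combination is a plain weighted sum of the base-learner outputs for the sample being predicted. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def ensemble_predict(alpha, base_predictions):
    """F(x) = alpha_1 * f_1(x) + ... + alpha_k * f_k(x)."""
    if len(alpha) != len(base_predictions):
        raise ValueError("one weight per base learner is required")
    return sum(a * f for a, f in zip(alpha, base_predictions))

# Two base learners predicting 400 and 420 battery replacements.
forecast = ensemble_predict([0.5, 0.5], [400.0, 420.0])
```

Because the weights are solved separately for each test sample from its own similar-day training set, `alpha` generally differs from one predicted day to the next.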
The invention provides a prediction system for implementing the integrated learning prediction method, which comprises:
a construction module for implementing the step 1;
a training module for implementing the step 2;
the analysis module is used for realizing the step 3;
the establishing module is used for realizing the step 4;
and the prediction module is used for realizing the step 5.
Compared with the prior art, the invention has the beneficial effects that:
the invention considers the prediction accuracy and generalization of the integrated predictor, can effectively reduce the prediction deviation compared with the base learner with the best prediction effect, has better prediction effect on data with high random fluctuation, can better adapt to the data set acquired in practice, can be applied to other data sets, and has stronger practicability.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An ensemble learning prediction method for electric vehicle battery replacement demand based on grey correlation, characterized by comprising the following steps:
constructing a data set, preprocessing it, and dividing the preprocessed data set into a training set and a test set;
selecting k base learners, each of which is trained on and predicts the samples of the training set by cross-validation;
for each input sample in the test set, selecting the optimal similar-day training set through grey correlation analysis;
establishing a prediction-deviation-minimization optimization model from the prediction results of each base learner on the optimal similar-day training set, with a weighted L1 norm as the regularization term;
solving the optimization model to obtain the weight coefficient of each base learner and thereby an ensemble predictor, and obtaining the ensemble learning prediction result from the ensemble predictor; wherein the output of the ensemble predictor is a linear weighted combination of the outputs of the base learners.
2. The ensemble learning prediction method of claim 1, wherein the data set is
$$T = \{(x_i, y_i)\}_{i=1}^{n}$$
where $T$ is the data set; $x_i$ is the i-th sample, i = 1, …, n, and n is the number of samples; $x_i^{(j)}$ is the j-th feature of sample i, j = 1, …, m, and m is the number of features; $y_i$ is the label of sample i, namely the electric vehicle battery replacement demand, i = 1, …, n.
3. The ensemble learning prediction method of claim 2, wherein the features include:
$x^{(1)}$: the day of the week, coded from 1 to 7;
$x^{(2)}$: whether the day is a weekend, 1 if it is a weekend and 0 otherwise;
$x^{(3)}$: the weather, divided into sunny, cloudy, and rainy or snowy, coded as 1, 2 and 3 respectively;
$x^{(4)}$: the highest air temperature of the day;
$x^{(5)}$: the lowest air temperature of the day;
$x^{(6)}$: the battery replacement demand of all electric vehicles on the same weekday of the previous week;
$x^{(7)}$: the battery replacement demand of all electric vehicles on the day before the predicted day;
$x^{(8)}$: the driving mileage of all electric vehicles on the day before the predicted day;
$x^{(9)}$ to $x^{(13)}$: the proportions, among all electric vehicles, of the vehicles whose remaining battery capacity (SOC) at the end of travel on the day before the predicted day falls in the intervals [0, 20%], [20%, 40%], [40%, 60%], [60%, 80%] and [80%, 100%], respectively.
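The feature list of claim 3 can be illustrated with a minimal sketch that assembles the 13-dimensional vector for one day. The record type `DayRecord`, its field names, the weather keys, the convention that weekdays 6 and 7 are the weekend, and the assignment of boundary SOC values to the upper interval are all assumptions made for illustration; the claim does not specify them.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumed keys; the claim only states that the three weather classes are coded 1, 2, 3.
WEATHER_CODES = {"sunny": 1, "cloudy": 2, "rain_or_snow": 3}


@dataclass
class DayRecord:
    """Hypothetical raw record for one predicted day (field names are illustrative)."""
    weekday: int                                 # 1..7, with 6 and 7 assumed to be the weekend
    weather: str                                 # one of the WEATHER_CODES keys
    temp_max: float
    temp_min: float
    demand_same_weekday_last_week: float
    demand_previous_day: float
    mileage_previous_day: float
    soc_percent_previous_day: Sequence[float]    # one end-of-day SOC (0..100) per vehicle, non-empty


def soc_proportions(soc_percent: Sequence[float]) -> List[float]:
    """Share of vehicles in each of the five 20-point SOC intervals (x9..x13)."""
    counts = [0] * 5
    for soc in soc_percent:
        # Boundary values go to the upper interval; 100% stays in the last one.
        counts[min(int(soc // 20), 4)] += 1
    return [c / len(soc_percent) for c in counts]


def build_features(day: DayRecord) -> List[float]:
    """Assemble x1..x13 in the order listed in claim 3."""
    is_weekend = 1 if day.weekday in (6, 7) else 0
    return [
        float(day.weekday),
        float(is_weekend),
        float(WEATHER_CODES[day.weather]),
        day.temp_max,
        day.temp_min,
        day.demand_same_weekday_last_week,
        day.demand_previous_day,
        day.mileage_previous_day,
        *soc_proportions(day.soc_percent_previous_day),
    ]
```

For example, a cloudy Saturday whose fleet ended the previous day at SOC values of 10, 30, 30 and 100 percent yields SOC proportions 0.25, 0.5, 0, 0 and 0.25.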
4. The ensemble learning prediction method of claim 2, wherein the preprocessing of the data set comprises: standardizing the data and then reducing its dimensionality by principal component analysis (PCA);
the training set comprises a training subset and a validation set, wherein the training subset is 70% of the total data volume, and the validation set and the test set are each 15% of the total data volume;
the base learners include K-nearest neighbors (KNN), support vector regression (SVR), gradient boosting regression trees (GBRT), random forests (RF), and ridge regression (RR).
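The standardization and the 70% / 15% / 15% division of claim 4 can be sketched as follows. This is a minimal illustration only: the PCA step is omitted, the split is assumed to follow the original sample order (the claim does not say whether samples are shuffled), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def standardize(X: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score each feature column; constant columns are only centered."""
    n, m = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(m)]
    stds = []
    for j in range(m):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(var ** 0.5 or 1.0)  # avoid dividing by zero for constant columns
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in X]


def split_70_15_15(samples: Sequence) -> Tuple[list, list, list]:
    """Split into training subset, validation set and test set in sample order (assumed)."""
    n = len(samples)
    n_train = int(round(0.70 * n))
    n_val = int(round(0.15 * n))
    samples = list(samples)
    return samples[:n_train], samples[n_train:n_train + n_val], samples[n_train + n_val:]
```

With 100 samples this gives 70 training, 15 validation and 15 test samples.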
5. The ensemble learning prediction method of claim 4, wherein training on and predicting the samples of the training set by each base learner through cross-validation comprises:
adopting six-fold cross-validation;
dividing the training set evenly into 6 parts, namely T1, T2, T3, T4, T5 and T6, using 5 parts as the training subset to train a base learner, and using the remaining part as the validation set on which the trained base learner makes predictions;
after repeating the training and prediction for each part, obtaining the prediction results of each base learner on the training set; wherein $f_r(\cdot)$ is the r-th base learner, r = 1, 2, 3, 4, 5, and the prediction results $f_r(x_i)$ of the 5 base learners on the training set are obtained through cross-validation;
All base learners are trained on the entire training subset.
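The six-fold procedure of claim 5 can be sketched as an out-of-fold prediction loop. The two toy learners below (a mean predictor and a 1-nearest-neighbor predictor) are stand-ins chosen so the sketch has no dependencies; they are not the KNN, SVR, GBRT, RF and RR learners of claim 4. The contiguous fold layout and the final refit on all supplied samples are likewise assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Predictor = Callable[[Vector], float]
Trainer = Callable[[List[Vector], List[float]], Predictor]


def mean_trainer(X: List[Vector], y: List[float]) -> Predictor:
    """Toy learner: always predicts the mean training label."""
    mean = sum(y) / len(y)
    return lambda x: mean


def nearest_neighbor_trainer(X: List[Vector], y: List[float]) -> Predictor:
    """Toy learner: predicts the label of the closest training sample."""
    def predict(x: Vector) -> float:
        best = min(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)))
        return y[best]
    return predict


def out_of_fold_predictions(
    X: List[Vector], y: List[float], trainers: Sequence[Trainer], n_folds: int = 6
) -> Tuple[List[List[float]], List[Predictor]]:
    """Return preds[i][r] = f_r(x_i) from the fold that held x_i out, plus refit learners."""
    n = len(X)
    bounds = [n * f // n_folds for f in range(n_folds + 1)]  # contiguous folds T1..T6
    preds = [[0.0] * len(trainers) for _ in range(n)]
    for f in range(n_folds):
        held = range(bounds[f], bounds[f + 1])
        train_idx = [i for i in range(n) if i < bounds[f] or i >= bounds[f + 1]]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for r, trainer in enumerate(trainers):
            model = trainer(X_tr, y_tr)
            for i in held:
                preds[i][r] = model(X[i])
    # Final learners refit on all supplied samples (one reading of the last step).
    final_models = [trainer(X, y) for trainer in trainers]
    return preds, final_models
```

Each row of `preds` then holds the predictions of all base learners for one training sample, which is the input that the later weighting step works on.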
6. The ensemble learning prediction method of claim 5, wherein grey correlation analysis is used to select the R days $(i_1, i_2, \dots, i_R)$ in the training set that are most relevant to the i-th sample as the similar-day training set $T_i$:
Figure FDA0002785076520000021
where $f_k(x_{i_R})$ is the prediction result of the k-th base learner for the R-th similar-day sample $i_R$.
7. The ensemble learning prediction method of claim 6, wherein the grey correlation analysis comprises:
first calculating the grey correlation coefficient $\xi_{0i}$ and then the grey correlation degree $\gamma_{0i}$; for an input test set sample $x_0$ and a training set sample $x_i$, the grey correlation coefficient is calculated as
$$\xi_{0i}(c) = \frac{\min_i \min_c \left|x_0^{(c)} - x_i^{(c)}\right| + \rho \max_i \max_c \left|x_0^{(c)} - x_i^{(c)}\right|}{\left|x_0^{(c)} - x_i^{(c)}\right| + \rho \max_i \max_c \left|x_0^{(c)} - x_i^{(c)}\right|}$$
where $\xi_{0i}(c)$ is the grey correlation coefficient between the test set sample $x_0$ and the training set sample $x_i$ on the c-th feature, $\min_i \min_c |x_0^{(c)} - x_i^{(c)}|$ and $\max_i \max_c |x_0^{(c)} - x_i^{(c)}|$ are the two-level minimum difference and the two-level maximum difference, and $\rho \in [0, 1]$ is the resolution coefficient;
after the grey correlation coefficients between the input test set sample and all training set samples have been calculated for each feature, the grey correlation degree between the input test set sample and each training set sample is calculated as
$$\gamma_{0i} = \frac{1}{m} \sum_{c=1}^{m} \xi_{0i}(c)$$
that is, the average of the grey correlation coefficients over the features; the larger the calculated grey correlation degree, the higher the correlation; and the R most relevant samples are selected from the training set as the similar-day training set according to the calculated grey correlation degrees.
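The similar-day selection of claim 7 can be sketched as below, assuming that the formula images follow the standard grey relational form: a coefficient built from the two-level minimum and maximum absolute differences with resolution coefficient ρ, averaged over the features. The function names and the default ρ = 0.5 are illustrative choices, not values taken from the claims; ρ is assumed to be strictly positive.

```python
from typing import List, Sequence


def grey_relational_degrees(
    x0: Sequence[float], train_X: Sequence[Sequence[float]], rho: float = 0.5
) -> List[float]:
    """Grey correlation degree between x0 and every training sample (rho > 0 assumed)."""
    # Absolute difference per training sample and per feature.
    diffs = [[abs(a - b) for a, b in zip(x0, xi)] for xi in train_X]
    d_min = min(min(row) for row in diffs)   # two-level minimum difference
    d_max = max(max(row) for row in diffs)   # two-level maximum difference
    if d_max == 0.0:
        return [1.0] * len(diffs)            # every training sample equals x0
    degrees = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))   # average over features
    return degrees


def select_similar_days(
    x0: Sequence[float], train_X: Sequence[Sequence[float]], R: int, rho: float = 0.5
) -> List[int]:
    """Indices of the R training samples with the highest grey correlation degree."""
    degrees = grey_relational_degrees(x0, train_X, rho)
    ranked = sorted(range(len(degrees)), key=lambda i: degrees[i], reverse=True)
    return ranked[:R]
```

Because the minimum and maximum differences are taken over all training samples, the degrees are relative to the candidate set: adding or removing training days can change the scores of the remaining ones.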
8. The ensemble learning prediction method of claim 7, wherein the optimization model is:
Figure FDA0002785076520000032
this model is equivalent to the following linear program:
Figure FDA0002785076520000033
s.t.α≥0
Figure FDA0002785076520000034
Figure FDA0002785076520000035
Figure FDA0002785076520000036
where $E[\cdot]$ denotes the expectation of the expression in brackets, which in the discrete case is the average; $Y$ is a random variable, namely the actual value of a sample in the training set, and $Y_i$ is the actual value of the i-th sample; $f = (f_1, \dots, f_k)$ are the predicted values of the k base predictors for the sample corresponding to $Y$; $\|\alpha\|_1 = |\alpha_1| + \dots + |\alpha_k|$; $w$ is the weight coefficient; $k$ is the number of base learners; $z$ and $v_i$ are introduced intermediate variables; and solving the optimization model yields the decision variable $\alpha = (\alpha_1, \dots, \alpha_k)$.
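Since the formula images of claim 8 are not reproduced in the text, the following sketch rests on an assumed form of the model: minimize the mean absolute deviation between $Y_i$ and the weighted base predictions plus $w\|\alpha\|_1$, subject to $\alpha \ge 0$, with $v_i$ bounding each absolute deviation and $z$ bounding the L1 norm. The code only evaluates the objective and builds the linear program in inequality form; the function names and the variable ordering $[\alpha, v, z]$ are hypothetical.

```python
from typing import List, Sequence, Tuple


def objective(alpha: Sequence[float], preds: Sequence[Sequence[float]],
              y: Sequence[float], w: float) -> float:
    """Mean absolute deviation of the weighted prediction plus the weighted L1 norm."""
    deviations = [abs(yi - sum(a * p for a, p in zip(alpha, row)))
                  for row, yi in zip(preds, y)]
    return sum(deviations) / len(y) + w * sum(abs(a) for a in alpha)


def build_lp(preds: Sequence[Sequence[float]], y: Sequence[float], w: float
             ) -> Tuple[List[float], List[List[float]], List[float]]:
    """Build c, A_ub, b_ub for: min c.x  s.t.  A_ub x <= b_ub, x >= 0, x = [alpha, v, z]."""
    n, k = len(preds), len(preds[0])
    n_vars = k + n + 1
    c = [0.0] * k + [1.0 / n] * n + [w]
    A_ub: List[List[float]] = []
    b_ub: List[float] = []
    for i, (row, yi) in enumerate(zip(preds, y)):
        upper = [0.0] * n_vars   # encodes  y_i - f(x_i).alpha <= v_i
        lower = [0.0] * n_vars   # encodes  f(x_i).alpha - y_i <= v_i
        for r, p in enumerate(row):
            upper[r] = -p
            lower[r] = p
        upper[k + i] = -1.0
        lower[k + i] = -1.0
        A_ub.append(upper)
        b_ub.append(-yi)
        A_ub.append(lower)
        b_ub.append(yi)
    # With alpha >= 0, the L1 norm is the plain sum: alpha_1 + ... + alpha_k <= z.
    A_ub.append([1.0] * k + [0.0] * n + [-1.0])
    b_ub.append(0.0)
    return c, A_ub, b_ub
```

The returned arrays are in the shape accepted by general-purpose LP solvers, for example `scipy.optimize.linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))`; the first k entries of the solution would then be the weights α.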
9. The ensemble learning prediction method of claim 8, wherein, for a test set sample $x_i$, the final prediction result is $F(x_i) = \alpha_1 f_1(x_i) + \dots + \alpha_k f_k(x_i)$.
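The linear weighted combination of claim 9 amounts to a single weighted sum over the base learner outputs. In the sketch below the base learners are represented as plain callables, which is an illustrative assumption.

```python
from typing import Callable, Sequence


def ensemble_predict(
    x: Sequence[float],
    models: Sequence[Callable[[Sequence[float]], float]],
    alpha: Sequence[float],
) -> float:
    """F(x) = alpha_1 * f_1(x) + ... + alpha_k * f_k(x)."""
    if len(models) != len(alpha):
        raise ValueError("one weight is required per base learner")
    return sum(a * f(x) for a, f in zip(alpha, models))
```

Because a separate similar-day training set is selected for each test sample, the weights `alpha` passed here would in general differ from one test sample to the next.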
10. A prediction system for implementing the ensemble learning prediction method according to any one of claims 1 to 9, comprising:
a construction module for constructing a data set, preprocessing it, and dividing the preprocessed data set into a training set and a test set;
a training module for selecting k base learners, each of which is trained on and predicts the samples of the training set by cross-validation;
an analysis module for selecting, through grey correlation analysis, the optimal similar-day training set for each input sample in the test set;
an establishing module for establishing a prediction-deviation-minimization optimization model from the prediction results of each base learner on the optimal similar-day training set, with a weighted L1 norm as the regularization term;
a prediction module for solving the optimization model to obtain the weight coefficient of each base learner and thereby an ensemble predictor, and for obtaining the ensemble learning prediction result from the ensemble predictor; wherein the output of the ensemble predictor is a linear weighted combination of the outputs of the base learners.
CN202011294838.8A 2020-11-18 2020-11-18 Grey correlation-based integrated learning prediction method and system for electric vehicle battery replacement demand Active CN112258251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011294838.8A CN112258251B (en) 2020-11-18 2020-11-18 Grey correlation-based integrated learning prediction method and system for electric vehicle battery replacement demand

Publications (2)

Publication Number Publication Date
CN112258251A true CN112258251A (en) 2021-01-22
CN112258251B CN112258251B (en) 2022-12-27

Family

ID=74266200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011294838.8A Active CN112258251B (en) 2020-11-18 2020-11-18 Grey correlation-based integrated learning prediction method and system for electric vehicle battery replacement demand

Country Status (1)

Country Link
CN (1) CN112258251B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108846517A (en) * 2018-06-12 2018-11-20 清华大学 A kind of probability short-term electric load prediction integrated approach of quantile
US20200151610A1 (en) * 2018-11-09 2020-05-14 Industrial Technology Research Institute Ensemble learning predicting method and system
CN110414788A (en) * 2019-06-25 2019-11-05 国网上海市电力公司 A kind of power quality prediction technique based on similar day and improvement LSTM
CN110517494A (en) * 2019-09-03 2019-11-29 中国科学院自动化研究所 Forecasting traffic flow model, prediction technique, system, device based on integrated study
CN111400180A (en) * 2020-03-13 2020-07-10 上海海事大学 Software defect prediction method based on feature set division and ensemble learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
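Liu Bin et al.: "Analysis and Prediction of Electric Vehicle Charging and Battery Swapping Demand", Urban Construction Theory Research (Electronic Edition) *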

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949948A (en) * 2021-04-28 2021-06-11 北京理工大学 Integrated learning method and system for electric vehicle power conversion demand interval prediction in time-sharing mode
CN113283174A (en) * 2021-06-09 2021-08-20 中国石油天然气股份有限公司 Reservoir productivity prediction method, system and terminal based on algorithm integration and self-control
CN113643758A (en) * 2021-09-22 2021-11-12 华南农业大学 Prediction method for obtaining beta-lactam drug resistance resistant gene facing enterobacter
CN115204417A (en) * 2022-09-13 2022-10-18 鱼快创领智能科技(南京)有限公司 Vehicle weight prediction method and system based on ensemble learning and storage medium
CN115204417B (en) * 2022-09-13 2022-12-27 鱼快创领智能科技(南京)有限公司 Vehicle weight prediction method and system based on ensemble learning and storage medium
CN115829120A (en) * 2022-11-29 2023-03-21 中国环境科学研究院 Water quality prediction early warning system based on machine learning method
CN117236527A (en) * 2023-11-13 2023-12-15 宁德市天铭新能源汽车配件有限公司 Automobile part demand prediction method and system based on ensemble learning
CN117236527B (en) * 2023-11-13 2024-02-06 宁德市天铭新能源汽车配件有限公司 Automobile part demand prediction method and system based on ensemble learning

Also Published As

Publication number Publication date
CN112258251B (en) 2022-12-27

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant