Electric vehicle charging behavior characteristic analysis method and system
Technical Field
The invention belongs to the technical field of electric vehicle charging behavior characteristic analysis, and relates to an electric vehicle charging behavior characteristic analysis method.
Background
As a low-carbon, clean means of transport, the electric vehicle is chosen by more and more users, and the demand for electric vehicle charging has grown sharply. The coverage of electric vehicle charging stations keeps expanding, and the number of charging station users keeps growing. Correctly identifying the behavior characteristics of charging station users (electric vehicle users) is important for operating enterprises: it helps them improve operating efficiency, improve operation and maintenance service quality, and guide users' charging behavior.
Charging load prediction, the formulation of ordered charging guidance strategies, the adjustment of charging service fees, and the siting and sizing of charging stations all presuppose an accurate understanding of electric vehicle charging behavior and charging demand. As charging metering data, travel survey statistics and similar data accumulate, data-driven methods that do not rely on assumed model parameters can reflect the charging behavior characteristics of electric vehicles more faithfully. Research on the behavior characteristics of charging users relies mainly on charging transaction data. Such data generally contains only the user's charging card number, the charged energy, the transaction amount, the transaction mode, the transaction time and the number of the charging pile used, and lacks an accurate description of the user. How well user features can be captured from the existing data depends on choosing the right analysis method.
The travel and charging activities of electric vehicles are highly random. Previous methods for studying electric vehicle charging behavior include the following:
1. Parameter estimation, i.e. the conventional modeling approach that relies on empirical assumptions. It generally treats a charging behavior characteristic as following a particular mathematical distribution, for example taking the charging start time as normally distributed and the daily mileage as lognormally distributed, and predicts the charging load on that basis. Because parameter estimation is only applicable when the probability density of a feature conforms to a specific mathematical distribution, it cannot yield an accurate probability density function when the analyzed behavior feature does not conform to such a distribution or is a superposition of several distributions.
2. Nonparametric estimation using a frequency distribution histogram. The histogram is simple to compute, but the bin width must be chosen when it is drawn; different bin widths can produce markedly different histograms, and a histogram cannot express the probability continuously at each point.
3. Nonparametric estimation using kernel density estimation. The bandwidth is simple to calculate, but when a traditional empirical bandwidth formula is applied to real data that is a superposition of several standard distributions, especially when the amount of data is not very large, the resulting probability density function has a large error.
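The document does not name the traditional empirical bandwidth formula it refers to. A widely used candidate, assumed here only for illustration, is the normal-reference rule of thumb for a Gaussian kernel:

$$h = 1.06\,\sigma\,n^{-1/5}$$

where σ is the sample standard deviation and n is the sample size. The rule is derived under the assumption that the data is approximately normal. For data with several well-separated modes, σ is dominated by the distance between the modes rather than by the width of each mode, so the resulting h tends to oversmooth, which is consistent with the large error described in item 3.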
Disclosure of Invention
In order to overcome the defects in the prior art, the application provides an electric vehicle charging behavior characteristic analysis method and system, and the probability density of the electric vehicle charging behavior characteristic is obtained by utilizing the historical record of electric vehicle charging transaction data and an improved kernel density estimation method.
In order to achieve the above purpose, the invention adopts the following technical scheme:
An electric vehicle charging behavior characteristic analysis method comprises the following steps:
Step 1: acquiring historical charging data of charging piles, and screening out the electric vehicle charging behavior characteristic data;
Step 2: performing data conversion on the characteristic data obtained by the screening in step 1;
Step 3: performing data cleaning on the characteristic data after data conversion;
Step 4: calculating a charging behavior probability density function based on the characteristic data after data cleaning.
The invention further comprises the following preferred embodiments:
Preferably, in step 1, the electric vehicle charging behavior characteristic data is screened from the historical charging data of the charging piles, and the electric vehicle charging behavior characteristics include: user identification number, charging station number, charging pile number, charged energy, electricity and service fee, charging start time, reason for charging termination, charging date, day of the week, whether the day is a weekend, whether the day is a public holiday, weather type and average air temperature.
Preferably, the reasons for charging termination are: electric vehicle fully charged, termination by the user, unauthorized unplugging of the charging gun, prepaid amount exhausted, data verification error, communication fault, charging equipment fault and abnormal power failure.
Preferably, step 2 comprises the steps of:
Step 2.1: unifying data formats, i.e. converting the different types of feature data into a uniform data format;
Step 2.2: parsing categorical data, i.e. converting the categorical items of the feature data into numerical data;
Step 2.3: calculating derived feature data from the basic feature data.
Preferably, step 3 comprises the steps of:
Step 3.1: filtering out null data, erroneous data and duplicate data;
Step 3.2: filtering out data whose charging termination reason is prepaid amount exhausted, data verification error, communication fault, charging equipment fault or abnormal power failure;
Step 3.3: performing further abnormal data detection and filtering based on a density clustering algorithm;
Step 3.4: filtering out data that does not need to be analyzed, according to actual requirements.
Preferably, step 3.3 comprises:
Step 3.3.1: dividing the feature data into a plurality of subsets;
Step 3.3.2: performing outlier detection on each subset using a density-based clustering algorithm, treating values that do not clearly belong to any cluster as outliers, and filtering out the abnormal data.
Preferably, in step 3.3.1, the generation of subsets follows two principles:
Principle 1: selecting, by a correlation analysis method, the features in the feature data set whose degree of correlation is greater than a threshold, and taking each such group of features as a subset;
Principle 2: generating subsets according to known relationships and meanings among the features.
Preferably, in step 4, the charging behaviors of the electric vehicle users are classified according to different characteristics, and a probability density function of the charging behavior characteristics of each class of users is calculated, specifically, the probability density function of the charging behaviors of the electric vehicle users is calculated by a gaussian kernel function method, and the method includes:
Step 4.1: calculating the number of probability density function peaks, $N_p$, from the one-dimensional data of a given feature;
Step 4.2: calculating the optimal bandwidth according to the following formula:
$$h_{opt} = \frac{1.059\,\sigma\,n^{-0.2}}{N_p}$$
where σ is the standard deviation of the sample data and n is the number of samples;
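Step 4.3: calculating the probability density function:
$$\hat{f}(x) = \frac{1}{n\,h_{opt}} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_{opt}}\right)$$
where K(·) is the kernel function; a Gaussian function is selected as the kernel function, namely:
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^{2}}{2}\right)$$
x denotes the random sample variable, and $X_i$ denotes the i-th known sample.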
Preferably, in step 4.1, $N_p$ is calculated as follows:
selecting a suitably small bin width, grouping the original data and counting the frequencies;
counting the number of peaks in the frequency distribution histogram as $N_p$;
or low-pass filtering the frequency data and extracting the peaks from the filtered data, the number of peaks being $N_p$.
The application also discloses an electric vehicle charging behavior characteristic analysis system based on the above electric vehicle charging behavior characteristic analysis method, the system comprising:
a characteristic data acquisition module, used for acquiring historical charging data of charging piles and screening out the electric vehicle charging behavior characteristic data;
a data conversion module, used for performing data conversion on the characteristic data obtained by the screening;
a data cleaning module, used for performing data cleaning on the characteristic data after data conversion;
and a calculating module, used for calculating a charging behavior probability density function based on the characteristic data after data cleaning.
The beneficial effects achieved by the application are as follows:
1. Compared with a frequency distribution histogram, a probability density function describes the charging behavior characteristics of electric vehicles more accurately. The invention also provides a bandwidth selection method suited to solving the probability density function when analyzing electric vehicle charging behavior; in practical application, the resulting probability density function is accurate and stable.
2. Compared with the traditional bandwidth calculation formula, the bandwidth calculation method for kernel density estimation provided by the invention yields a more accurate probability density function when processing data formed by several mutually superposed distributions.
3. Compared with traditional anomaly detection that clusters on all features at once, the subset-based anomaly detection method detects abnormal data more accurately, because the features within each subset are correlated with one another or are related in meaning.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a flow chart of a method for calculating a probability density function of a charging behavior according to an embodiment of the present invention;
FIG. 3 is a probability density diagram of charging duration in accordance with an embodiment of the present invention.
Detailed Description
The present application is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present application is not limited thereby.
As shown in fig. 1 and 2, the method for analyzing charging behavior characteristics of an electric vehicle according to the present invention includes the following steps:
Step 1: acquiring historical charging data of charging piles, and screening out the electric vehicle charging behavior characteristic data.
The electric vehicle charging behavior characteristic data is screened from the historical charging data of the charging piles. The charging behavior characteristics include: user identification number, charging station number, charging pile number, charged energy, electricity and service fee, charging start time, reason for charging termination, charging date, day of the week, whether the day is a weekend, whether the day is a public holiday, weather type and average air temperature.
The reasons for charging termination are: electric vehicle fully charged, termination by the user, unauthorized unplugging of the charging gun, prepaid amount exhausted, data verification error, communication fault, charging equipment fault and abnormal power failure.
Step 2: performing data conversion on the feature data obtained by the screening in step 1, which comprises the following steps:
Step 2.1: unifying data formats, i.e. converting the different types of feature data into a uniform data format.
For example, when weather information is acquired, the data interface returns data in JSON format; when charging pile transaction data is acquired, XML data or binary data may be returned. Data from different sources must be parsed and converted according to its format and stored in a uniform format for the next processing step.
Step 2.2: parsing categorical data, i.e. converting the categorical items of the feature data into numerical data, for example by encoding non-numerical data such as fault types, weather types and holidays.
Step 2.3: calculating derived feature data from the basic feature data. For example, the charging duration is calculated from the charging start time and the charging end time.
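The three conversion steps can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the field names (`type`, `avg_temp`, `start_time`, `energy_kwh`, and so on), the timestamp format and the weather code table are all assumptions, since the source specifies none of them.

```python
import json
from datetime import datetime

# Hypothetical category codes; the source does not specify an encoding.
WEATHER_CODES = {"sunny": 0, "cloudy": 1, "rain": 2, "snow": 3}
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format

def parse_weather(json_text):
    """Step 2.1: parse a JSON weather payload into a plain dict (field names assumed)."""
    payload = json.loads(json_text)
    return {"weather_type": payload["type"], "avg_temp_c": float(payload["avg_temp"])}

def convert_record(record, weather):
    """Steps 2.2 and 2.3: encode categorical items and derive the charging duration."""
    start = datetime.strptime(record["start_time"], TIME_FORMAT)
    end = datetime.strptime(record["end_time"], TIME_FORMAT)
    return {
        "user_id": record["user_id"],
        "energy_kwh": float(record["energy_kwh"]),
        "start_hour": start.hour + start.minute / 60.0,
        "weekday": start.isoweekday(),                 # 1 = Monday ... 7 = Sunday
        "is_weekend": int(start.isoweekday() >= 6),
        "weather_code": WEATHER_CODES[weather["weather_type"]],
        "avg_temp_c": weather["avg_temp_c"],
        # Derived feature: charging duration in minutes.
        "duration_min": (end - start).total_seconds() / 60.0,
    }

weather = parse_weather('{"type": "rain", "avg_temp": "18.5"}')
row = convert_record(
    {"user_id": "U001", "energy_kwh": "23.4",
     "start_time": "2021-03-06 08:30:00", "end_time": "2021-03-06 09:45:00"},
    weather,
)
```

Keeping every converted record as a flat dictionary of numbers makes the later cleaning and density estimation steps independent of where each field originally came from.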
And step 3: the method is characterized in that the characteristic data after data conversion is subjected to data cleaning, and the original data is mainly subjected to purification treatment, so that the reliability of the original data is improved, and the data analysis result is accurate as much as possible, and the method comprises the following steps:
step 3.1: filtering null data, error data and repeated data;
step 3.2: filtering data of which the charging termination reasons are pre-charging amount exhausted, data verification errors, communication faults, charging equipment faults or abnormal power failure;
step 3.3: performing further abnormal data detection and filtering based on a density clustering algorithm; the method comprises the following steps:
step 3.3.1: dividing the feature data into a plurality of subsets;
the generation of subsets follows two principles:
principle 1: selecting the features with the correlation degree larger than a threshold value in the feature data set through a correlation analysis method, and respectively using the features as a subset, for example, generating a correlation subset by adopting a Pearson correlation coefficient method;
principle 2: generating subsets according to the determined relation and meaning, wherein the charging time, the charging quantity and the weather characteristics can be used as one subset;
the same feature data may be present in multiple subsets.
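Principle 1 can be illustrated with a small sketch that computes pairwise Pearson coefficients and keeps the pairs above a threshold. The threshold of 0.9, the use of the absolute value of the coefficient, and the column names are assumptions made for this example; the source only says that the correlation must exceed a threshold.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlated_subsets(columns, threshold=0.9):
    """Return feature pairs whose |r| exceeds the threshold (threshold is an assumption)."""
    pairs = []
    for a, b in combinations(sorted(columns), 2):
        if abs(pearson(columns[a], columns[b])) > threshold:
            pairs.append((a, b))
    return pairs

# Toy data: the fee is almost proportional to the energy; the start hour is unrelated.
columns = {
    "energy_kwh": [10.0, 20.0, 30.0, 40.0, 50.0],
    "fee": [12.1, 23.9, 36.2, 47.8, 60.1],
    "start_hour": [8.0, 22.0, 13.0, 7.5, 18.0],
}
subsets = correlated_subsets(columns)
```

Subsets chosen under principle 2 (known relationships and meanings) would simply be appended to this list by hand, and a feature may appear in more than one subset.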
Step 3.3.2: performing outlier detection on each subset using a density-based clustering algorithm, treating values that do not clearly belong to any cluster as outliers, and filtering out the abnormal data.
For example, the raw data of the four features "charging duration, charging start time, charged energy, electricity and service fee" is analyzed by the correlation coefficient method, and the result is the correlation coefficient matrix shown in Table 1:
Table 1. Correlation coefficient matrix of charging duration, charging start time, charged energy, and electricity and service fee

| | Charging duration | Charging start time | Charged energy | Electricity and service fee |
|---|---|---|---|---|
| Charging duration | 1 | -0.153 | 0.327 | 0.387 |
| Charging start time | | 1 | -0.161 | -0.169 |
| Charged energy | | | 1 | 0.995 |
| Electricity and service fee | | | | 1 |
Based on the data in Table 1, the two highly correlated features "charged energy" and "electricity and service fee" are selected as a subset. In addition, because some electric vehicles remain at the charging pile for a long time after they are fully charged, the charging duration is also included in the subset. Outlier detection based on a density clustering algorithm is then performed on the subset, and the abnormal data is filtered out.
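The source does not name a specific density clustering algorithm. The sketch below assumes DBSCAN, written in plain Python, and treats the points it labels as noise as the abnormal data. The `eps` and `min_pts` values and the two-feature toy data are hypothetical, and in practice the features would normally be rescaled to comparable ranges before distances are computed.

```python
from math import dist

def dbscan_outliers(points, eps, min_pts):
    """Return the indices of points labelled as noise by a basic DBSCAN pass."""
    n = len(points)
    # Neighbourhood of each point, including the point itself.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n   # None = unvisited, -1 = noise, >= 0 = cluster id
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1            # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])   # j is a core point: keep expanding
    return [i for i, label in enumerate(labels) if label == -1]

# Toy subset (charged energy in kWh, fee): the fee tracks the energy except in the last record.
subset = [(10.0, 12.0), (11.0, 13.2), (12.0, 14.4), (13.0, 15.6),
          (14.0, 16.8), (15.0, 18.0), (12.0, 40.0)]
outliers = dbscan_outliers(subset, eps=2.0, min_pts=3)
cleaned = [p for i, p in enumerate(subset) if i not in outliers]
```

Running the detection per subset rather than on all features at once keeps the distance computation confined to features that are actually related, which is the rationale the document gives for the subset approach.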
Step 3.4: filtering out data that does not need to be analyzed, according to actual requirements. For example, records whose charging duration is less than 3 minutes are filtered out. This value was obtained by analyzing actual charging duration data: most records with a charging duration of less than 3 minutes are either invalid or of little value for analyzing electric vehicle charging behavior.
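The rule-based parts of the cleaning (steps 3.1, 3.2 and 3.4) can be sketched as a single pass over the records; the clustering-based step 3.3 is deliberately left out here. The record fields, the termination reason labels and the notion of "erroneous" (negative energy or duration) are assumptions for illustration only.

```python
# Hypothetical labels for the termination reasons that step 3.2 filters out.
INVALID_REASONS = {
    "prepaid_amount_exhausted", "data_verification_error",
    "communication_fault", "equipment_fault", "abnormal_power_failure",
}

def basic_clean(records, min_duration_min=3.0):
    """Apply steps 3.1, 3.2 and 3.4 to a list of flat record dicts."""
    seen = set()
    kept = []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue                              # step 3.1: null data
        if rec["energy_kwh"] < 0 or rec["duration_min"] < 0:
            continue                              # step 3.1: erroneous data (assumed rule)
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue                              # step 3.1: duplicate data
        seen.add(key)
        if rec["stop_reason"] in INVALID_REASONS:
            continue                              # step 3.2: excluded termination reasons
        if rec["duration_min"] < min_duration_min:
            continue                              # step 3.4: sessions shorter than 3 minutes
        kept.append(rec)
    return kept

records = [
    {"user_id": "U1", "energy_kwh": 20.0, "duration_min": 60.0, "stop_reason": "vehicle_full"},
    {"user_id": "U1", "energy_kwh": 20.0, "duration_min": 60.0, "stop_reason": "vehicle_full"},
    {"user_id": "U2", "energy_kwh": None, "duration_min": 30.0, "stop_reason": "user_stop"},
    {"user_id": "U3", "energy_kwh": 5.0, "duration_min": 12.0, "stop_reason": "communication_fault"},
    {"user_id": "U4", "energy_kwh": 0.3, "duration_min": 2.0, "stop_reason": "user_stop"},
    {"user_id": "U5", "energy_kwh": 15.0, "duration_min": 45.0, "stop_reason": "user_stop"},
]
kept = basic_clean(records)
```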
Step 4: calculating a charging behavior probability density function based on the feature data after data cleaning.
The charging behaviors of electric vehicle users are classified according to different characteristics, and the probability density function of the charging behavior characteristics of each class of users is calculated. Specifically, the probability density function of the charging behavior of electric vehicle users is calculated by kernel density estimation with a Gaussian kernel function, as follows:
Step 4.1: calculating the number of probability density function peaks, $N_p$, from the one-dimensional data of a given feature.
$N_p$ is calculated as follows:
selecting a suitably small bin width, grouping the original data and counting the frequencies;
counting the number of peaks in the frequency distribution histogram as $N_p$;
or low-pass filtering the frequency data and extracting the peaks from the filtered data, the number of peaks being $N_p$.
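One way to realize the peak count is sketched below: bin the data, smooth the bin counts with a moving average (standing in for the low-pass filter, which the source does not specify), and count rising-then-falling transitions. The bin width of 1, the window of 5 and the synthetic two-mode data are assumptions for this example.

```python
def histogram(values, bin_width):
    """Count how many values fall into each consecutive bin of the given width."""
    lo = min(values)
    n_bins = int((max(values) - lo) / bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - lo) / bin_width)] += 1
    return counts

def moving_average(counts, window=5):
    """Simple low-pass filter: centered moving average, truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(counts)):
        chunk = counts[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def count_peaks(series):
    """Count local maxima as transitions from rising to falling (plateaus count once)."""
    peaks, rising, prev = 0, False, 0.0
    for value in list(series) + [0.0]:   # trailing zero closes a peak at the right edge
        if value > prev:
            rising = True
        elif value < prev:
            if rising:
                peaks += 1
            rising = False
        prev = value
    return peaks

# Synthetic durations with two modes (around 30 and 90 minutes) and one small notch.
durations = []
for center in (30, 90):
    for offset in range(-10, 11):
        durations += [float(center + offset)] * (11 - abs(offset))
durations.remove(27.0)
durations.remove(27.0)   # the notch creates a spurious peak in the raw histogram

counts = histogram(durations, bin_width=1.0)
raw_peaks = count_peaks(counts)
n_p = count_peaks(moving_average(counts, window=5))
```

In this toy case the raw histogram reports one spurious peak caused by the notch, while the smoothed counts recover the two underlying modes, which is why the filtering variant is offered as an alternative.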
Step 4.2: calculating the optimal bandwidth according to the following formula:
$$h_{opt} = \frac{1.059\,\sigma\,n^{-0.2}}{N_p}$$
where σ is the standard deviation of the sample data and n is the number of samples.
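The step 4.2 bandwidth, read as 1.059 times σ times n to the power of -0.2, divided by the peak count, can be computed as in the following sketch. Whether σ is the sample (n - 1) or population standard deviation is not stated in the source; the sample version is assumed here, and the function names are illustrative.

```python
from math import sqrt

def sample_std(values):
    """Sample standard deviation (n - 1 denominator); this choice is an assumption."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def optimal_bandwidth(values, n_peaks):
    """h_opt = 1.059 * sigma * n**(-0.2) / N_p, following step 4.2."""
    if n_peaks < 1:
        raise ValueError("n_peaks must be at least 1")
    return 1.059 * sample_std(values) * len(values) ** -0.2 / n_peaks

values = [float(v) for v in range(1, 33)]   # 32 illustrative samples
h_one_peak = optimal_bandwidth(values, n_peaks=1)
h_two_peaks = optimal_bandwidth(values, n_peaks=2)
```

With a single peak the expression reduces to the usual normal-reference form; each additional peak shrinks the bandwidth proportionally, so the estimate smooths less when the data is multimodal.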
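Step 4.3: calculating the probability density function:
$$\hat{f}(x) = \frac{1}{n\,h_{opt}} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h_{opt}}\right)$$
where K(·) is the kernel function; a Gaussian function is selected as the kernel function, namely:
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^{2}}{2}\right)$$
x denotes the random sample variable, and $X_i$ denotes the i-th known sample.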
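A minimal sketch of step 4.3 follows, using the standard Gaussian kernel density estimator: the density at x is the average of Gaussian kernels centered on the samples, each scaled by the bandwidth. The sample values, the evaluation grid and the bandwidth of 4.0 are illustrative assumptions; in the method described here the bandwidth would come from step 4.2.

```python
from math import exp, pi, sqrt

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return exp(-0.5 * u * u) / sqrt(2.0 * pi)

def kde(x, samples, bandwidth):
    """Kernel density estimate at x: (1 / (n h)) * sum of K((x - X_i) / h)."""
    n = len(samples)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in samples)
    return total / (n * bandwidth)

# Illustrative charging durations in minutes with two groups of sessions.
samples = [28.0, 30.0, 31.0, 33.0, 88.0, 90.0, 92.0, 95.0]
h = 4.0                                        # illustrative bandwidth
grid = [i * 0.5 for i in range(-60, 361)]      # -30 to 180 minutes in 0.5 steps
density = [kde(x, samples, h) for x in grid]
area = sum(density) * 0.5                      # Riemann sum; should be close to 1
```

Because every kernel integrates to one, the estimate integrates to one as well, and the Riemann sum over a sufficiently wide grid gives a quick sanity check on any implementation.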
Fig. 3 shows the charging duration probability density obtained by applying the above steps. In Fig. 3, the shaded area is the frequency distribution histogram of electric vehicle charging durations grouped in 1-minute bins, and the dotted line is the probability density function curve of the charging duration. The abscissa is the charging duration; the left Y axis is the ordinate of the frequency distribution histogram and represents the charging frequency; the right Y axis is the ordinate of the probability density curve and represents the probability density value at each point. Fig. 3 thus presents the charging behavior of electric vehicles in the form of a probability density function.
An electric vehicle charging behavior characteristic analysis system comprises:
a characteristic data acquisition module, used for acquiring historical charging data of charging piles and screening out the electric vehicle charging behavior characteristic data;
a data conversion module, used for performing data conversion on the characteristic data obtained by the screening;
a data cleaning module, used for performing data cleaning on the characteristic data after data conversion;
and a calculating module, used for calculating a charging behavior probability density function based on the characteristic data after data cleaning.
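The four modules form a linear pipeline, which could be wired together as in the following sketch. The class name, the use of plain functions for each module and the stub stages are all hypothetical; the source describes the modules only by their purpose.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]

@dataclass
class ChargingAnalysisSystem:
    """Chains the four modules; each stage is supplied as a plain function."""
    acquire: Callable[[], List[Record]]               # characteristic data acquisition module
    convert: Callable[[List[Record]], List[Record]]   # data conversion module
    clean: Callable[[List[Record]], List[Record]]     # data cleaning module
    compute: Callable[[List[Record]], object]         # calculating module

    def run(self) -> object:
        return self.compute(self.clean(self.convert(self.acquire())))

# Stub stages, for illustration only.
system = ChargingAnalysisSystem(
    acquire=lambda: [{"duration_min": "45"}, {"duration_min": "2"}],
    convert=lambda rows: [{"duration_min": float(r["duration_min"])} for r in rows],
    clean=lambda rows: [r for r in rows if r["duration_min"] >= 3.0],
    compute=lambda rows: [r["duration_min"] for r in rows],
)
result = system.run()
```

Passing the stages in as functions keeps each module replaceable, for example swapping the stub `compute` stage for a kernel density estimator without touching the other three.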
The present applicant has described and illustrated embodiments of the present invention in detail with reference to the accompanying drawings, but it should be understood by those skilled in the art that the above embodiments are merely preferred embodiments of the present invention, and the detailed description is only for the purpose of helping the reader to better understand the spirit of the present invention, and not for limiting the scope of the present invention, and on the contrary, any improvement or modification made based on the spirit of the present invention should fall within the scope of the present invention.