CN106168799B

CN106168799B - Method for predictive maintenance of electric vehicle batteries based on big data machine learning

Info

Publication number: CN106168799B
Application number: CN201610504843.4A
Authority: CN
Inventors: 常伟; 金樟平; 李舰
Original assignee: Individual
Current assignee: Nantong Le Chuangxin Energy Co ltd
Priority date: 2016-06-30
Filing date: 2016-06-30
Publication date: 2019-05-03
Anticipated expiration: 2036-06-30
Also published as: CN106168799A

Abstract

This patent relates to a method for predictive life maintenance of electric vehicle batteries based on big data machine learning. The method consists of a corresponding application architecture, process, and computation models. It uses real-time battery data collected while the electric vehicle battery is in operation, combined with other operating data of the vehicle, performs model training and algorithm verification through machine learning, and evaluates the results from different angles. On this basis it establishes a control strategy for predictive maintenance of, and response to, battery operation, optimizing battery maintenance and replacement, improving the owner's safety, and balancing system performance against economic benefit.

Description

A method for predictive maintenance of electric vehicle batteries based on big data machine learning

Technical field

The present invention relates to an applied analysis method for predictive battery maintenance based on big data machine learning. Its field of application is the predictive maintenance, replacement, and repair of electric vehicle batteries.

Background art

With the spread of electric vehicles in China and the application of connected-vehicle technology, more and more electric vehicles are entering the consumer market, and their driving data are collected in real time. The battery management system, one of the core components of an electric vehicle, still relies on judgments against predefined thresholds. Battery maintenance is managed through periodic inspection and event-based methods. These methods do not take the vehicle's actual driving conditions into account and do not provide personalized analysis for different driving behaviors. In particular, no predictive action can be taken before a battery failure occurs. This raises vehicle maintenance costs and degrades the owner's customer experience, and because electric vehicle manufacturers cannot identify product problems that emerge after sale, they face high service costs and product recalls.

At present, the management of electric vehicle battery data relies largely on historical empirical data to derive an approximate repair schedule and life curve, and battery management after the vehicle leaves the factory is largely carried out on this basis. Because driving conditions are complex, different vehicle conditions and driving behaviors strongly affect battery performance. Empirical data serve only as a reference and cannot effectively guide maintenance under real conditions. What is currently lacking is a data-driven methodology for analyzing battery use so as to obtain indicators such as whether a fault is present and the remaining battery life.

Summary of the invention

To solve this problem, the present invention provides a data-driven predictive maintenance method, namely an applied analysis system for predictive maintenance of electric vehicle batteries built on big data machine learning.

To solve the above problem, the present invention provides a predictive maintenance method for electric vehicle batteries, the method comprising:

Step 001, data preparation step: data related to the use of the electric vehicle battery are obtained. These data include fault and repair data and battery usage data. The fault and repair data include data records from before a battery failure and/or battery repair data. The battery usage data include the battery's own data and the vehicle condition data recorded during normal use. Both the fault and repair data and the battery usage data are streaming data based on time series.

Step 002, data preparation step: the data related to the use of the electric vehicle battery are cleaned, and the cleaned data are constructed on the basis of a time unit. Cleaning includes: assigning missing variables the mean or median of that variable over one trip, or a value interpolated from neighboring samples; setting a threshold for each variable of the battery usage data, checking whether the data meet it, and deleting or correcting data outside the normal range; and defining mutual constraints and dependencies among the battery usage data, then deleting or correcting data that are logically unreasonable or mutually contradictory. Data construction includes integrating the other collected data in chronological order.

Step 003, data characterization step: the data obtained by the data preparation step are summarized and extracted to obtain the characterized data. Summarizing and extracting include rolling aggregation, in which a time window is set and an aggregate value of a predetermined variable is calculated within that window; the aggregate value may be the sum, the mean, or the standard deviation of the data. Summarizing and extracting further include extending the characteristic variables: a corresponding number of variables is added from the rolling mean of the initial characteristic variables, and a corresponding number is added from their rolling standard deviation.

Step 004, model establishment step: an adaptive model for predictive battery maintenance is established from the characterized data. The predictive maintenance problem is decomposed into a first subproblem, whether the battery is about to fail, and a second subproblem, how long it will be until the battery fails. For the first subproblem, the adaptive model in this embodiment is established with a binary classification model. For the second subproblem, the adaptive model is established with a regression model.

Step 005, training and verification step: the adaptive model is trained and verified in order to optimize it. The training and verification step preferably includes cross-validation. The original data are first randomly divided into K parts. One of the K parts is selected as test data and the remaining K-1 parts serve as training data, which yields an experimental result. Another part is then selected as test data with the remaining K-1 parts as training data, and so on, until the cross-check has been repeated K times. Each experiment selects a different part as test data, so that every one of the K parts serves as test data once while the remaining K-1 parts serve as training data. Finally, the K experimental results are averaged, and the optimal data classification is determined from the experimental results.

Step 006, algorithm evaluation step: the prediction results for the data under different algorithms are evaluated, and the optimal algorithm is selected on that basis. The evaluation includes precision evaluation, recall evaluation, or comprehensive evaluation index evaluation. Precision is the proportion of predicted results that match what actually occurred; precision evaluation selects the algorithm with the largest value. Recall is the proportion of actual occurrences that were predicted correctly; recall evaluation selects the algorithm with the largest value. The comprehensive evaluation index is $F = \frac{(\alpha^2 + 1)\,P\,R}{\alpha^2 P + R}$, where α is a calculation parameter, P is the precision, and R is the recall. The result F obtained for each algorithm is used to judge the superiority of the different algorithms in different environments.

The method identifies battery failure and remaining life as the core problems in electric vehicle battery management. For these core problems it acquires and calibrates data, then performs data integration and feature engineering: the data definitions are made explicit, preliminary processing is carried out, and features and labels are defined by predefined rules. Finally, model training and evaluation are performed: data are imported, different machine learning models are applied, and different algorithms are selected, matched, and verified. The result is published as a structured product, and as time passes and data accumulate, the prediction accuracy of the model can improve continuously.

Brief description of the drawings

Fig. 1 shows an embodiment of the electric vehicle battery predictive maintenance method;

Fig. 2 is a system structure diagram of the invention;

Fig. 3 is a block diagram of the big data machine learning of the invention;

Fig. 4 is a schematic diagram of rolling aggregation in the invention.

Specific embodiments

Specific embodiments of this patent are described in detail below with reference to the accompanying drawings. It should be noted that these embodiments are only examples of preferred technical solutions of the present invention and shall not be construed as limiting its scope of protection.

Fig. 1 shows the steps of the electric vehicle battery predictive maintenance method in one specific embodiment of this patent, in which:

Step S001, data preparation step: data related to the use of the electric vehicle battery are obtained.

In this step, the electric vehicle battery data include fault and repair data and battery usage data. The fault and repair data include data records from before a battery failure and/or battery repair data. The battery usage data include the battery's own data and the vehicle condition data recorded during normal use.

Both the fault and repair data and the battery usage data are streaming data based on time series, including but not limited to voltage, current, and remaining capacity (SOC). An illustrative, non-exhaustive example of the data content is shown in the following table.

S002, data preparation step: the data related to the use of the electric vehicle battery are cleaned, and the cleaned data are then constructed on the basis of a time unit.

Because this embodiment is realized mainly through data processing, high-quality data help improve the accuracy of the results, so the acquired data must first be prepared. Data preparation begins with data cleaning: the present invention formulates cleaning rules that convert low-quality data into data that meet the data quality requirements. The cleaning rules include:

Missing-value assignment: battery data are easily disrupted during transmission, which leaves variables missing. In the present invention, a missing variable is mainly assigned the mean or median of that variable over one trip, or a value interpolated from neighboring samples.

Erroneous-value removal: a reasonable value range, i.e. a threshold, is set for each variable of the battery usage data; the data are checked against it, and data outside the normal range are deleted or corrected.

Cross-check: mutual constraints and dependencies among the battery usage data are defined, and data that are logically unreasonable or mutually contradictory are deleted or corrected.
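The three cleaning rules can be sketched as follows. This is a minimal illustration rather than the patented implementation: the variable names, the value ranges in `VALID_RANGE`, the sign convention (positive current means discharge), and the cross-check rule are all assumptions chosen for the example.

```python
from statistics import median

# Hypothetical valid ranges (thresholds) per variable; real limits depend on the battery.
VALID_RANGE = {"voltage": (250.0, 420.0), "current": (-400.0, 400.0), "soc": (0.0, 100.0)}


def fill_missing(values):
    """Missing-value assignment: replace None with the median over the trip."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def within_threshold(record):
    """Erroneous-value check: every variable must lie inside its valid range."""
    return all(lo <= record[name] <= hi for name, (lo, hi) in VALID_RANGE.items())


def passes_cross_check(record):
    """Cross-check (assumed rule): an empty battery cannot deliver discharge current."""
    return not (record["soc"] == 0.0 and record["current"] > 0.0)


def clean_trip(records):
    """Apply the three rules to one trip, given as a list of dict records."""
    columns = {name: fill_missing([r[name] for r in records]) for name in VALID_RANGE}
    filled = [{name: columns[name][i] for name in VALID_RANGE} for i in range(len(records))]
    return [r for r in filled if within_threshold(r) and passes_cross_check(r)]


trip = [
    {"voltage": 360.0, "current": 50.0, "soc": 80.0},
    {"voltage": None, "current": 55.0, "soc": 79.0},   # missing voltage
    {"voltage": 900.0, "current": 52.0, "soc": 78.0},  # voltage outside the valid range
    {"voltage": 358.0, "current": 60.0, "soc": 0.0},   # fails the cross-check
    {"voltage": 356.0, "current": 48.0, "soc": 77.0},
]
cleaned = clean_trip(trip)
```

The sketch deletes offending records; the text equally allows correcting them, which would replace the final filter with a repair step.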

After the data are cleaned, data construction is performed on the basis of a time unit; that is, the other collected data are integrated in chronological order. The time unit may be milliseconds, seconds, minutes, and so on, and it need not match the collection frequency.
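As a sketch of time-unit data construction, the following groups samples collected at different frequencies into one row per time unit. The `(timestamp_ms, name, value)` sample layout and the choice to average several samples within one unit are assumptions made for illustration.

```python
from collections import defaultdict


def build_by_time_unit(samples, unit_ms):
    """Group (timestamp_ms, name, value) samples into rows, one per time unit.

    Several samples of one variable inside the same unit are averaged.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for timestamp_ms, name, value in samples:
        buckets[timestamp_ms // unit_ms][name].append(value)
    rows = []
    for unit_index in sorted(buckets):  # chronological order
        row = {"t": unit_index * unit_ms}
        for name, values in buckets[unit_index].items():
            row[name] = sum(values) / len(values)
        rows.append(row)
    return rows


samples = [
    (0, "voltage", 360.0), (500, "voltage", 358.0),  # voltage sampled every 500 ms
    (0, "speed", 40.0),                                # speed sampled every 1000 ms
    (1000, "voltage", 357.0), (1000, "speed", 42.0),
]
rows = build_by_time_unit(samples, unit_ms=1000)
```

Here the time unit (1 s) is coarser than the voltage sampling period, which illustrates that the time unit and the collection frequency need not coincide.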

After data construction is complete, the data constructed on the time unit must be evaluated and corrected. The evaluation includes screening out erroneous data, i.e. data that are themselves faulty, including but not limited to missing values, outliers, time-period errors, and calculation-specification errors. After the evaluation, the erroneous data are corrected. For missing values, null values are set to 0 to fill in the missing data. For outliers, negative values are set to 0 to avoid errors during training. For values with a time-period error, the correct time period is determined, and the data are adjusted and rerun. For values with a calculation-specification error, the calculation basis is clarified, and the data are adjusted and rerun.

S003, data characterization step: the data obtained by the data preparation step are summarized and extracted to obtain the characterized data.

The subsequent processing steps need to process and compute on the data. To make computation and the identification of data features easier, the prepared data are first characterized so that their various features are exposed.

In this step, summarizing and extracting the data include rolling aggregation. Rolling aggregation means setting a time window and calculating an aggregate value of a predetermined variable within that window; the aggregate value may be the sum, the mean, or the standard deviation of the data. As shown in Fig. 4, for a node t1 with the time window set to 3, the rolling aggregate is the sum, mean, or standard deviation computed over the 3-node window associated with node t1.
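Rolling aggregation with a window of 3 can be sketched as below. The sketch assumes a trailing window that ends at the current node and includes it, and it uses the population standard deviation; the text does not fix either choice, and the voltage series is made up.

```python
from statistics import mean, pstdev


def rolling_aggregate(series, window, func):
    """Apply func to each trailing window of `window` values.

    Positions that do not yet have a full window yield None.
    """
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(func(series[i + 1 - window : i + 1]))
    return out


voltage = [360.0, 358.0, 356.0, 357.0, 353.0]
rolling_sum = rolling_aggregate(voltage, 3, sum)
rolling_mean = rolling_aggregate(voltage, 3, mean)
rolling_std = rolling_aggregate(voltage, 3, pstdev)
```

For the third node, the window covers 360, 358, and 356, so the rolling sum is 1074 and the rolling mean is 358.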

In this step, giving the learning algorithm better, or even additional, learning and predictive ability requires more variables. The invention therefore summarizes and extracts features from the time-series battery data, extending the initial characteristic variables from S001. For example, suppose step S001 yields 65 characteristic variables. In this example the extension mainly adds two classes of data: the first class adds 65-2=63 variables derived from the rolling mean of the initial 65 characteristic variables; the second class adds 65-2=63 variables derived from the rolling standard deviation of the initial 65 characteristic variables. The final number of variables is therefore 65+63+63=191. The additional variables help give the learning algorithm better predictive ability.
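The variable count 65+63+63=191 can be reproduced with a short sketch. The text does not say which two of the 65 variables receive no rolling features; the sketch assumes they are an identifier and a timestamp, and all column names are hypothetical.

```python
def extend_feature_names(names, excluded):
    """Return the original names plus rolling-mean and rolling-std names.

    `excluded` holds the columns that get no rolling features (assumed here
    to be an identifier and a timestamp).
    """
    numeric = [n for n in names if n not in excluded]
    return names + [f"{n}_roll_mean" for n in numeric] + [f"{n}_roll_std" for n in numeric]


original = ["vehicle_id", "timestamp"] + [f"x{i}" for i in range(1, 64)]  # 65 columns
extended = extend_feature_names(original, excluded={"vehicle_id", "timestamp"})
print(len(original), len(extended))
```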

S004, data calculation step: an adaptive model for predictive battery maintenance is established from the characterized data.

The predictive battery maintenance problem can be decomposed into two subproblems. The first is whether the battery is about to fail; the second is how long it will be until the battery fails. Different models and algorithms can be used to predict each.

For the question of whether the battery is about to fail, this embodiment uses a binary classification model to establish the adaptive model for predictive battery maintenance.

Specifically, the input battery data are denoted x, and the target, whether the battery is about to fail, is denoted y. Each y takes one of only two values: y=1 means a failure will occur, and y=0 means no failure will occur.

The binary classification model is then y=f(x), where f is the specific algorithm that maps the battery data x to the target y.

When the model is trained with the initial training data, the initial training data set must be labeled: data in which a failure occurred are the positive class (label 1), and data from normal operation are the negative class (label 0). This establishes the model y=f(x) of whether the next cycle will see a possible failure or normal operation, where y indicates whether the battery is about to fail, x is the battery data, and f is the specific algorithm.
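The labeling of the training set can be sketched as follows. The length of the "next cycle" is not specified in the text, so the sketch introduces a hypothetical `horizon_days` parameter: a record is labeled 1 when the recorded failure falls within that horizon, and 0 otherwise.

```python
def label_records(days_in_use, failure_day, horizon_days):
    """Label 1 when the failure falls within `horizon_days` after the record, else 0.

    `failure_day` is None for a battery with no recorded failure.
    """
    labels = []
    for day in days_in_use:
        if failure_day is not None and 0 <= failure_day - day <= horizon_days:
            labels.append(1)  # positive class: failure in the next cycle
        else:
            labels.append(0)  # negative class: normal operation
    return labels


labels = label_records([270, 290, 295, 300], failure_day=300, horizon_days=7)
healthy = label_records([270, 290], failure_day=None, horizon_days=7)
```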

The specific algorithm f may be selected from logistic regression, boosted decision trees, decision forests, and neural networks.

The logistic regression algorithm assumes that the instances of the classes are linearly separable and obtains the final prediction model by directly estimating the parameters of the discriminant. For the electric vehicle predictive maintenance data, consider a vector $x' = (x_1, x_2, \dots, x_p)$ of p independent variables, and let the conditional probability $P(Y=1 \mid x) = p$ be the probability that the event occurs given the observations. Like linear regression, logistic regression needs a hypothesis function; this algorithm introduces the sigmoid function $\pi(x) = \frac{1}{1 + e^{-x}}$, whose domain is $(-\infty, +\infty)$ and whose range is $(0, 1)$. With these definitions, the formula used by the logistic regression algorithm is $P(Y=1 \mid x) = \pi(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}}$, where $\beta_0, \dots, \beta_p$ are the discriminant parameters to be estimated.
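A minimal sketch of logistic regression follows, assuming plain batch gradient descent on the log-loss as the parameter estimation method (the text does not name one). The two features and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """pi(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """P(Y = 1 | x) for one feature vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(xs, ys, lr=0.5, epochs=2000):
    """Estimate the weights by batch gradient descent on the log-loss."""
    weights, bias = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(weights, bias, x) - y
            grad_b += error
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
        bias -= lr * grad_b / len(xs)
        weights = [w - lr * g / len(xs) for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy data (hypothetical, scaled to [0, 1]): [temperature, internal resistance]
xs = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
ys = [0, 0, 0, 1, 1, 1]
weights, bias = train(xs, ys)
will_fail = predict_proba(weights, bias, [0.9, 0.9]) > 0.5
```

Thresholding the probability at 0.5 turns the model into the binary classifier y=f(x); a different threshold would trade precision against recall.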

The boosted decision tree algorithm combines the hierarchical data structure of the decision tree's divide-and-conquer strategy with the classification rules produced by the initial classification. In each round it slightly increases the weight of the data misclassified in the previous round and classifies again; iterating in this loop yields the target result.

If D is the partition of the training tuples by class, the entropy of D is expressed as:

$info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i$

where $p_i$ is the probability that the i-th class occurs among all training tuples, which can be estimated as the number of elements belonging to that class divided by the total number of training tuple elements. In practical terms, the entropy is the average amount of information needed to identify the class label of a tuple in D. For this prediction method, D is the battery failure status, which has two states, faulty and normal, so m=2.

If the training tuples D are partitioned by an attribute A, where A is one feature of the characterized battery data, the expected information of partitioning D by A is $info_A(D) = \sum_{j=1}^{V} \frac{|D_j|}{|D|}\, info(D_j)$, where j denotes a particular category of attribute A, V is the total number of categories of attribute A, and $D_j$ is the subset of D that falls into category j. The information gain of attribute A is the difference between the two: $gain(A) = info(D) - info_A(D)$. At each split, the information gain of every attribute in the battery training tuples is calculated, and the attribute with the largest gain ratio is selected for the split. In this way a decision tree capable of electric vehicle predictive maintenance is formed.
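The entropy and information gain formulas can be checked with a small sketch. The discretized temperature attribute and the failure labels are made-up illustration data.

```python
import math
from collections import Counter


def info(labels):
    """Entropy info(D) = -sum(p_i * log2(p_i)) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def info_after_split(attribute_values, labels):
    """Expected information info_A(D) after partitioning D by attribute A."""
    partitions = {}
    for value, label in zip(attribute_values, labels):
        partitions.setdefault(value, []).append(label)
    total = len(labels)
    return sum(len(part) / total * info(part) for part in partitions.values())


def gain(attribute_values, labels):
    """Information gain gain(A) = info(D) - info_A(D)."""
    return info(labels) - info_after_split(attribute_values, labels)


# Hypothetical discretized feature: rolling-mean temperature band per record
temperature_band = ["high", "high", "high", "low", "low", "low", "low", "low"]
failed = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(info(failed), 3), round(gain(temperature_band, failed), 3))
```

With two classes (m=2), the entropy reaches its maximum of 1 bit when faulty and normal records are equally frequent, and an attribute that separates the classes perfectly has a gain equal to the full entropy.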

A decision forest is a forest made up of multiple decision trees, and the classification result of the algorithm is obtained by a vote among these trees. While each tree is generated, randomness is added in both the row direction and the column direction: in the row direction, the training data for each tree are drawn by sampling with replacement (bootstrapping); in the column direction, a feature subset is drawn by random sampling without replacement, and the optimal split point is found on that subset. A decision forest is an ensemble model whose members are still decision trees. Unlike classification with a single decision tree, a decision forest classifies by the vote of multiple decision trees, which makes the algorithm less prone to overfitting.

A neural network uses its algorithmic characteristics to simulate the second mode of human thinking. It is a nonlinear dynamical system that can perform parallel, cooperative processing even though the structure of a single neuron is extremely simple. In a neural network, the output layer corresponds to different cost functions in different scenarios. In this method the output layer consists of K logistic regressions, and the cost function of the whole network is the sum of the cost functions of these K logistic regression models. Electric vehicle battery failures can be predicted with this cost function, and the cost function is assessed according to the S006 algorithm evaluation.
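The row and column randomness and the voting described above can be sketched with a deliberately small forest. To stay short, each "tree" here is a one-level stump chosen by lowest split entropy, which is a simplification and not the full tree-growing procedure; the data, the number of trees, and the seed are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """One-level tree: pick the (feature, threshold) with the lowest split entropy."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:  # sample cannot be split: always predict its majority class
        return (features[0], math.inf, majority(labels), majority(labels))
    return best[1:]


def fit_forest(rows, labels, n_trees, n_features, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]          # rows: with replacement
        features = rng.sample(range(len(rows[0])), n_features)  # columns: without replacement
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], features))
    return forest


def predict(forest, row):
    """Majority vote over all trees in the forest."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


rows = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
        [0.8, 0.9], [0.9, 0.7], [0.7, 0.8], [0.9, 0.9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels, n_trees=15, n_features=1)
```

Because every tree sees a different bootstrap sample and feature subset, individual trees may err, but the vote averages those errors out, which is the reason given above for the reduced tendency to overfit.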

For the question of how long it will be until the battery fails, this embodiment uses a regression model to establish the adaptive model for predictive battery maintenance.

A regression model determines the relationships between variables from a set of sample data, applies various statistical tests to the reliability of these relationships, and identifies which of the many variables that affect a particular variable have a significant influence and which do not.

The time of failure is taken as Y, and each battery record is labeled with the time remaining until the failure. For example, when a battery has been in use for 5 days and its failure time is 300, the remaining time represented by the label is 300-5=295. As another example, when a battery has been in use for 10 days and its failure time is 280, the remaining time represented by the label is 280-10=270. In this way every sample has a remaining usable time. The specific labels are shown in the following table:

The input battery data are denoted x, and the regression model is Y=f(x). The specific algorithm f used by the regression model includes decision forest regression, boosted decision tree regression, Poisson regression, and neural network regression.
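The remaining-time labeling used for the regression target can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the usage time and the failure time are expressed in the same unit.

```python
def remaining_life_labels(times_in_use, failure_time):
    """Label Y for each record: time left until the recorded failure."""
    return [failure_time - t for t in times_in_use]


first = remaining_life_labels([5], failure_time=300)
second = remaining_life_labels([10], failure_time=280)
print(first, second)
```

These two calls reproduce the two worked examples in the text, 300-5=295 and 280-10=270.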

Boosted decision tree regression and decision forest regression are each built from one or more decision trees and are combinations of decision trees. As with the decision-tree-based algorithms used for the question of whether the battery is about to fail, the regression model for how long it will be until the battery fails also uses information gain to judge the quality of boosted decision tree regression and decision forest regression, through the difference $gain(A) = info(D) - info_A(D)$.

For Poisson regression, modeling uses the Poisson regression model that is widely documented in the prior art.

A neural network is an algorithm, widely documented in the prior art, that simulates human thinking. In a neural network, the output layer corresponds to different cost functions in different scenarios. In this method, the output layer may consist of K logistic regressions, and the cost function of the whole network is the sum of the cost functions of these K logistic regression models.

S005, training and verification step: the adaptive model is trained and verified in order to optimize it.

Once the above models have been established, training and verification are needed to optimize them and so improve their accuracy.

In this embodiment, the training and verification step preferably includes cross-validation and minority-class sampling.

Cross-validation optimizes the parameter framework of each model. The reliability of the classification models described above (logistic regression, boosted decision trees, decision forests, and neural networks) and of the regression models (decision forest regression, boosted decision tree regression, Poisson regression, and neural network regression) depends on the parameter framework, that is, on which battery data are most effective for the results produced.

In this embodiment, to improve the quality of the parameter framework, the original data are first randomly divided into K parts. One of the K parts is selected as test data and the remaining K-1 parts serve as training data, which yields an experimental result. Another part is then selected as test data with the remaining K-1 parts as training data, and so on, until the cross-check has been repeated K times. Each experiment selects a different part as test data, so that every one of the K parts serves as test data once while the remaining K-1 parts serve as training data. Finally, the K experimental results are averaged; the experimental results may include precision, recall, the comprehensive evaluation index, and so on. Depending on the purpose of each predictive maintenance task, one of the three averages (precision, recall, or the comprehensive evaluation index) is chosen to determine the optimal classification, which completes the training of the model.
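The K-fold procedure can be sketched as follows. The `fit` and `score` callables stand in for any of the models and metrics above; the majority-class "model" and the fraction-correct score used here are placeholders for illustration only.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Randomly split the sample indices into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k, fit, score):
    """Average the score over k rounds; each fold is the test set exactly once."""
    folds = k_fold_indices(len(xs), k)
    results = []
    for test_fold in folds:
        held_out = set(test_fold)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        results.append(score(model, [xs[i] for i in test_fold], [ys[i] for i in test_fold]))
    return sum(results) / k


# Placeholder model (an assumption for illustration): always predict the majority class.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)


def fraction_correct(model, test_x, test_y):
    return sum(1 for y in test_y if y == model) / len(test_y)


xs = list(range(10))
ys = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
folds = k_fold_indices(len(xs), 5)
mean_score = cross_validate(xs, ys, k=5, fit=fit_majority, score=fraction_correct)
```

Running the same loop for each candidate model or parameter setting and comparing the averaged scores is one way to pick the configuration, in the spirit of the selection described above.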

Minority-class sampling is used when one class of data has only a small number of training samples, so that the data set is imbalanced. In this embodiment, when one class of data has only a few training samples, the model can be trained by synthesizing new minority-class samples from the few fault samples. For example, if only a small number of fault records are found when the battery data are collected, data synthesis is needed to generate more data for machine learning from those few fault records. Specifically, for each minority-class sample A, a sample B is selected at random from among its nearest neighbors, where the distance is calculated from the distance on the time-variable plot. A point is then chosen at random on the line between A and B as a newly synthesized minority-class sample. Through repeated synthesis, the small sample set A becomes a data-rich sample set A+ that meets the data requirements of predictive maintenance, so that the computation does not overfit or become distorted because of data imbalance.
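The synthesis step can be sketched as below. Each sample is assumed to be a `[time, value]` point, the Euclidean distance stands in for "distance on the time-variable plot", and the number of neighbors `k` is a hypothetical parameter; the fault records are made up.

```python
import math
import random


def synthesize_minority(samples, n_new, k=2, seed=0):
    """Create n_new points, each on the segment between a sample A and one of
    its k nearest neighbours B, chosen at random."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(samples)
        others = [s for s in samples if s is not a]
        neighbours = sorted(others, key=lambda s: math.dist(a, s))[:k]
        b = rng.choice(neighbours)
        t = rng.random()  # random position along the line from A to B
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


# Hypothetical fault records: [time in days, cell voltage]
faults = [[100.0, 3.1], [102.0, 3.0], [110.0, 2.8], [111.0, 2.7]]
new_points = synthesize_minority(faults, n_new=6)
augmented = faults + new_points
```

Because every synthetic point is an interpolation between two real fault records, it stays inside the region already occupied by the minority class rather than duplicating existing records.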

S006, algorithm evaluation step: the prediction results for the data under different algorithms are evaluated, and the optimal algorithm is selected on that basis.

In predictive battery maintenance, different prediction targets or different data sources lead to different results under different algorithms, so the better algorithm must be selected for each situation.

In electric vehicle battery predictive maintenance, precision (Precision), recall (Recall), or the comprehensive evaluation index (F1-Measure) can usually be used to evaluate the prediction results, comparing whether the results obtained by different algorithms in different situations are optimal so as to select the optimal algorithm.

Precision is the proportion of samples predicted by the model to fail that actually failed; the higher the better. Recall is the proportion of samples that actually failed that were predicted correctly; the higher the better.

In predictive battery maintenance, the two usually conflict. To make the choice of the better algorithm more reasonable, this embodiment preferably uses the F1-Measure comprehensive evaluation index, a weighted average that combines precision and recall; the higher its value the better. The formula is $F = \frac{(\alpha^2 + 1)\,P\,R}{\alpha^2 P + R}$, where P is the precision and R is the recall. When the parameter α=1, it becomes the most common form, F1, namely $F_1 = \frac{2PR}{P + R}$. The result F or F1 obtained for each algorithm is used to judge the superiority of the different algorithms in different environments. For example, for a particular set of data and prediction target, a comparison after calculation may show that the best results are obtained by selecting the boosted decision tree algorithm among the classification models and the neural network regression algorithm among the regression models.
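The three evaluation measures can be computed as in the following sketch, which uses the standard F-measure form consistent with the α=1 case stated above. The label vectors are made-up examples, with 1 meaning failure.

```python
def precision_recall(predicted, actual):
    """Precision and recall for the positive class (label 1 = failure)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision, recall, alpha=1.0):
    """F = (alpha^2 + 1) * P * R / (alpha^2 * P + R); alpha = 1 gives F1."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    return (alpha**2 + 1) * precision * recall / (alpha**2 * precision + recall)


actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 1, 0, 0, 0, 1]
p, r = precision_recall(predicted, actual)
f1 = f_measure(p, r)
```

Computing F for the predictions of each candidate algorithm on the same data and taking the largest value corresponds to the selection rule described above.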

Claims

1. A predictive maintenance method for electric vehicle batteries, characterized in that the method comprises:

Step 001, data preparation step: obtaining data related to the use of the electric vehicle battery;

the data related to the use of the electric vehicle battery include fault and repair data and battery usage data; the fault and repair data include data records from before a battery failure and/or battery repair data; the battery usage data include the battery's own data and the vehicle condition data recorded during normal use; both the fault and repair data and the battery usage data are streaming data based on time series;

Step 002, data preparation step: cleaning the data related to the use of the electric vehicle battery and constructing the cleaned data on the basis of a time unit; the cleaning includes assigning missing variables the mean or median of that variable over one trip, or a value interpolated from neighboring samples; setting a threshold for each variable of the battery usage data, checking whether the data meet it, and deleting or correcting data outside the normal range; and defining mutual constraints and dependencies among the battery usage data, and deleting or correcting data that are logically unreasonable or mutually contradictory; the data construction includes integrating the other collected data in chronological order;

Step 003, data characterization step: summarizing and extracting the data obtained by the data preparation step to obtain the characterized data; the summarizing and extracting include rolling aggregation, which means setting a time window and calculating an aggregate value of a predetermined variable within that window, the aggregate value being the sum, the mean, or the standard deviation of the data; the summarizing and extracting further include extending the characteristic variables, the extension including adding a corresponding number of variables from the rolling mean of the initial characteristic variables and adding a corresponding number of variables from the rolling standard deviation of the initial characteristic variables;

Step 004, model establishment step: establishing an adaptive model for predictive battery maintenance from the characterized data; the predictive battery maintenance problem is decomposed into a first subproblem, whether the battery is about to fail, and a second subproblem, how long it will be until the battery fails; for the first subproblem, the adaptive model is established with a binary classification model; for the second subproblem, the adaptive model is established with a regression model;

Step 005, training and verification step: training and verifying the adaptive model in order to optimize it; the training and verification step includes cross-validation, in which the original data are first randomly divided into K parts; one of the K parts is selected as test data and the remaining K-1 parts serve as training data, yielding an experimental result; another part is then selected as test data with the remaining K-1 parts as training data, and so on, until the cross-check has been repeated K times; each experiment selects a different part as test data, so that every one of the K parts serves as test data once while the remaining K-1 parts serve as training data; finally the K experimental results are averaged; the optimal data classification is determined from the experimental results;

Step 006, algorithm evaluation step: evaluating the prediction results for the data under different algorithms and selecting the optimal algorithm on that basis; the evaluation includes precision evaluation, recall evaluation, or comprehensive evaluation index evaluation; precision is the proportion of predicted results that match what actually occurred, and precision evaluation selects the algorithm with the largest value; recall is the proportion of actual occurrences that were predicted correctly, and recall evaluation selects the algorithm with the largest value; the comprehensive evaluation index is $F = \frac{(\alpha^2 + 1)\,P\,R}{\alpha^2 P + R}$, where α is a calculation parameter, P is the precision, and R is the recall; the result F obtained for each algorithm is used to judge the superiority of the different algorithms in different environments.

2. The predictive maintenance method for electric vehicle batteries according to claim 1, characterized in that, after data construction is complete, the data constructed on the time unit are evaluated and corrected; the evaluation includes screening out data that are themselves erroneous; after the evaluation, the erroneous data are corrected; the correction includes: for missing values, setting the missing values to 0; for outliers, setting negative values to 0; for values with a time-period error, determining the correct time period, then adjusting and rerunning the data; for values with a calculation-specification error, clarifying the calculation basis, then adjusting and rerunning the data.

3. The predictive maintenance method for electric vehicle batteries according to any one of claims 1-2, characterized in that the binary classification model includes: denoting the input battery data as x; denoting the target, whether the battery is about to fail, as y, where y=1 means a failure will occur and y=0 means no failure will occur; the binary classification model is y=f(x), where f is the specific algorithm; the specific algorithm includes logistic regression, boosted decision trees, decision forests, and neural networks.

4. The predictive maintenance method for electric vehicle batteries according to claim 3, characterized in that, in the regression model, the time of failure is taken as Y and each battery record is labeled with the time remaining until the failure; the input battery data are denoted x; the regression model is Y=f(x), where f is the specific algorithm, including decision forest regression, boosted decision tree regression, Poisson regression, and neural network regression.

5. The predictive maintenance method for electric vehicle batteries according to claim 4, characterized in that step 005 further includes training the model with minority-class sampling: when one class of data in the samples has only a few training samples, the model is trained by synthesizing new minority-class samples from the few fault samples; for each minority-class sample A, a sample B is selected at random from among its nearest neighbors, the distance being calculated from the distance on the time-variable plot, and a point is then chosen at random on the line between A and B as a newly synthesized minority-class sample; through repeated synthesis, the small sample set A becomes a data-rich sample set A+.