CN106845526A

CN106845526A - Associated-parameter fault classification method based on big-data fusion cluster analysis

Info

Publication number: CN106845526A
Application number: CN201611247433.2A
Authority: CN
Inventors: 董云帆; 房红征; 樊焕贞; 高健; 熊毅; 李蕊
Original assignee: Beijing Aerospace Measurement and Control Technology Co Ltd
Current assignee: Beijing Aerospace Measurement and Control Technology Co Ltd
Priority date: 2016-12-29
Filing date: 2016-12-29
Publication date: 2017-06-13
Anticipated expiration: 2036-12-29
Also published as: CN106845526B

Abstract

The invention provides an associated-parameter fault classification method based on big-data fusion cluster analysis. The method selects fault data from the mass of equipment operating data according to diagnostic rules and performs supervised autonomous machine clustering on them, producing an automatic classification of associated-parameter faults. It addresses the over-reliance of current equipment fault diagnosis on expert knowledge bases, which overlook the associations between deeply and nonlinearly coupled parameters across subsystems, and the fact that the large volume of valid data produced by equipment in service is not mined effectively. Because the method does not depend on precise physical modeling of the target equipment, it avoids the difficulty of modeling traditional complex systems. It achieves intelligent fault classification and associated-parameter analysis based on mass data mining, with a fault classification capability of controllable accuracy.

Description

Associated-parameter fault classification method based on big-data fusion cluster analysis

Technical field

The present invention relates to the field of equipment fault prognostics and health management (PHM), and in particular to an associated-parameter fault classification method based on big-data fusion cluster analysis.

Background technology

Fault prognostics and health management has become an important supporting technology and foundation for system logistics support, maintenance and autonomous health management in the aerospace field. The "National Program for Medium- and Long-Term Scientific and Technological Development (2006-2020)" lists "life prediction technology for major products and major facilities" as a frontier technology, and recent development reports for the astronautics and aeronautics disciplines class PHM as a key supporting technology.

PHM has become a popular interdisciplinary research direction spanning basic materials, mechanical structures, energy, electronics, automatic testing, reliability and information technology, with significant practical value. In most industrial PHM applications, establishing a mathematical or physical model of a complex component or system is very difficult or even impossible, or identifying the model parameters is complicated. Test data from each stage of a component or system (design, simulation, operation and maintenance) and historical sensor data therefore become the main means of understanding how system performance degrades.

Consequently, PHM methods based on mining test data or historical sensor data have received growing attention and developed rapidly, becoming an important research focus in the PHM field. For complex systems such as aerospace systems in particular, it is difficult to obtain or construct directly a physical model that characterizes component and system degradation and remaining life, while these systems and components have large amounts of condition-monitoring and test data available. Data-driven PHM methods have therefore attracted wide attention from the U.S. military, NASA and numerous research institutions and industrial enterprises.

Data-driven PHM methods use advanced sensor technology to collect characteristic parameters related to system properties, associate these parameters with useful information, and apply intelligent algorithms and models for detection, analysis and prediction. They yield the remaining-life distribution of the target system, its degree of performance degradation, or the probability of mission failure, thereby providing decision information for system maintenance and support.

Within data-driven PHM, issues such as method workflow, fusion of different methods, model selection and model suitability have become current research priorities, and data-driven PHM methods have been widely applied and promoted thanks to their flexibility and ease of use.

The content of the invention

An object of the present invention is to solve the technical problem that fault data are difficult to acquire in existing data-driven PHM methods. To that end, the invention provides an associated-parameter fault classification method based on big-data fusion cluster analysis, to remedy the situation in which the information-rich operating data of existing complex equipment are neither mined nor used effectively.

To achieve the above object, the invention provides a complete algorithm flow that performs the computation and analysis and produces the final fault classification and parameter association probability model. The associated-parameter fault classification method comprises:

Step 1) Obtain the various operating data of the target equipment.

Step 2) Based on the design data of the target equipment, establish a parameter diagnostic rule base covering all parameters of the target equipment. The rule base contains not only threshold rules for the parameters but also trend rules and jump rules.

Step 3) Using the rules in the parameter diagnostic rule base as the criterion, screen all operating data from step 1) to obtain fault data, and collect all fault data into an unclassified fault data set.

Step 4) Perform supervised autonomous clustering on the unclassified fault data set with a clustering algorithm to obtain a suitable number of clusters and the cluster centers. The number of clusters is increased step by step from 2, and the smallest value at which the average weighted distance to the cluster centers no longer decreases is taken as the total number of clusters. The unclassified fault data set is then classified by the cluster centers so determined, giving the classified fault data set.

Step 5) Apply a map-reduce algorithm to the unclassified fault data set of step 3) to generate a parameter association probability model. For each parameter of the target equipment, the model contains the probability distribution of other parameters also being faulty when that parameter is faulty, arranged from highest to lowest probability.

Step 6) Using the classified fault data set of step 4) as the fault discrimination standard, apply a nearest-neighbor algorithm to the operating data obtained in step 1) to identify fault categories and obtain the fault classification result.

Step 7) Combine the fault classification result with the parameter association probability model of step 5) to obtain the comprehensive fault diagnosis and classification result, which comprises the fault classification result and the probability distribution data of all parameters for that result.

As a further improvement of the above technical solution, the operating data obtained in step 1) satisfy the following format: each complete data entry contains the time at which the entry occurred and the values of all parameters of the target equipment at that time; each single value in a data entry is the measured value of one parameter of the target equipment at a given time; and the data entries are arranged one by one in order of occurrence.

As a further improvement of the above technical solution, the fault data screened in step 3) satisfy the following format: each data entry contains the time at which it occurred and all parameters that were faulty at that time; and for each faulty parameter in an entry, the rule of the parameter diagnostic rule base that was triggered is marked.

As a further improvement of the above technical solution, the parameter diagnostic rule base contains upper and lower limits of the parameters, parameter jump anomaly rules, and parameter gradual-trend anomaly rules.

As a further improvement of the above technical solution, step 4) specifically comprises:

Step 101) Set the initial number of clusters K to 2 and cluster the unclassified fault data set with the current K, obtaining K cluster centers and their corresponding K clusters;

Step 102) Compute the mean silhouette coefficient of the K clusters and compare it with the mean silhouette coefficient of the K-1 clusters. If the two are unchanged, take the current K as the total number of clusters; otherwise set K=K+1 and re-execute step 101). Here the silhouette coefficient of a cluster is the average geometric distance from the vector points of all data entries in the cluster to the cluster center;

Step 103) Cluster the unclassified fault data set with the total number of clusters determined in step 102), and classify all fault data in the unclassified fault data set by the resulting cluster centers to obtain the classified fault data set.

As a further improvement of the above technical solution, obtaining the cluster centers in step 101) comprises:

Step 101-1) Randomly select from all operating data of the target equipment the vector point of one data entry as the first cluster center, and take the vector point with the nearest geometric distance to the first cluster center as the second cluster center;

Step 101-2) Compute the geometric distance Distance(x) from each vector point to its nearest cluster center, and add all distances Distance(x) to obtain the total distance Sum(Distance(x));

Step 101-3) Randomly draw a value Random that falls within the total distance Sum(Distance(x)), take the vector point of the data entry on which it lands as the newly added cluster center, and re-execute step 101-2) until K cluster centers have been selected.

As a further improvement of the above technical solution, step 5) specifically comprises:

Step 201) For each parameter in turn, map together all fault data entries that contain that parameter, forming the mapping class of that parameter; each mapping class comprises all fault data entries containing the parameter and their frequencies of occurrence;

Step 202) Compute the total number of fault data entries in each mapping class as the denominator of the probability calculation;

Step 203) In each mapping class, count the number of occurrences of each parameter other than the parameter to which the class corresponds, as the numerator of the probability calculation;

Step 204) Take the ratio of the numerator of step 203) to the denominator of step 202) to obtain, for each parameter, the probability distribution of the other parameters also being faulty when that parameter is faulty.

As a further improvement of the above technical solution, step 6) specifically comprises: computing the geometric distance between each operating data entry of step 1) and each determined cluster center, taking the minimum distance, and comparing it with the mean silhouette coefficient of the corresponding cluster; if the distance is smaller than the mean silhouette coefficient of that cluster, the operating data entry is judged to be of the fault type corresponding to that cluster.

The advantages of the associated-parameter fault classification method based on big-data fusion cluster analysis of the present invention are as follows:

The invention provides a clearly defined, practically operable and effective associated-parameter fault classification method based on fusion cluster analysis of mass data, which improves on the following technical problems of existing fault diagnosis methods:

1. Current equipment fault diagnosis relies too heavily on expert knowledge bases. When faced with a complex system, an expert knowledge base runs into combinatorial explosion, can hardly cover all fault cases and their associated parameters, and overlooks the nonlinear correlations between deeply coupled parameters across subsystems. The fault classification method of the invention uses data mining to uncover the parameter associations between different subsystems and their fault modes, and can thus effectively mitigate this problem.

2. Existing data-driven PHM methods are limited to component-level fault diagnosis. In system-level diagnosis of complex systems, because accurate modeling of the whole system is difficult, the different kinds of fault data mixed into the normal data are handled mainly by unsupervised machine-learning clustering; the resulting clusters contain both normal and fault data, and the fault data are poorly separated. Current data-driven fault diagnosis methods therefore perform well at component level but can hardly outperform model-driven methods at the level of complex systems. The fault classification method of the invention combines the advantages of data-driven and model-driven methods: it uses the existing model-based expert knowledge base to perform supervised classification of the equipment operating data (supervised by the interpretation results), which greatly improves the separability and convergence of the data and can remedy the poor classification performance of current data-driven PHM methods.

Brief description of the drawings

Fig. 1 is the overall flowchart of the associated-parameter fault classification method based on big-data fusion cluster analysis in an embodiment of the invention.

Figs. 2a-2d are plots of four repeated experiments performed to select the total number of clusters in an embodiment of the invention.

Fig. 3 is the operational flowchart of the clustering algorithm in an embodiment of the invention.

Fig. 4 is a diagram of the parameter association probability algorithm based on the map-reduce algorithm in an embodiment of the invention.

Specific embodiment

The associated-parameter fault classification method based on big-data fusion cluster analysis of the present invention is described in detail below with reference to the accompanying drawings and embodiments.

To address the over-reliance of current equipment fault diagnosis on expert knowledge bases, the difficulty such knowledge bases have in covering the nonlinear correlations between deeply coupled parameters across subsystems, the poor performance of existing data-driven methods in fault diagnosis of complex systems, and the fact that mass data are not mined effectively, the invention provides a clearly defined, practically operable and effective associated-parameter fault classification method based on fusion cluster analysis of mass data.

In this embodiment, the method provided by the invention is verified using the power supply system of a certain piece of equipment as an example. A comprehensive fault classification result is formed through data preprocessing, rule establishment, fault data screening, clustering, mapping and reduction.

First, an equipment operating data set is built from data sources such as the equipment's real-time operating data and fault-injection data, for training and validating the data-driven model. Second, a parameter diagnostic rule base is established for the target equipment, to interpret and detect parameter faults in real time during operation. Then, according to the diagnostic rule base, the mass data of the equipment's operation are interpreted and the data entries containing faulty parameters are separated out. Once the fault data have been separated, a supervised autonomous machine-learning clustering method is used to cluster the fault modes. The resulting clusters are used for fault determination; at the same time a fault parameter matrix is generated and the map-reduce (Map-Reduce) method is used for associated-parameter analysis, forming the analysis result. It follows that the fault classification method of the invention selects fault data from the mass of equipment operating data according to diagnostic rules and performs supervised autonomous machine clustering on them, producing an automatic classification of associated-parameter faults. This addresses the over-reliance of current equipment fault diagnosis on expert knowledge bases, which overlook the associations between deeply and nonlinearly coupled parameters across subsystems, and the fact that the mass of valid data produced by equipment in service is not mined effectively. Moreover, because the method does not depend on precise physical modeling of the target equipment, it avoids the difficulty of modeling traditional complex systems.

With reference to Fig. 1, the associated-parameter fault classification method specifically comprises:

Step 1) Obtain the various operating data of the target equipment. The operating data include fault-injection simulation data, analog simulation data, bus monitoring data, BIT and IETM data, maintenance and inspection records, existing sensor data, and so on.

Step 2) Analyze the target equipment on the basis of its related documentation and establish its parameter diagnostic rule base. The rule base should contain diagnostic rules for all parameters of the target equipment, including but not limited to: upper and lower limits (the extreme values of a parameter, beyond which a fault is declared); jump anomaly rules (which specify what counts as a large jump in a parameter's value within a short time, and fix the jump magnitude and fault criterion); and gradual-trend anomaly rules (fault criteria for abnormal trends, such as a gradual rise that abruptly turns into a gradual fall).

It should be noted that, to ensure the completeness of the final parameter association probability model, the minimum requirement on this rule base is that it contain an individual decision rule for every parameter. It is therefore not necessary to build an accurate physical model of the target equipment in order to derive expressions relating the parameters.
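The three rule types of step 2) can be illustrated with a minimal Python sketch. The record layout `ParamRule`, its field names, and the particular trend test (a run of rising samples followed by a run of falling ones) are assumptions made for illustration only; the patent does not specify how the rules are encoded.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ParamRule:
    """Hypothetical rule record for one parameter (field names are illustrative)."""
    low: float             # lower limit
    high: float            # upper limit
    max_jump: float        # largest allowed change between consecutive samples
    trend_window: int = 3  # number of steps checked on each side of a reversal


def check(rule: ParamRule, history: Sequence[float]) -> Optional[str]:
    """Return the name of the first rule triggered by the latest sample, or None."""
    x = history[-1]
    # Threshold rule: the value lies outside its upper and lower limits.
    if not (rule.low <= x <= rule.high):
        return "threshold"
    # Jump rule: the value changed too much since the previous sample.
    if len(history) >= 2 and abs(x - history[-2]) > rule.max_jump:
        return "jump"
    # Trend rule (assumed form): a steady rise that turns into a steady fall.
    w = rule.trend_window
    if len(history) >= 2 * w + 1:
        recent = history[-(2 * w + 1):]
        diffs = [b - a for a, b in zip(recent, recent[1:])]
        if all(d > 0 for d in diffs[:w]) and all(d < 0 for d in diffs[w:]):
            return "trend"
    return None
```

With one such record per parameter, the minimum completeness requirement stated above (an individual decision rule for every parameter) amounts to having an entry for each parameter name.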

Step 3) Provided the parameter diagnostic rule base is complete, use it as the basis for screening out the abnormal data entries in the mass operating data obtained in step 1). The diagnostic rules in the rule base can be entered into a computer so that the screening is performed automatically. The operating data should have the following format:

1. Each complete data entry should contain the exact time at which the entry occurred and the values of all parameters of the target equipment at that time;

2. Each single value in a data entry should be the measured value of one parameter of the target equipment at a given time;

3. The data entries are arranged one by one in order of occurrence.

The screened fault data should have the following format:

1. Each entry contains the exact time at which the data entry occurred;

2. Each entry contains all parameters that were faulty at that time, to facilitate the subsequent mapping and reduction;

3. For each faulty parameter in a data entry, the rule triggered by the fault (threshold rule, jump rule, etc.) is marked according to the parameter diagnostic rule base.
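As an illustration of the screening in step 3), the sketch below filters time-ordered entries and emits fault entries in the format just listed (time, faulty parameters, triggered rule). The parameter names, limits and the dictionary layout are hypothetical, and only threshold and jump rules are shown to keep the example short.

```python
from typing import Dict, List, Tuple

# Hypothetical rule base: parameter name -> (lower limit, upper limit, max jump).
RULES: Dict[str, Tuple[float, float, float]] = {
    "bus_voltage": (26.0, 30.0, 1.0),
    "battery_temp": (0.0, 45.0, 5.0),
}


def screen(entries: List[Tuple[float, Dict[str, float]]]) -> List[dict]:
    """Keep only the entries in which at least one parameter violates a rule.

    `entries` is time-ordered; each item is (timestamp, {parameter: value}).
    """
    faults = []
    previous: Dict[str, float] = {}
    for t, values in entries:
        triggered = {}
        for name, x in values.items():
            low, high, max_jump = RULES[name]
            if not (low <= x <= high):
                triggered[name] = "threshold"
            elif name in previous and abs(x - previous[name]) > max_jump:
                triggered[name] = "jump"
        if triggered:
            # Fault entry: time, all faulty parameters, and the rule each one triggered.
            faults.append({"time": t, "faults": triggered})
        previous = values
    return faults
```

The list returned by `screen` plays the role of the unclassified fault data set used in the following steps.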

The data obtained at this point are all fault data and have not yet been classified. Once the fault data have been obtained, they are clustered.

Step 4) Perform supervised autonomous clustering on the unclassified fault data set with a clustering algorithm; after obtaining a suitable number of clusters and the cluster centers, classify all fault data in the unclassified fault data set by the determined cluster centers to obtain the classified fault data set.

The clustering applies the K-Means method to the fault data separated in the previous step to perform autonomous machine clustering. The first and most important step is determining K, the number of cluster centers. The K cluster centers in effect represent K kinds of fault.

The invention chooses K by optimizing the silhouette coefficient. The silhouette coefficient of a cluster is the average geometric distance from the vector points of all data entries in the cluster to the cluster center. After clustering, the lower the silhouette coefficient, the better the classification quality of the cluster.
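The coefficient defined here can be written down directly. The sketch below assumes Euclidean distance for the "geometric distance" and uses hypothetical function names. Note that this quantity (mean distance to the cluster center) is the definition given in this document and differs from the silhouette score commonly used in the clustering literature.

```python
import math
from typing import Sequence

Point = Sequence[float]


def cluster_coefficient(points: Sequence[Point], center: Point) -> float:
    """Average Euclidean distance from a cluster's points to its center."""
    return sum(math.dist(p, center) for p in points) / len(points)


def mean_coefficient(clusters: Sequence[Sequence[Point]], centers: Sequence[Point]) -> float:
    """Mean of the per-cluster coefficients over all K clusters."""
    values = [cluster_coefficient(c, m) for c, m in zip(clusters, centers)]
    return sum(values) / len(values)
```

For example, a cluster holding the points (0, 0) and (0, 2) with center (0, 1) has a coefficient of 1.0.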

With reference to Fig. 3, step 4) specifically comprises:

Step 101) Starting from K=2, cluster the unclassified fault data set with the current K to obtain K cluster centers and their corresponding K clusters.

Step 102) After clustering, compute the mean silhouette coefficient of the K clusters for the current K and compare it with that of the K-1 clusters. When the silhouette coefficient gradually converges and no longer decreases as K increases, take the current K as the total number of clusters; otherwise set K=K+1 and re-execute step 101). Figs. 2a-2d show four experiments carried out to choose K. In each of them the change in the silhouette coefficient shrinks as K increases, and the coefficient converges when K reaches 11.
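The stopping rule of steps 101) and 102) can be sketched as a small loop. The function `choose_k`, the tolerance `tol` that decides when the coefficient "no longer decreases", and the upper bound `k_max` are assumptions for illustration; `score(k)` stands for any routine that clusters the fault data with K = k and returns the mean coefficient.

```python
from typing import Callable


def choose_k(score: Callable[[int], float], tol: float = 1e-3, k_max: int = 50) -> int:
    """Increase K from 2 until the coefficient stops decreasing by more than `tol`."""
    k = 2
    prev = score(k)
    while k < k_max:
        k += 1
        cur = score(k)
        # Converged: going from K-1 to K no longer reduces the coefficient.
        if prev - cur <= tol:
            return k
        prev = cur
    return k_max
```

Because the result of K-Means depends on the random seed points, repeating the selection several times, as in the four experiments of Figs. 2a-2d, gives a check on how stable the chosen K is.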

In step 101), while K is being determined, cluster centers must be selected for each current K. This begins with selecting the initial cluster centers (seed points): for the current K, K seed points must be chosen. The specific steps for choosing the cluster centers are as follows:

Step 101-1) First, randomly select from all operating data of the target equipment the vector point of one data entry as the first cluster center, and take the vector point with the nearest geometric distance to it as the second cluster center.

Step 101-2) For each vector point, compute the geometric distance Distance(x) to its nearest cluster center and store it in an array; then add these distances to obtain the total distance Sum(Distance(x)).

Step 101-3) Draw another random value and obtain the next cluster center in a weighted manner. Concretely, randomly choose a value Random that falls within Sum(Distance(x)), then traverse the vector points computing Random = Random - Distance(x) until Random <= 0; the point reached at that moment is the next cluster center. Repeat steps 101-2) and 101-3) until K cluster centers have been selected.
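The distance-weighted draw of steps 101-2) and 101-3) can be sketched as follows. This is an illustrative reading, not a definitive implementation: the function name `pick_seeds` is hypothetical, every center after the first is drawn with the weighted rule (the separate second-center rule of step 101-1) is not implemented), and points that are already centers are skipped because their Distance(x) is zero. The weighting resembles k-means++ seeding, which uses squared distances, whereas the text above uses the distance itself.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def pick_seeds(points: List[Point], k: int, rng: random.Random) -> List[Point]:
    """Choose k seed points; points far from existing centers are more likely."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Distance(x): distance from each point to its nearest chosen center.
        dist = [min(math.dist(p, c) for c in centers) for p in points]
        # Random: a value falling within Sum(Distance(x)).
        r = rng.uniform(0.0, sum(dist))
        # Fallback for floating-point round-off: the farthest point.
        chosen = points[dist.index(max(dist))]
        for p, d in zip(points, dist):
            if d == 0.0:
                continue  # already a center; contributes nothing to the sum
            r -= d  # Random = Random - Distance(x)
            if r <= 0.0:
                chosen = p
                break
        centers.append(chosen)
    return centers
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when repeating the K-selection experiments.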

After the cluster centers are chosen, the next step is training the clusters. For each fault sample, compute the geometric distance from its vector point to each cluster center and assign it to the nearest one; then compute the geometric center of each updated cluster and replace the cluster's former center with it. Check whether the cluster centers have changed; if they have (not yet converged), repeat the above process. When the cluster centers converge (no longer change), clustering is complete.
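The training loop described above is the standard K-Means iteration and can be sketched in a few lines. The function name `train`, the iteration cap `max_iter`, and the choice to leave an empty cluster's center unchanged are assumptions not stated in the text.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]


def train(points: List[Point], centers: List[Point], max_iter: int = 100):
    """Alternate assignment and center update until the centers stop changing."""
    centers = list(centers)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign every sample to its nearest cluster center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Replace each center with the geometric center of its members.
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # assumed handling of an empty cluster
        if updated == centers:  # converged: centers no longer change
            break
        centers = updated
    return centers, labels
```

The function returns the converged centers together with the cluster index of every sample, which corresponds to the classified fault data set of step 4).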

After this computation, with the optimized K chosen and clustering done, the useful data in hand comprise: the unclassified fault data, the number of clusters K, the vector parameters of each cluster center, and the detailed fault data entries belonging to each cluster.

Next comes the map-reduce computation, whose purpose is to discover, from the mass of fault data, the nonlinearly coupled fault associations between parameters.

Step 5) Apply a map-reduce algorithm to the unclassified fault data set of step 3) to generate a parameter association probability model. For each parameter of the target equipment, the model contains the probability distribution of other parameters also being faulty when that parameter is faulty.

With reference to Fig. 4, step 5) specifically comprises:

Step 201) First perform the mapping operation: starting from the unclassified fault data set, map the discrete fault data to each parameter. In parameter order, map together all fault data entries containing each parameter, forming the mapping class of that parameter. The result of the mapping operation is, for each parameter, all fault data entries containing it and their frequencies of occurrence.

After the mapping operation, the fault entries containing each parameter and their frequencies are known. For example, all fault entries in which parameter 1 is faulty are mapped into the first mapping class (the first class from the left in the second layer of Fig. 4), all fault entries in which parameter 2 is faulty are mapped into the second mapping class (the second class from the left in the second layer of Fig. 4), and so on, giving the mapping classes of all parameters.

The reduction operation is then performed on the mapping classes obtained above. Its purpose is to compute the probability that, at the moment a given parameter is faulty, another given parameter is faulty as well. This characterizes the fault association between parameters.

Step 202) For each class formed by the mapping, compute the total number of fault data entries in the class (the sum of the frequencies) as the denominator of the probability calculation.

Step 203) In each mapping class, count the occurrences of each parameter other than the one to which the class corresponds, summing the frequencies, as the numerator of the probability calculation.

Step 204) Take the ratio of the numerator of step 203) to the denominator of step 202) to obtain, for each parameter, the probability distribution of the other parameters also being faulty when that parameter is faulty. Take the first mapping class (all data combinations in which parameter 1 is faulty) as an example: retrieve the combinations in the class that contain parameter 2, sum their frequencies as the numerator, and divide by the total number of fault entries in the class; this gives the probability that parameter 2 is also faulty when parameter 1 is faulty. After parameter 2, parameters 3 to s are computed in the same way (traversing all parameters). This forms the fault association parameter table of parameter 1.

Likewise, the same reduction operation is performed on the 2nd to s-th mapping classes, forming the fault association parameter tables of all s parameters.
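Steps 201) to 204) amount to estimating conditional co-occurrence probabilities, which the following single-process sketch illustrates. It is an in-memory stand-in for a map-reduce job, not a distributed implementation; the function name `association_table` is hypothetical, each fault entry is represented simply as the set of parameters faulty at one time, and each entry is counted once (frequency 1).

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def association_table(fault_entries: Iterable[Set[str]]) -> Dict[str, List[Tuple[str, float]]]:
    """For each parameter, list P(other parameter faulty | this parameter faulty)."""
    # Map: gather into one class all entries that contain a given parameter.
    mapped: Dict[str, List[Set[str]]] = defaultdict(list)
    for entry in fault_entries:
        for param in entry:
            mapped[param].append(entry)

    # Reduce: the class size is the denominator, co-occurrence counts the numerators.
    table: Dict[str, List[Tuple[str, float]]] = {}
    for param, entries in mapped.items():
        total = len(entries)
        counts = Counter(q for e in entries for q in e if q != param)
        # Sort from highest to lowest probability (ties broken by name).
        table[param] = sorted(
            ((q, n / total) for q, n in counts.items()),
            key=lambda item: (-item[1], item[0]),
        )
    return table
```

For the entries {p1, p2}, {p1, p2, p3}, {p1} and {p3}, the class of p1 holds three entries, so the table of p1 gives p2 a probability of 2/3 and p3 a probability of 1/3.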

This completes the training on the data. We now have the clusters of the K fault types generated by K-Means and the parameter association probability model generated by map-reduce. The equipment operating data set can next be used for actual fault diagnosis and verification.

Step 6) Using the classified fault data set of step 4) as the fault discrimination standard, apply a nearest-neighbor algorithm to all operating data of step 1) to identify fault categories and obtain the fault classification result. In actual operation, for a new operating data entry, the nearest-neighbor algorithm computes its geometric distance to the centers of each of the K fault clusters and takes the minimum (the nearest neighbor). If this minimum is smaller than the silhouette coefficient of that cluster, the operating data entry is judged to be of the fault type corresponding to that cluster, and fault diagnosis is performed on this basis.

Step 7) Combine the fault classification result with the parameter association probability model of step 5) to obtain the comprehensive diagnosis result. It comprises the fault classification result, the main fault parameters, and the parameters whose association probability with the main fault parameters is relatively high (the probability threshold can be adjusted to the actual situation).
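Steps 6) and 7) can be sketched together. The function names, the result layout and the default threshold `min_prob` are hypothetical; `coefficients[j]` is the mean distance-to-center coefficient of cluster j, and `table` is an association table mapping each parameter to (other parameter, probability) pairs.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Sequence[float]


def classify(x: Point, centers: Sequence[Point], coefficients: Sequence[float]) -> Optional[int]:
    """Return the index of the nearest fault cluster, or None if x is too far from it."""
    nearest = min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
    if math.dist(x, centers[nearest]) < coefficients[nearest]:
        return nearest
    return None


def comprehensive_result(
    fault_class: int,
    main_params: Sequence[str],
    table: Dict[str, List[Tuple[str, float]]],
    min_prob: float = 0.5,
) -> dict:
    """Attach to a fault class the parameters strongly associated with its main parameters."""
    associated = {
        p: [(q, pr) for q, pr in table.get(p, []) if pr >= min_prob]
        for p in main_params
    }
    return {
        "fault_class": fault_class,
        "main_parameters": list(main_params),
        "associated": associated,
    }
```

An entry for which `classify` returns None lies outside every fault cluster; how such entries are treated (for example as normal data) is left open here, as the text does not say.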

In summary, the associated-parameter fault classification method based on big-data fusion cluster analysis provided by the invention achieves intelligent fault classification and associated-parameter analysis based on mass data mining, with a fault classification capability of controllable accuracy. For each classified fault, the parameter association probability model gives the association probabilities of the related fault parameters, which improves intelligent fault diagnosis and the formulation of maintenance decisions.

Finally, it should be noted that the above embodiments merely illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art will understand that modifications or equivalent substitutions of the technical solution that do not depart from its spirit and scope shall all be covered by the claims of the invention.

Claims

1. An associated-parameter fault classification method based on big data fusion cluster analysis, characterized by comprising:

Step 1) acquiring the various operating data of the object equipment;

Step 2) establishing, according to the data related to the object equipment, a parameter diagnostic rule base covering all parameters of the object equipment;

Step 3) taking the rules in the parameter diagnostic rule base as the criterion, screening all operating data from step 1) to obtain fault data, and assembling all fault data into an unclassified fault data set;

Step 4) performing unsupervised autonomous data clustering on the unclassified fault data set by means of a clustering algorithm and, after a satisfactory number of clusters and each cluster center have been obtained, classifying all fault data in the unclassified fault data set by the determined cluster centers to obtain a classified fault data set;

Step 5) applying a map-reduce algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model, the parameter association probability model containing, for each parameter of the object equipment, probability distribution data on the other parameters also failing when that parameter fails;

Step 6) using the classified fault data set from step 4) as the fault discrimination standard, performing fault category identification on all operating data from step 1) with the nearest neighbor algorithm to obtain a fault classification result;

Step 7) combining the fault classification result with the parameter association probability model from step 5) to obtain probability distribution data for all parameters when the fault classification result occurs.

2. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that the format of the operating data acquired in step 1) satisfies: each complete data entry contains the time at which the data entry occurred and all parameter values of the object equipment at that time; a single data value in each data entry represents the measured value of one parameter of the object equipment at a given time; and the data entries are arranged one by one in the chronological order of their occurrence.

3. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that the format of the fault data screened in step 3) satisfies: each data entry contains the time at which the data entry occurred and all fault parameters that failed at that time; and, for each failed parameter in a data entry, the rule triggered when the fault occurred is marked according to the parameter diagnostic rule base.

4. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that the parameter diagnostic rule base contains the upper and lower limits of each parameter, parameter jump anomaly determination rules, and parameter trend gradual-change anomaly determination rules.
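The three rule types named in claim 4, applied to data in the formats of claims 2 and 3, might be sketched as follows. The claims name only the rule categories, so the concrete tests here are assumptions: a jump is taken as the change between consecutive entries, a gradual trend as the change across a window of entries, and the names `ParamRule`, `screen_faults`, `max_jump`, `max_drift`, and `window` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ParamRule:
    lower: float      # lower limit of the parameter
    upper: float      # upper limit of the parameter
    max_jump: float   # largest allowed change between consecutive entries
    max_drift: float  # largest allowed change across the trend window

def screen_faults(entries, rules, window=3):
    """Screen time-ordered operating data for fault entries.

    entries -- list of (timestamp, {parameter: value}) in time order
    Returns a list of (timestamp, {parameter: [triggered rule names]})
    containing only the entries in which at least one rule fired.
    """
    faults = []
    for i, (ts, values) in enumerate(entries):
        triggered = {}
        for name, rule in rules.items():
            v = values[name]
            hits = []
            if not rule.lower <= v <= rule.upper:
                hits.append("limit")
            if i >= 1 and abs(v - entries[i - 1][1][name]) > rule.max_jump:
                hits.append("jump")
            if i >= window and abs(v - entries[i - window][1][name]) > rule.max_drift:
                hits.append("trend")
            if hits:
                triggered[name] = hits
        if triggered:
            faults.append((ts, triggered))
    return faults

# Hypothetical rule and readings for a single parameter.
rules = {"temp": ParamRule(lower=0, upper=100, max_jump=10, max_drift=15)}
entries = [(0, {"temp": 20}), (1, {"temp": 25}), (2, {"temp": 30}),
           (3, {"temp": 36}), (4, {"temp": 120})]
print(screen_faults(entries, rules))
# [(3, {'temp': ['trend']}), (4, {'temp': ['limit', 'jump', 'trend']})]
```

Each returned entry carries its time of occurrence, the failed parameters, and the rules that fired, which matches the fault data format described in claim 3.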

5. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that step 4) specifically comprises:

Step 101) setting the initial value of the number of clusters K to 2, and performing a clustering operation on the unclassified fault data set according to the current value of K to obtain K cluster centers and their corresponding K clusters;

Step 102) calculating the average silhouette coefficient of the K clusters and comparing it with the average silhouette coefficient of the K-1 clusters; if the two average silhouette coefficients are unchanged, selecting the current value of K as the total number of clusters; otherwise setting K=K+1 and re-executing step 101); the silhouette coefficient denotes the average geometric distance from the vector points corresponding to all data entries contained in a cluster to the center of that cluster;

Step 103) performing a clustering operation on the unclassified fault data set with the total number of clusters determined in step 102), and classifying all fault data in the unclassified fault data set by each of the cluster centers obtained, to obtain the classified fault data set.
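The search for K in steps 101) to 103) can be sketched as below, assuming plain Lloyd-style k-means as the clustering operation (the claim does not name one). Several details are assumptions: the first K points are used as seeds so that the run is reproducible, "unchanged" is relaxed to an absolute tolerance `tol` because exact equality is rare with floating-point data, the first pass at K=2 has no predecessor to compare against, and `k_max` caps the search. The coefficient follows the definition in step 102), not the conventional silhouette score.

```python
import math

def assign(points, centers):
    """Put every point into the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        clusters[j].append(p)
    return clusters

def kmeans(points, k, iters=50):
    """Plain k-means seeded with the first k points (an assumption)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = assign(points, centers)
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def mean_coefficient(centers, clusters):
    """Average over clusters of the mean member-to-center distance."""
    per_cluster = [
        sum(math.dist(p, c) for p in cl) / len(cl)
        for c, cl in zip(centers, clusters) if cl
    ]
    return sum(per_cluster) / len(per_cluster)

def choose_k(points, tol=1e-3, k_max=10):
    """Increase K from 2 until the coefficient stops changing (within tol)."""
    if len(points) < 2:
        raise ValueError("need at least two data entries")
    prev = None
    for k in range(2, min(k_max, len(points)) + 1):
        centers, clusters = kmeans(points, k)
        coef = mean_coefficient(centers, clusters)
        if prev is not None and abs(coef - prev) <= tol:
            break
        prev = coef
    return k, centers, clusters

# Hypothetical fault vectors forming three loose groups.
points = [(0, 0), (10, 0), (0, 10), (0.2, 0), (10.2, 0),
          (0, 10.2), (0, 0.2), (10, 0.2), (0.2, 10)]
k, centers, clusters = choose_k(points, tol=0.05)
print(k, [len(c) for c in clusters])
```

The returned `clusters` already hold every fault entry grouped by its nearest center, which corresponds to the classified fault data set of step 103).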

6. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 5, characterized in that the operation of obtaining the cluster centers in step 101) comprises:

Step 101-1) randomly selecting, from all operating data of the object equipment, the vector point corresponding to one data entry as the first cluster center, and finding the vector point with the nearest geometric distance to the first cluster center as the second cluster center;

Step 101-2) calculating the geometric distance Distance(x) between each cluster center and the cluster center nearest to it, and adding all geometric distances Distance(x) to obtain the total distance Sum(Distance(x));

Step 101-3) randomly selecting a vector point Random, corresponding to a data entry, that falls within the total distance Sum(Distance(x)), as the newly added cluster center, and re-executing step 101-2) until K cluster centers have been selected.
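The seeding procedure of claim 6 resembles distance-weighted (roulette-wheel) seeding of the kind used in k-means++, and the sketch below follows that reading. This is an interpretation, not the claimed wording: Distance(x) is computed for every data point x against its nearest already-chosen center, the random value is drawn inside Sum(Distance(x)), and the point at which the running sum reaches that value becomes the new center. The second center is taken as the nearest point to the first, as the claim is worded. The function name `seed_centers` and the `rng` argument are hypothetical.

```python
import math
import random

def seed_centers(points, k, rng=None):
    """Pick k initial cluster centers from points (a list of tuples)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    rng = rng or random.Random()
    # Step 101-1: a random first center, then the point nearest to it.
    first = rng.randrange(len(points))
    chosen = [first]
    if k >= 2:
        second = min((i for i in range(len(points)) if i != first),
                     key=lambda i: math.dist(points[i], points[first]))
        chosen.append(second)
    while len(chosen) < k:
        # Step 101-2: distance of each point to its nearest chosen center,
        # and the total of those distances.
        dist = [min(math.dist(p, points[c]) for c in chosen) for p in points]
        total = sum(dist)
        if total == 0:
            break  # every point coincides with an existing center
        # Step 101-3: draw a value inside the total and walk the running sum;
        # points far from all centers are proportionally more likely.
        r = rng.uniform(0, total)
        acc = 0.0
        for i, d in enumerate(dist):
            acc += d
            if d > 0 and acc >= r:
                chosen.append(i)
                break
    return [points[i] for i in chosen]

# Hypothetical vector points.
points = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(seed_centers(points, 3, random.Random(0)))
```

Passing a seeded `random.Random` makes the selection repeatable, which is convenient when comparing clustering runs on the same fault data set.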

7. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that step 5) specifically comprises:

Step 201) mapping together all fault data entries that contain each parameter, forming in turn the mapping class corresponding to each parameter and its frequency, each mapping class containing the parameter and all fault data entries in which it appears;

Step 203) accumulating, within each mapping class, the number of times each other parameter (other than the parameter to which the mapping class corresponds) appears, as the numerator of the probability calculation;

Step 204) taking the ratio of the numerator from step 203) to the denominator from step 202) to obtain, for each parameter, the probability distribution data on the other parameters also failing when that parameter fails.
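The map and reduce phases of claim 7 can be sketched in a single process as below. Step 202) is not reproduced in the claim as written, so the denominator used here, the number of fault data entries in each mapping class, is an assumption, as are the function name `association_probabilities` and the representation of a fault entry as a set of failed parameter names.

```python
from collections import Counter, defaultdict

def association_probabilities(fault_entries):
    """Estimate P(q also fails | p fails) from fault data entries.

    fault_entries -- iterable of sets of failed parameter names,
                     one set per fault data entry
    """
    # Map: send every entry to the mapping class of each parameter in it.
    classes = defaultdict(list)
    for entry in fault_entries:
        for p in entry:
            classes[p].append(entry)
    # Reduce: within each class, count co-occurring parameters.
    model = {}
    for p, entries in classes.items():
        denominator = len(entries)  # assumed: entries in the mapping class
        numerators = Counter(q for e in entries for q in e if q != p)
        model[p] = {q: n / denominator for q, n in numerators.items()}
    return model

# Hypothetical fault entries with parameters named A, B, and C.
entries = [{"A", "B"}, {"A"}, {"A", "B", "C"}, {"B"}]
model = association_probabilities(entries)
print(round(model["A"]["B"], 3))  # 0.667
print(model["C"]["A"])            # 1.0
```

On a real cluster, the grouping step would be done by the framework's shuffle phase and the counting by reducers keyed on the parameter name; the arithmetic stays the same.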

8. The associated-parameter fault classification method based on big data fusion cluster analysis according to claim 1, characterized in that step 6) specifically comprises: calculating the geometric distance between all operating data from step 1) and each determined cluster center; taking the minimum distance value and comparing it with the average silhouette coefficient of the corresponding cluster; and, if the distance value is less than the average silhouette coefficient of the corresponding cluster, judging that the operating data belongs to the fault type corresponding to that cluster.