CN109271466A - A weather data analysis method based on hierarchical clustering and the K-means algorithm - Google Patents

Info

Publication number
CN109271466A
Authority
CN
China
Prior art keywords
data
result
cluster
hierarchical clustering
database
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810999313.0A
Other languages
Chinese (zh)
Inventor
宋耀莲
马丽华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-08-30
Filing date
2018-08-30
Publication date
2019-01-25
Application filed by Kunming University of Science and Technology
Priority to CN201810999313.0A
Publication of CN109271466A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention relates to a weather data analysis method based on hierarchical clustering and the K-means algorithm, and belongs to the technical field of weather data analysis methods. Regional meteorological observation data are first collected to build a meteorological database. The data in the meteorological database are then processed and analyzed with the K-means algorithm, and the analysis results are stored in result database 1. At the same time, the data in the meteorological database are processed and analyzed with a hierarchical clustering algorithm, and the analysis results are stored in result database 2. Next, the data in result database 1 and result database 2 are retrieved and analyzed with a result verification method. Finally, suitable climate analysis and prediction data are obtained from the analysis results. By processing meteorological data with both the hierarchical clustering algorithm and the K-means algorithm, and selecting a suitable result through the result verification method, the invention improves the accuracy of weather data analysis and prediction results.

Description

A weather data analysis method based on hierarchical clustering and the K-means algorithm
Technical field
The present invention relates to a weather data analysis method based on hierarchical clustering and the K-means algorithm, and belongs to the technical field of weather data analysis methods.
Background art
With the continuous development of data mining and its applications, data mining is now widely used in many fields. Meteorological data related to the atmospheric environment are gathered with ground instruments and remote sensing, and the predictions obtained by processing these data benefit decision-making in many industries. However, some existing weather data analysis and prediction methods suffer from low accuracy.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art by providing a weather data analysis method based on hierarchical clustering and the K-means algorithm.
The technical solution of the present invention is a weather data analysis method based on hierarchical clustering and the K-means algorithm, with the following specific steps:
Step1: Collect regional meteorological observation data and build a meteorological database;
Step2: Retrieve the data from the meteorological database, process and analyze them with the K-means algorithm, and store the analysis results in result database 1;
Step3: Retrieve the data from the meteorological database, process and analyze them with the hierarchical clustering algorithm, and store the analysis results in result database 2;
Step4: Retrieve the data from result database 1 and result database 2, and analyze them with the result verification method;
Step5: Obtain suitable climate analysis and prediction data from the analysis results.
In Step2, the specific steps of the K-means algorithm are:
S1: Select K centroids;
S2: Assign each meteorological data point to its nearest centroid;
S3: Recompute the centroids; each new centroid is the mean of all meteorological data points assigned to it;
S4: Reassign each data point to its nearest centroid;
S5: Repeat steps S3 and S4 until no observation is reassigned or the maximum number of iterations is reached.
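The K-means steps S1 to S5 can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: the function name `kmeans`, the random choice of initial centroids, the Euclidean distance, and the `max_iter` and `seed` parameters are all assumptions, since the text does not say how the K centroids are selected or which distance is used.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of equal-length numeric tuples) into k groups."""
    rng = random.Random(seed)
    # S1: choose K initial centroids (assumption: K data points drawn at random).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # S2/S4: assign every point to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # S5: stop as soon as no observation changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # S3: each new centroid is the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids
```

Calling `kmeans(observations, 2)` returns one cluster label per observation plus the final centroids; the labels are what Step2 would store in result database 1.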
The specific steps of the hierarchical clustering algorithm are:
S1: Assign each meteorological data point to its own initial cluster, so that n objects form n clusters;
S2: Compute the distances between all pairs of clusters using the complete-linkage measure to create a distance matrix;
S3: Find the two clusters with the smallest distance between them;
S4: Merge the two clusters found;
S5: Repeat steps S3 and S4, and stop the algorithm when only one cluster remains.
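The agglomerative procedure in S1 to S5 can be sketched as below. This is an illustrative sketch under stated assumptions: the names `complete_linkage` and `agglomerate` are hypothetical, Euclidean distance is assumed for point-to-point distances, and the optional `n_clusters` argument (stopping before a single cluster remains) is an addition that the text does not describe. For clarity the sketch recomputes pairwise cluster distances on every pass instead of maintaining an explicit distance matrix.

```python
import math


def complete_linkage(cluster_x, cluster_y):
    """Distance between two clusters: the largest pairwise point distance."""
    return max(math.dist(x, y) for x in cluster_x for y in cluster_y)


def agglomerate(points, n_clusters=1):
    """Merge clusters bottom-up until `n_clusters` remain (default: one)."""
    # S1: every point starts in its own cluster.
    clusters = [[p] for p in points]
    merges = []  # history of (cluster_a, cluster_b, linkage_distance)
    while len(clusters) > n_clusters:
        # S2/S3: evaluate every pair of clusters and keep the closest pair.
        pairs = (
            (a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda ab: complete_linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        merges.append(
            (clusters[i], clusters[j], complete_linkage(clusters[i], clusters[j]))
        )
        # S4: merge the pair (j > i, so removing j leaves index i valid).
        merged = clusters[i] + clusters.pop(j)
        clusters[i] = merged
    # S5: the loop ends when the requested number of clusters remains.
    return clusters, merges
```

Running `agglomerate(observations)` follows the text and stops at a single cluster, with `merges` recording the order of merges. Passing `n_clusters=K` yields a flat partition that can be compared against a K-means result.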
In step S2, the complete-linkage measure is calculated as:
$$D(X, Y) = \max_{x \in X,\, y \in Y} d(x, y)$$
where X and Y denote cluster X and cluster Y, D(X, Y) is the distance between the two clusters, and d(x, y) is the distance between data points x and y.
In Step4, the result verification method is: the silhouette width and the Dunn index are used to verify the effect of the two algorithms' analyses; the weather prediction data obtained by the algorithm with the larger silhouette width and Dunn index are considered more accurate.
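The selection rule in Step4 can be illustrated with a small helper. This is only a sketch: the function name `pick_result` and the dictionary layout are hypothetical, and returning `None` when the two indices disagree is an assumption, because the text does not say what to do in that case.

```python
def pick_result(candidates):
    """Pick the clustering result favoured by both validation indices.

    `candidates` maps a result name to its (silhouette_width, dunn_index) pair.
    Returns the name whose two indices are both at least as large as every
    other candidate's, or None when no single candidate wins on both
    (assumption: the source does not define a tie-break).
    """
    for name, (sil, dunn) in candidates.items():
        if all(sil >= s and dunn >= d for s, d in candidates.values()):
            return name
    return None
```

For example, with made-up scores `{"kmeans": (0.62, 0.41), "hierarchical": (0.55, 0.30)}` the helper returns `"kmeans"`, so the data in result database 1 would be used for Step5.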
The silhouette width is calculated as:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where a_i is the average distance between observation i and all other observations in the same cluster, and b_i is the average distance between i and the observations in the nearest neighboring cluster. b_i is calculated as:
$$b_i = \min_{C_k \in \mathcal{C} \setminus C(i)} \sum_{j \in C_k} \frac{\mathrm{dist}(i, j)}{n(C_k)}$$
where \mathcal{C} is the set of all clusters, C(i) is the cluster containing observation i, dist(i, j) is the distance between observations i and j, and n(C) is the cardinality of cluster C.
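A short sketch of the silhouette width computation follows, taking the standard definition S(i) = (b_i - a_i) / max(a_i, b_i) with a_i and b_i as described above. The function name `silhouette_widths`, the Euclidean distance, and the value 0.0 for single-member clusters are assumptions; the sketch also expects at least two clusters.

```python
import math


def silhouette_widths(points, labels):
    """Return S(i) for every observation, given one cluster label per point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    widths = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # assumed convention for singleton clusters
            continue
        # a_i: mean distance to the other members of the same cluster
        # (the point's zero distance to itself does not affect the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b_i: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        widths.append((b - a) / max(a, b))
    return widths
```

Averaging the returned list gives a single score per clustering result; values close to 1 indicate compact, well-separated clusters, and negative values indicate observations that sit closer to another cluster than to their own.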
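The Dunn index is calculated as:
$$D(\mathcal{C}) = \frac{\min_{C_k, C_l \in \mathcal{C},\, C_k \neq C_l} \left( \min_{i \in C_k,\, j \in C_l} \mathrm{dist}(i, j) \right)}{\max_{C_m \in \mathcal{C}} \mathrm{diam}(C_m)}$$
where \mathcal{C} is the set of all clusters, dist(i, j) is the distance between observations i and j, and diam(C_m) is the maximum distance between observations in cluster C_m.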
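The Dunn index can be sketched in the same style, assuming the standard definition: the smallest distance between observations in different clusters divided by the largest cluster diameter. The name `dunn_index` and the use of Euclidean distance are assumptions, and the sketch expects at least two clusters and at least one cluster with two distinct points, otherwise the ratio is undefined.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Ratio of the minimum between-cluster distance to the maximum diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    # Numerator: smallest distance between observations in different clusters.
    min_separation = min(
        math.dist(p, q)
        for g1, g2 in combinations(groups, 2)
        for p in g1
        for q in g2
    )
    # Denominator: largest cluster diameter, diam(C_m).
    max_diameter = max(math.dist(p, q) for g in groups for p in g for q in g)
    return min_separation / max_diameter
```

A larger value means the clusters are far apart relative to their size, which is why the method prefers the result with the larger Dunn index.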
The beneficial effects of the present invention are: the invention processes meteorological data with both the hierarchical clustering algorithm and the K-means algorithm, and selects a suitable analysis and prediction result through the result verification method, thereby improving the accuracy of weather data analysis and prediction results.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings and specific embodiments.
Embodiment 1: As shown in Fig. 1, a weather data analysis method based on hierarchical clustering and the K-means algorithm has the following specific steps:
Step1: Collect regional meteorological observation data and build a meteorological database;
Step2: Retrieve the data from the meteorological database, process and analyze them with the K-means algorithm, and store the analysis results in result database 1;
Step3: Retrieve the data from the meteorological database, process and analyze them with the hierarchical clustering algorithm, and store the analysis results in result database 2;
Step4: Retrieve the data from result database 1 and result database 2, and analyze them with the result verification method;
Step5: Obtain suitable climate analysis and prediction data from the analysis results.
In Step2, the specific steps of the K-means algorithm are:
S1: Select K centroids;
S2: Assign each meteorological data point to its nearest centroid;
S3: Recompute the centroids; each new centroid is the mean of all meteorological data points assigned to it;
S4: Reassign each data point to its nearest centroid;
S5: Repeat steps S3 and S4 until no observation is reassigned or the maximum number of iterations is reached.
The specific steps of the hierarchical clustering algorithm are:
S1: Assign each meteorological data point to its own initial cluster, so that n objects form n clusters;
S2: Compute the distances between all pairs of clusters using the complete-linkage measure to create a distance matrix;
S3: Find the two clusters with the smallest distance between them;
S4: Merge the two clusters found;
S5: Repeat steps S3 and S4, and stop the algorithm when only one cluster remains.
In step S2, the complete-linkage measure is calculated as:
$$D(X, Y) = \max_{x \in X,\, y \in Y} d(x, y)$$
where X and Y denote cluster X and cluster Y, D(X, Y) is the distance between the two clusters, and d(x, y) is the distance between data points x and y.
In Step4, the result verification method is: the silhouette width and the Dunn index are used to verify the effect of the two algorithms' analyses; the weather prediction data obtained by the algorithm with the larger silhouette width and Dunn index are considered more accurate.
The silhouette width is calculated as:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where a_i is the average distance between observation i and all other observations in the same cluster, and b_i is the average distance between i and the observations in the nearest neighboring cluster. b_i is calculated as:
$$b_i = \min_{C_k \in \mathcal{C} \setminus C(i)} \sum_{j \in C_k} \frac{\mathrm{dist}(i, j)}{n(C_k)}$$
where \mathcal{C} is the set of all clusters, C(i) is the cluster containing observation i, dist(i, j) is the distance between observations i and j, and n(C) is the cardinality of cluster C.
The Dunn index is calculated as:
$$D(\mathcal{C}) = \frac{\min_{C_k, C_l \in \mathcal{C},\, C_k \neq C_l} \left( \min_{i \in C_k,\, j \in C_l} \mathrm{dist}(i, j) \right)}{\max_{C_m \in \mathcal{C}} \mathrm{diam}(C_m)}$$
where diam(C_m) is the maximum distance between observations in cluster C_m.
The embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; within the knowledge of a person of ordinary skill in the art, various changes can also be made without departing from the concept of the present invention.

Claims (7)

1. A weather data analysis method based on hierarchical clustering and the K-means algorithm, characterized by:
Step1: Collect regional meteorological observation data and build a meteorological database;
Step2: Retrieve the data from the meteorological database, process and analyze them with the K-means algorithm, and store the analysis results in result database 1;
Step3: Retrieve the data from the meteorological database, process and analyze them with the hierarchical clustering algorithm, and store the analysis results in result database 2;
Step4: Retrieve the data from result database 1 and result database 2, and analyze them with the result verification method;
Step5: Obtain suitable climate analysis and prediction data from the analysis results.
2. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: in Step2, the specific steps of the K-means algorithm are:
S1: Select K centroids;
S2: Assign each meteorological data point to its nearest centroid;
S3: Recompute the centroids; each new centroid is the mean of all meteorological data points assigned to it;
S4: Reassign each data point to its nearest centroid;
S5: Repeat steps S3 and S4 until no observation is reassigned or the maximum number of iterations is reached.
3. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: the specific steps of the hierarchical clustering algorithm are:
S1: Assign each meteorological data point to its own initial cluster, so that n objects form n clusters;
S2: Compute the distances between all pairs of clusters using the complete-linkage measure to create a distance matrix;
S3: Find the two clusters with the smallest distance between them;
S4: Merge the two clusters found;
S5: Repeat steps S3 and S4, and stop the algorithm when only one cluster remains.
4. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: in step S2, the complete-linkage measure is calculated as:
$$D(X, Y) = \max_{x \in X,\, y \in Y} d(x, y)$$
where X and Y denote cluster X and cluster Y, D(X, Y) is the distance between the two clusters, and d(x, y) is the distance between data points x and y.
5. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: in Step4, the result verification method is: the silhouette width and the Dunn index are used to verify the effect of the two algorithms' analyses.
6. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: the silhouette width is calculated as:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where a_i is the average distance between observation i and all other observations in the same cluster, and b_i is the average distance between i and the observations in the nearest neighboring cluster. b_i is calculated as:
$$b_i = \min_{C_k \in \mathcal{C} \setminus C(i)} \sum_{j \in C_k} \frac{\mathrm{dist}(i, j)}{n(C_k)}$$
where \mathcal{C} is the set of all clusters, C(i) is the cluster containing observation i, dist(i, j) is the distance between observations i and j, and n(C) is the cardinality of cluster C.
7. The weather data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that: the Dunn index is calculated as:
$$D(\mathcal{C}) = \frac{\min_{C_k, C_l \in \mathcal{C},\, C_k \neq C_l} \left( \min_{i \in C_k,\, j \in C_l} \mathrm{dist}(i, j) \right)}{\max_{C_m \in \mathcal{C}} \mathrm{diam}(C_m)}$$
where diam(C_m) is the maximum distance between observations in cluster C_m.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201810999313.0A | 2018-08-30 | 2018-08-30 | A weather data analysis method based on hierarchical clustering and the K-means algorithm

Publications (1)

Publication Number | Publication Date
CN109271466A | 2019-01-25

Family

ID=65154504

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN201810999313.0A | Pending | CN109271466A (en) | 2018-08-30 | 2018-08-30 | A weather data analysis method based on hierarchical clustering and the K-means algorithm

Country Status (1)

Country | Link
CN (1) | CN109271466A (en)

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
RJ01 | Rejection of invention patent application after publication (application publication date: 2019-01-25)