CN109271466A - A meteorological data analysis method based on hierarchical clustering and the K-means algorithm - Google Patents
A meteorological data analysis method based on hierarchical clustering and the K-means algorithm
- Publication number: CN109271466A
- Application number: CN201810999313.0A
- Authority: CN (China)
- Prior art keywords: data, result, cluster, hierarchical clustering, database
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Abstract
The present invention relates to a meteorological data analysis method based on hierarchical clustering and the K-means algorithm, and belongs to the technical field of meteorological data analysis methods. Regional meteorological observations are first collected to build a meteorological database. The data in the meteorological database are then processed and analysed with the K-means algorithm, and the analysis results are stored in result database 1. The same data are also processed and analysed with a hierarchical clustering algorithm, and those results are stored in result database 2. Next, the data in result database 1 and result database 2 are retrieved and analysed with a result verification method. Finally, suitable meteorological analysis and prediction data are selected on the basis of that analysis. By processing the meteorological data with both the hierarchical clustering algorithm and the K-means algorithm, and using the result verification method to select the more suitable result, the invention improves the accuracy of meteorological data analysis and prediction.
Description
Technical field
The present invention relates to a meteorological data analysis method based on hierarchical clustering and the K-means algorithm, and belongs to the technical field of meteorological data analysis methods.
Background art
With the continuing development of data mining and its applications, data mining is now widely used in many fields. Ground instruments and remote sensing collect meteorological data related to the atmospheric environment, and the predictions obtained by processing these data inform decision-making in many industries. However, some existing methods for analysing and predicting meteorological data suffer from low accuracy.
Summary of the invention
The technical problem to be solved by the present invention is to overcome this deficiency of the prior art by providing a meteorological data analysis method based on hierarchical clustering and the K-means algorithm.
The technical solution of the present invention is a meteorological data analysis method based on hierarchical clustering and the K-means algorithm, with the following specific steps:
Step 1: Collect regional meteorological observations and build a meteorological database;
Step 2: Retrieve the data in the meteorological database, process and analyse them with the K-means algorithm, and store the analysis results in result database 1;
Step 3: Retrieve the data in the meteorological database, process and analyse them with the hierarchical clustering algorithm, and store the analysis results in result database 2;
Step 4: Retrieve the data in result database 1 and result database 2, and analyse them with the result verification method;
Step 5: Obtain suitable meteorological analysis and prediction data on the basis of the analysis results.
In Step 2, the specific steps of the K-means algorithm are:
S1: Select K centroids;
S2: Assign each meteorological data point to its nearest centroid;
S3: Recalculate the centroids; each new centroid is the mean of all meteorological data points in its cluster;
S4: Assign the data points to their nearest centroids;
S5: Repeat steps S3 and S4 until no observation is reassigned or the maximum number of iterations is reached.
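The K-means steps S1 to S5 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: Euclidean distance, random selection of the initial centroids, the function name `kmeans`, its parameters, and the sample observations are all assumptions, since the text specifies none of them.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (tuples of floats) into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # S1: choose K centroids (assumption: K distinct observations picked at random).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # S2 / S4: assign every point to its nearest centroid (Euclidean distance assumed).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # S5: stop once no observation changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # S3: each new centroid is the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical (temperature in deg C, relative humidity in %) observations.
obs = [(1.0, 80.0), (2.0, 82.0), (1.5, 79.0), (25.0, 30.0), (26.0, 32.0), (24.5, 29.0)]
labels, centroids = kmeans(obs, k=2)
```

The loop also ends when `max_iter` is exhausted, which corresponds to the second stopping condition in S5.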
The specific steps of the hierarchical clustering algorithm are:
S1: Assign each meteorological data point to its own initial cluster, giving n clusters for n objects;
S2: Compute the distance between every pair of clusters using the complete-linkage measure to create a distance matrix;
S3: Find the two clusters with the smallest distance between them;
S4: Merge the two clusters found;
S5: Repeat steps S3 and S4, and stop the algorithm when only one cluster remains.
In step S2, the complete-linkage measure is calculated as:
$$D(X, Y) = \max_{x \in X,\; y \in Y} \mathrm{dist}(x, y)$$
where X and Y denote cluster X and cluster Y, D(X, Y) is the distance between the two clusters, and dist(x, y) is the distance between observations x and y.
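The agglomerative procedure can be sketched as below. It is an illustrative sketch under stated assumptions: the complete-linkage distance is taken to be the largest pairwise distance between members of two clusters, Euclidean distance is used, and the helper names `complete_linkage`, `agglomerate`, and `cut` are hypothetical. For simplicity the sketch recomputes the pairwise cluster distances on each pass instead of updating a stored distance matrix.

```python
import math


def complete_linkage(a, b, points):
    """Largest distance between a member of cluster a and a member of cluster b."""
    return max(math.dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge clusters bottom-up until one remains; returns the merge history."""
    # S1: every observation starts in its own cluster (n clusters for n objects).
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:  # S5: stop when a single cluster is left
        # S2 / S3: evaluate every cluster pair and pick the closest one.
        d, x, y = min(
            (complete_linkage(clusters[i], clusters[j], points), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # S4: merge the closest pair into one cluster.
        merges.append((clusters[x], clusters[y], d))
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for i, c in enumerate(clusters) if i not in (x, y)] + [merged]
    return merges


def cut(points, merges, k):
    """Replay the merges until k clusters remain; returns sorted index lists."""
    clusters = [[i] for i in range(len(points))]
    for left, right, _ in merges:
        if len(clusters) == k:
            break
        clusters = [c for c in clusters if c not in (left, right)] + [sorted(left + right)]
    return sorted(clusters)


# Hypothetical one-dimensional readings (e.g. daily mean temperature in deg C).
readings = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
history = agglomerate(readings)
two_clusters = cut(readings, history, k=2)
```

Because the algorithm runs until one cluster remains, a flat partition comparable with a K-means result has to be read off the merge history at a chosen number of clusters; `cut` shows one way to do that.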
In Step 4, the result verification method is as follows: the silhouette width and the Dunn index are used to validate the clustering produced by the two algorithms, and the algorithm with the larger silhouette width and Dunn index is taken to yield the more accurate meteorological prediction data.
The silhouette width is calculated as:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $a_i$ is the average distance between observation i and all other observations in the same cluster, and $b_i$ is the average distance between i and the observations in the nearest neighbouring cluster. $b_i$ is calculated as:
$$b_i = \min_{C_k \in \mathcal{C} \setminus C(i)} \; \sum_{j \in C_k} \frac{\mathrm{dist}(i, j)}{n(C_k)}$$
where $\mathcal{C}$ is the set of clusters, C(i) is the cluster containing observation i, dist(i, j) is the distance between observations i and j, and n(C) is the cardinality of cluster C.
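A short sketch of the silhouette width follows, assuming the standard definition S(i) = (b_i - a_i) / max(a_i, b_i) and Euclidean distance. The function name `silhouette_widths` and the sample data are hypothetical. The sketch also assumes at least two clusters and, by convention, assigns width 0 to an observation that is alone in its cluster.

```python
import math


def silhouette_widths(points, labels):
    """Return the silhouette width S(i) for every observation i."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        # Average distance from observation i to the given cluster members.
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # convention: a singleton cluster gets width 0
            widths.append(0.0)
            continue
        a_i = mean_dist(i, own)  # cohesion within the observation's own cluster
        # b_i: smallest average distance to any other cluster (nearest neighbour cluster).
        b_i = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        widths.append((b_i - a_i) / max(a_i, b_i))
    return widths


# Hypothetical one-dimensional observations with two candidate labelings.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_widths(pts, [0, 0, 1, 1])
bad = silhouette_widths(pts, [0, 1, 0, 1])
avg_good = sum(good) / len(good)
avg_bad = sum(bad) / len(bad)
```

Averaging S(i) over all observations gives one score per clustering, so the K-means and hierarchical results can be compared on the same scale; values closer to 1 indicate better-separated clusters.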
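The Dunn index is calculated as:
$$D(\mathcal{C}) = \frac{\min_{C_k, C_l \in \mathcal{C},\; C_k \neq C_l} \left( \min_{i \in C_k,\; j \in C_l} \mathrm{dist}(i, j) \right)}{\max_{C_m \in \mathcal{C}} \mathrm{diam}(C_m)}$$
where $\mathrm{diam}(C_m)$ is the maximum distance between observations in cluster $C_m$.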
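The Dunn index can be sketched in the same style. The sketch assumes the standard definition (smallest distance between observations in different clusters, divided by the largest cluster diameter) and Euclidean distance; the function name `dunn_index` and the sample data are hypothetical. It requires at least two clusters and at least one cluster with two distinct members, otherwise the diameter is zero and the ratio is undefined.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Ratio of the minimum between-cluster distance to the maximum cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    groups = list(clusters.values())
    # Numerator: smallest distance between observations in different clusters.
    separation = min(
        math.dist(p, q)
        for g, h in combinations(groups, 2)
        for p in g
        for q in h
    )
    # Denominator: largest diameter diam(C_m) over all clusters.
    diameter = max(
        (math.dist(p, q) for g in groups for p, q in combinations(g, 2)),
        default=0.0,
    )
    return separation / diameter


# Hypothetical one-dimensional observations with two candidate labelings.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
dunn_good = dunn_index(pts, [0, 0, 1, 1])
dunn_bad = dunn_index(pts, [0, 1, 0, 1])
```

A larger value means compact clusters that lie far apart, which is why the method prefers the algorithm with the larger index.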
The beneficial effect of the present invention is that the meteorological data are processed with both the hierarchical clustering algorithm and the K-means algorithm, and the result verification method is used to select the more suitable analysis and prediction result, which improves the accuracy of meteorological data analysis and prediction.
Description of the drawings
Fig. 1 is a flow chart of the steps of the present invention.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawing and a specific embodiment.
Embodiment 1: As shown in Fig. 1, the meteorological data analysis method based on hierarchical clustering and the K-means algorithm is carried out in Steps 1 to 5 exactly as set out in the Summary of the invention, including the K-means steps, the hierarchical clustering steps with the complete-linkage measure, and the result verification by silhouette width and Dunn index.
The embodiment of the present invention has been explained in detail above with reference to the accompanying drawing, but the invention is not limited to this embodiment. Within the knowledge of a person of ordinary skill in the art, various changes may be made without departing from the spirit of the invention.
Claims (7)
1. A meteorological data analysis method based on hierarchical clustering and the K-means algorithm, characterized by comprising:
Step 1: Collect regional meteorological observations and build a meteorological database;
Step 2: Retrieve the data in the meteorological database, process and analyse them with the K-means algorithm, and store the analysis results in result database 1;
Step 3: Retrieve the data in the meteorological database, process and analyse them with the hierarchical clustering algorithm, and store the analysis results in result database 2;
Step 4: Retrieve the data in result database 1 and result database 2, and analyse them with the result verification method;
Step 5: Obtain suitable meteorological analysis and prediction data on the basis of the analysis results.
2. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that, in Step 2, the specific steps of the K-means algorithm are:
S1: Select K centroids;
S2: Assign each meteorological data point to its nearest centroid;
S3: Recalculate the centroids; each new centroid is the mean of all meteorological data points in its cluster;
S4: Assign the data points to their nearest centroids;
S5: Repeat steps S3 and S4 until no observation is reassigned or the maximum number of iterations is reached.
3. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that the specific steps of the hierarchical clustering algorithm are:
S1: Assign each meteorological data point to its own initial cluster, giving n clusters for n objects;
S2: Compute the distance between every pair of clusters using the complete-linkage measure to create a distance matrix;
S3: Find the two clusters with the smallest distance between them;
S4: Merge the two clusters found;
S5: Repeat steps S3 and S4, and stop the algorithm when only one cluster remains.
4. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that, in step S2, the complete-linkage measure is calculated as:
$$D(X, Y) = \max_{x \in X,\; y \in Y} \mathrm{dist}(x, y)$$
where X and Y denote cluster X and cluster Y, D(X, Y) is the distance between the two clusters, and dist(x, y) is the distance between observations x and y.
5. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that, in Step 4, the result verification method is: the silhouette width and the Dunn index are used to validate the clustering produced by the two algorithms.
6. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that the silhouette width is calculated as:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $a_i$ is the average distance between observation i and all other observations in the same cluster, and $b_i$ is the average distance between i and the observations in the nearest neighbouring cluster; $b_i$ is calculated as:
$$b_i = \min_{C_k \in \mathcal{C} \setminus C(i)} \; \sum_{j \in C_k} \frac{\mathrm{dist}(i, j)}{n(C_k)}$$
where $\mathcal{C}$ is the set of clusters, C(i) is the cluster containing observation i, dist(i, j) is the distance between observations i and j, and n(C) is the cardinality of cluster C.
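7. The meteorological data analysis method based on hierarchical clustering and the K-means algorithm according to claim 1, characterized in that the Dunn index is calculated as:
$$D(\mathcal{C}) = \frac{\min_{C_k, C_l \in \mathcal{C},\; C_k \neq C_l} \left( \min_{i \in C_k,\; j \in C_l} \mathrm{dist}(i, j) \right)}{\max_{C_m \in \mathcal{C}} \mathrm{diam}(C_m)}$$
where $\mathrm{diam}(C_m)$ is the maximum distance between observations in cluster $C_m$.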
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810999313.0A CN109271466A (en) | 2018-08-30 | 2018-08-30 | A meteorological data analysis method based on hierarchical clustering and the K-means algorithm |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109271466A | 2019-01-25 |
Family ID: 65154504
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190125 |