CN109639463A - Method for determining adjacency relations of Internet of Things monitoring points - Google Patents

Method for determining adjacency relations of Internet of Things monitoring points

Info

Publication number
CN109639463A
Authority
CN
China
Prior art keywords
monitoring
monitoring point
monitoring data
data sequence
neighbouring relations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811407765.1A
Other languages
Chinese (zh)
Inventor
李永飞
田立勤
赵巧芳
陈振国
郭晓欣
王德志
王养廷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Institute of Science and Technology
Original Assignee
North China Institute of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Institute of Science and Technology
Priority to CN201811407765.1A
Publication of CN109639463A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/46 Cluster building

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method for determining adjacency relations of Internet of Things (IoT) monitoring points. The method first reads the historical monitoring data of each monitoring point within a set time window to obtain a set of monitoring data sequences. It then clusters the sequences in this set with several clustering algorithms, each of which produces multiple clustering results by varying the number of clusters. The silhouette coefficient of every clustering result is then calculated, the result with the largest silhouette coefficient is taken as the optimal result, and the adjacency relations of the IoT monitoring points are determined from that optimal result. Based on historical monitoring data, the invention uses clustering algorithms to determine the logical adjacency relation of monitoring points from the intrinsic similarity of their data. Experimental results show that the adjacency relations determined in this way are stable and agree better with objective reality than those of conventional methods, so they provide a more scientific and reasonable basis for validity checking and other processing of IoT monitoring data.

Description

Method for determining adjacency relations of Internet of Things monitoring points
Technical field
The invention relates to a clustering-based method for analysing and determining the adjacency relations of Internet of Things monitoring points, and belongs to the technical field of data mining and Internet of Things monitoring.
Background art
In current IoT monitoring systems, large amounts of invalid or abnormal data are common because of failures of sensing devices and transmission networks, and even deliberate human interference. In real-time air quality monitoring systems, for example, roughly 0.95% to 3.18% of the data are abnormal in one way or another. These abnormal data reduce the usability of the data as a whole, so data validity checking is required. Deciding whether a value is abnormal, and correcting it if so, usually requires the corresponding monitored values of adjacent monitoring points as a reference. For example, when abnormal data are found, the abnormal value is corrected with the mean of the monitoring data of the adjacent monitoring points (general handling) or with their maximum (punitive handling). Determining the adjacency relations of IoT monitoring points is therefore a basic problem that must be solved in the handling of abnormal IoT monitoring data.
Existing IoT monitoring data processing systems usually use the administrative region to which a monitoring point belongs, or its geographical location, as the basis for judging adjacency. This approach is intuitive and simple to implement. However, many administrative regions are highly irregular in shape, so some monitoring points lie too far from the other nodes of the same adjacent region, and their monitored values have little reference value for abnormal-data determination and outlier correction. In addition, the monitored objects are complex and variable. As a result, existing methods do not meet practical needs well, and a more scientific and reasonable determination method is needed.
Summary of the invention
The object of the invention is to address the shortcomings of the prior art by providing a method for determining adjacency relations of IoT monitoring points, giving a more scientific and reasonable basis for validity checking of IoT monitoring data.
The object of the invention is achieved by the following technical solution:
A method for determining adjacency relations of IoT monitoring points. The method first reads the historical monitoring data of each monitoring point within a set time window to obtain a set of monitoring data sequences. It then clusters the sequences in this set with several clustering algorithms, each of which produces multiple clustering results by varying the number of clusters. The silhouette coefficient of every clustering result is then calculated, the result with the largest silhouette coefficient is taken as the optimal result, and the adjacency relations of the IoT monitoring points are finally determined from the optimal result.
In the above method for determining adjacency relations of IoT monitoring points, the method comprises the following steps:
a. Extract monitoring data
First set a time window, then read the historical monitoring data of each monitoring point within that window. Assuming there are K monitoring points, let $D_i$ denote the monitoring data sequence read from the i-th monitoring point; this gives the set of monitoring data sequences $D = \{D_1, D_2, \ldots, D_K\}$;
b. Determine the number of clusters
Set the range of the number of clusters to $n_1$ through $n_2$, where $n_1$ and $n_2$ are natural numbers and $n_1 < n_2$;
c. Perform cluster analysis
1. Specify a set of clustering algorithms;
2. Set the number of clusters to $n_1$;
3. Cluster the monitoring data sequences in the set with each clustering algorithm of the specified set in turn;
4. Increase the number of clusters by 1 and repeat step 3 until the number of clusters reaches $n_2$;
5. Calculate the silhouette coefficient of each clustering result;
d. Determine the adjacency relation
Select the clustering result with the largest silhouette coefficient as the optimal result; monitoring points assigned to the same cluster in the optimal result are adjacent monitoring points of one another.
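Steps a to d can be sketched as a small selection loop. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_best_clustering`, its parameters, and the dictionary it returns are hypothetical, and the clustering functions and the scoring function (for example a mean silhouette coefficient) are supplied by the caller.

```python
def select_best_clustering(sequences, algorithms, n1, n2, score):
    """Try every (algorithm, cluster count) pair and keep the best-scoring result.

    algorithms: {name: function(sequences, k) -> list of cluster labels}
    score:      function(sequences, labels) -> float (higher is better)
    All names here are illustrative assumptions.
    """
    if not (isinstance(n1, int) and isinstance(n2, int) and 0 < n1 < n2):
        raise ValueError("n1 and n2 must be natural numbers with n1 < n2")
    best = None
    for k in range(n1, n2 + 1):                    # steps 2 and 4: k = n1 .. n2
        for name, cluster in algorithms.items():   # step 3: every algorithm in the set
            labels = cluster(sequences, k)
            s = score(sequences, labels)           # step 5: evaluate this result
            if best is None or s > best["score"]:
                best = {"score": s, "algorithm": name, "k": k, "labels": labels}
    return best                                    # step d: the optimal result
```

Monitoring points that share a label in `best["labels"]` would then be treated as adjacent to one another. On ties the first result encountered is kept, which is a choice of this sketch and not something the source specifies.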
In the above method, when the monitoring data sequences in the set are clustered, the distance between two monitoring data sequences is calculated as follows:
For monitoring data sequences $D_i$ and $D_j$ in the set $D = \{D_1, D_2, \ldots, D_K\}$, the distance between them is defined as:
$$d(D_i, D_j) = \left| \sum_{m=1}^{n} \left( D_{im} - D_{jm} \right) \right|$$
where n is the length of the monitoring data sequences, $D_{im}$ is the m-th element of sequence $D_i$, and $D_{jm}$ is the m-th element of sequence $D_j$.
In the above method, the silhouette coefficient of a clustering result is calculated as follows:
The silhouette coefficient of the i-th object in the data set is:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $a_i$ is the average distance from the i-th object to the other objects in its own cluster, and $b_i$ is the minimum of the average distances from the i-th object to each of the other clusters;
The silhouette coefficient of the clustering result is the mean of the silhouette coefficients of all objects in the data set.
In the above method, the range of the number of clusters is set so that the mean of $n_1$ and $n_2$ is the number closest to [formula omitted in source], where K is the number of monitoring points.
Based on historical monitoring data, the invention uses clustering algorithms to determine the logical adjacency relation of monitoring points from the intrinsic similarity of their data. Experimental results show that the adjacency relations determined by this method are stable and interpretable and agree better with objective reality than those of conventional methods, so they provide a more scientific and reasonable basis for validity checking and other processing of IoT monitoring data.
Brief description of the drawings
The invention is further described below with reference to the drawings.
Fig. 1 is a flow chart of the invention;
Fig. 2 is a map of the monitoring point distribution.
Detailed description of the embodiments
Background concepts: adjacency relations of IoT monitoring points
Definition 1 (adjacency relation of monitoring points): An equivalence relation R defined on the set A of IoT monitoring points, that is, a relation satisfying reflexivity, symmetry and transitivity, is called an adjacency relation of the monitoring points.
Definition 2 (adjacent region of a monitoring point): The R-equivalence class $[a]_R$ formed by an IoT monitoring point a on the set A is called the adjacent region of monitoring point a.
Definition 3 (neighbor nodes of a monitoring point): The other monitoring points in A that belong to the same adjacent region as monitoring point a are called the neighbor nodes of a.
Definition 4 (adjacent partition of the monitoring points): A partition of the set A of IoT monitoring points is called an adjacent partition of the monitoring points.
The following theorem holds for adjacency relations of IoT monitoring points.
Theorem 1: The quotient set A/R of the set A of IoT monitoring points with respect to an adjacency relation R is an adjacent partition of A.
Proof: The quotient set A/R is the set of equivalence classes of R, that is, $A/R = \{[x]_R \mid x \in A\}$, where the equivalence class is $[x]_R = \{y \in A \mid (x, y) \in R\}$.
A partition of A is a set $\{A_i\}$ of non-empty subsets of A that satisfies $A_i \cap A_j = \emptyset$ for $i \neq j$ and $\bigcup_i A_i = A$.
We now show that the quotient set A/R is a partition of A.
First, for every $x \in A$, $[x]_R$ is non-empty, because R is reflexive and therefore $x \in [x]_R$;
Second, for all $x, y \in A$, if $[x]_R \neq [y]_R$ then $[x]_R \cap [y]_R = \emptyset$;
Finally, for every $x \in A$ we have $[x]_R \subseteq A$, hence $\bigcup_{x \in A} [x]_R \subseteq A$; and since $x \in [x]_R$, also $A \subseteq \bigcup_{x \in A} [x]_R$. Therefore $\bigcup_{x \in A} [x]_R = A$.
It follows that the quotient set A/R is a partition of A.
By Definition 4, the quotient set A/R of the set A of IoT monitoring points with respect to the adjacency relation R is an adjacent partition of A.
Property 1: An adjacent partition of the monitoring points is a set of adjacent regions.
By Theorem 1, the quotient set A/R is an adjacent partition of A. Because A/R is the set of equivalence classes of the adjacency relation R, the adjacent partition is a set of R-equivalence classes.
By Definition 2, an adjacent region is an R-equivalence class. An adjacent partition of the monitoring points is therefore a set of adjacent regions.
Property 2: Every adjacency relation corresponds to one adjacent partition, and every adjacent partition corresponds to one adjacency relation.
By Definition 1, an adjacency relation is an equivalence relation on the set A of monitoring points.
By Definition 4, an adjacent partition is a partition of A.
From the one-to-one correspondence between equivalence relations and partitions, every adjacency relation corresponds to one adjacent partition, and every adjacent partition corresponds to one adjacency relation.
From the theorem and properties above, once an adjacency relation is given for a set of IoT monitoring points, an adjacent partition is determined, and with it the adjacent region of each monitoring point and its neighbor nodes.
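The correspondence between an adjacent partition and an adjacency relation can be illustrated with a short sketch. It assumes the partition is given as a mapping from monitoring point id to a region label; the function names `adjacent_regions`, `neighbors_of` and `is_adjacent` are hypothetical and chosen only for this illustration.

```python
def adjacent_regions(labels):
    """Group monitoring point ids by label; each group is one adjacent region."""
    regions = {}
    for point, label in labels.items():
        regions.setdefault(label, set()).add(point)
    return list(regions.values())


def neighbors_of(point, regions):
    """Neighbor nodes of `point`: the other members of its adjacent region."""
    for region in regions:
        if point in region:
            return region - {point}
    raise KeyError(point)


def is_adjacent(a, b, regions):
    """The adjacency relation R induced by the partition: same region."""
    return any(a in region and b in region for region in regions)
```

Because every point lies in exactly one region, `is_adjacent` is reflexive, symmetric and transitive, which is what Definition 1 requires of an adjacency relation.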
Definition 5 (administrative neighbor nodes of a monitoring point): The adjacency relation of monitoring points is defined as belonging to the same administrative region; the monitoring points that have this relation with monitoring point a are called the administrative neighbor nodes of a.
For example, the monitoring points within the same prefecture-level administrative region are placed in one adjacent region. The benefit of this way of determining adjacency is that it matches the administrative jurisdiction of the monitoring points and is convenient to manage. However, many administrative regions are highly irregular in shape, so some monitoring points lie too far from the other nodes of the same adjacent region, and their monitored values have little reference value for abnormal-data determination and outlier correction.
Definition 6 (geographical neighbor nodes of a monitoring point): Several geographical center points are selected within the monitored area, and the adjacency relation of monitoring points is defined as lying within a specified distance of the same geographical center point. The monitoring points that have this relation with monitoring point a are called the geographical neighbor nodes of a.
This way of determining adjacency avoids the problems caused by irregularly shaped administrative regions. Analysis of actual monitoring data shows, however, that monitoring points that are geographically close may have very different monitoring data, while monitoring points that are geographically far apart may have similar data. In air quality monitoring, for example, the influencing factors are numerous and their mechanisms are complex. Some monitoring points that are geographically close have very different surrounding air quality and are therefore also unsuitable as references for each other.
Definition 7 (physical neighbor nodes of a monitoring point): When some relation that already exists between IoT monitoring points in the real world is used as the adjacency relation, the neighbor nodes determined in this way are called the physical neighbor nodes of a monitoring point. The administrative and geographical neighbor nodes described above are both physical neighbor nodes.
The adjacency relation of physical neighbor nodes is determined from an existing rule and is easy to implement. But because some adjacent monitoring points have little reference value, processing monitoring data on the basis of such a relation often gives unreasonable results in practice. The reason is that the adjacency relation used may not match the intrinsic associations of the monitored object, so it cannot accurately reflect the essential characteristics of the monitoring data.
Clustering-based determination of adjacency relations
By Property 2, if a more reasonable adjacent partition can be provided, a better adjacency relation can be determined. To overcome the shortcomings of physical neighbor nodes in data validity analysis, adjacency is determined from historical monitoring data, according to the characteristics of the data themselves.
Definition 8 (logical neighbor nodes of a monitoring point): Using cluster analysis, the set of monitoring points is divided into a group of adjacent regions based on the characteristics of the historical monitoring data themselves, and the adjacency relation of the monitoring points is then determined from the resulting adjacent partition. The neighbor nodes under this adjacency relation are called the logical neighbor nodes of a monitoring point.
1. Cluster analysis of IoT monitoring data
The basic form of IoT monitoring is to deploy a group of monitoring points within a given area and to install a group of sensors at each monitoring point to collect monitoring data. The resulting monitoring data are usually a set of monitored values stored as time series; the general data format is shown in Table 1. It is assumed here that each monitoring point has N kinds of sensors and that data are collected at hourly intervals, so each monitoring point produces one set of monitoring data every hour.
Table 1. Monitoring data format of an IoT monitoring point
Cluster analysis groups samples into clusters according to their similarity; its goal is maximum sample similarity within a cluster and minimum sample similarity between clusters. Cluster analysis can be used to determine the adjacency relation of monitoring points with respect to a given parameter T. The sequence of monitored values of parameter T is extracted for every monitoring point; this sequence describes the data characteristics of the monitoring point with respect to T. Clustering all of these sequences assigns the monitoring points to different clusters, and the clustering result is used as the basis for judging the adjacency relations of the monitoring points.
2. Algorithm for determining the logical adjacency relation of monitoring points
The algorithm that uses cluster analysis to determine adjacency relations of IoT monitoring points is shown in Fig. 1.
The processing steps are as follows:
(1) Extract monitoring data. The basic format of the monitoring data is shown in Table 1. The extraction process is illustrated here with air quality monitoring data. The monitored objects are eight categories of air pollutants, and the data are hourly averages. The form of the atmospheric monitoring data of a monitoring point is shown in Table 2. Taking the determination of the adjacency relation with respect to PM2.5 as an example, each monitoring point produces 24 monitored values per day; if n days of historical data are used, the data characteristics of the monitoring point with respect to PM2.5 are described by 24 × n monitored values. These monitored values form one data sequence.
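The extraction of one 24 × n sequence per monitoring point can be sketched as follows. This is only an illustration under assumptions the source does not state: records are taken to be `(station, timestamp, value)` tuples with one value per hour, the function name `extract_sequences` is hypothetical, and stations with missing hours in the window are simply dropped.

```python
from datetime import datetime, timedelta


def extract_sequences(records, start, days):
    """Return {station: [hourly values]} for the window [start, start + days).

    records: iterable of (station, timestamp, value) tuples (assumed layout).
    Stations that do not have all 24 * days hourly values are dropped,
    which is an assumption of this sketch.
    """
    hours = 24 * days
    slots = {}
    for station, ts, value in records:
        offset = ts - start
        index = offset.days * 24 + offset.seconds // 3600  # hour slot in window
        if 0 <= index < hours:
            slots.setdefault(station, {})[index] = value
    return {
        station: [by_hour[h] for h in range(hours)]
        for station, by_hour in slots.items()
        if len(by_hour) == hours
    }
```

With `days = 30`, each returned list has 720 values, matching the 24 × n description above.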
Table 2. Monitoring data of an air quality monitoring point
The data of all monitoring points are processed in the same way, giving a group of data sequences that describe the individual monitoring points.
(2) Determine the number of clusters. Determining the number of clusters is a key issue in cluster analysis. The number of clusters is generally determined from business requirements or the motivation of the analysis, or an empirical value [formula omitted in source] is used, where K is the total number of objects to be analysed. Alternatively, cluster analysis can be run with several different numbers of clusters, after which an evaluation index is calculated from the results, or the trend of an index is analysed, and a suitable number of clusters is chosen on that basis.
(3) Perform cluster analysis. The choice of a suitable clustering algorithm is another key factor that affects the result. In practical applications the choice must take into account specifics such as the data type and the purpose of the clustering.
(4) Determine the adjacency relation.
The results of the cluster analysis are collated: monitoring points assigned to the same cluster are neighbor nodes of one another and form one adjacent region. The adjacency relations between the monitoring points are determined accordingly.
3. Definition of sample distance
Sample distance is used to measure the similarity of samples and serves as the basis of cluster analysis. Conventional distance definitions include the Euclidean distance and the Manhattan distance. To obtain better analysis results, researchers have studied the use of fractional norms, the DTW (Dynamic Time Warping) distance, the edit distance with real penalty, and others as sample similarity measures. In practice, the way distance is defined depends directly on the characteristics of the objects being clustered and on the goal of the analysis, and it is difficult to find one similarity measure that suits every cluster analysis.
The purpose of clustering the monitoring data in the present invention is to find how close the values of the monitoring data of different monitoring points are. To this end, the sample distance is defined as follows:
Definition 9 (distance between monitoring data sequences): For monitoring data sequences $D_i$ and $D_j$, the distance between them is defined as:
$$d(D_i, D_j) = \left| \sum_{m=1}^{n} \left( D_{im} - D_{jm} \right) \right|$$
where n is the length of the monitoring data sequences, $D_{im}$ is the m-th element of sequence $D_i$, and $D_{jm}$ is the m-th element of sequence $D_j$.
This distance sums the differences between the two data sequences over all corresponding elements and then takes the absolute value of the sum.
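Definition 9, read as "sum the element-wise differences, then take the absolute value", can be written in a few lines. The function name `sequence_distance` is hypothetical, and the check for equal lengths is an addition of this sketch.

```python
from typing import Sequence


def sequence_distance(d_i: Sequence[float], d_j: Sequence[float]) -> float:
    """Absolute value of the summed element-wise differences of two sequences."""
    if len(d_i) != len(d_j):
        raise ValueError("sequences must have the same length")
    return abs(sum(x - y for x, y in zip(d_i, d_j)))
```

For example, `sequence_distance([1, 2, 3], [2, 2, 5])` returns 3. Positive and negative differences cancel under this reading, so `sequence_distance([1, 3], [3, 1])` returns 0: the measure compares the overall level of two sequences over the window, not their hour-by-hour shape.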
4. Selection of algorithm and number of clusters based on the silhouette coefficient
When no ground truth is available, the silhouette coefficient uses the similarity measure between the objects of the data set to examine the within-cluster compactness and the between-cluster separation of a clustering result, and thereby evaluates that result.
Definition 10 (silhouette coefficient): The silhouette coefficient of the i-th object in the data set is:
$$S(i) = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $a_i$ is the average distance from the i-th object to the other objects in its own cluster, and $b_i$ is the minimum of the average distances from the i-th object to each of the other clusters.
The value of $S(i)$ lies between -1 and 1. The closer it is to 1, the more compact the cluster containing the i-th object and the farther that object is from the other clusters. A value close to 0 indicates that the clusters are not clearly separated, and a value close to -1 indicates that the object has been assigned to the wrong cluster. The mean silhouette coefficient of all objects in the data set can be used as an index of clustering quality.
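A minimal sketch of the mean silhouette coefficient follows, assuming the pairwise distances have already been computed into a square matrix. The function name `mean_silhouette` is hypothetical, and the handling of single-member clusters (their $S(i)$ is set to 0, a common convention) is an assumption, since the source does not cover that case.

```python
from statistics import mean


def mean_silhouette(dist, labels):
    """Mean S(i) over all objects; dist[i][j] is a distance, labels[i] a cluster id."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # single-member cluster: S(i) = 0 (assumed convention)
            scores.append(0.0)
            continue
        a = mean(dist[i][j] for j in own)  # a_i: mean distance within own cluster
        b = min(                           # b_i: nearest other cluster, on average
            mean(dist[i][j] for j in members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)
```

For four objects at positions 0, 1, 10 and 11 with absolute difference as the distance, the grouping `[0, 0, 1, 1]` scores close to 0.9, while the deliberately bad grouping `[0, 1, 0, 1]` scores -0.45.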
The algorithm for determining the adjacency relations of monitoring points must fix the number of clusters of the clustering result and choose an appropriate clustering algorithm. Because the adjacency of IoT monitoring points is not known in advance, the present invention uses the silhouette coefficient as the basis for both choices. Specifically, cluster analysis is run repeatedly with several clustering algorithms and different numbers of clusters, the silhouette coefficient of each result is calculated, and the result with the largest silhouette coefficient is taken as the final result.
Experimental results and analysis
Using the adjacency determination algorithm of the invention and the hierarchical clustering algorithm provided by the R language, the adjacency relation of the monitoring points with respect to PM2.5 was analysed.
1. Experimental data
The experimental data are 30 days of PM2.5 monitoring data from 28 monitoring points around Beijing; Fig. 2 shows the locations of these monitoring points. The monitoring points are distributed fairly evenly around Beijing. Their geographical settings include both plains and mountain areas, and they cover both industrially developed areas and agricultural areas. The monitoring points were numbered at random from 1 to 28 and are marked accordingly in Fig. 2. The PM2.5 monitoring data of the 28 monitoring points over 30 days of one month were extracted from the raw monitoring data as the experimental data set.
2. Experimental results
Using the hierarchical clustering algorithm, the experimental data set was clustered with the inter-cluster distance measures complete, average, single, ward, median and mcquitty, and with 3 to 6 clusters. Table 3 lists the silhouette coefficient of each clustering result.
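The experiment relies on the hierarchical clustering provided by R. The idea of cutting an agglomerative clustering at a chosen number of clusters can be sketched in plain Python as below. This is a simplified stand-in, not the R implementation: the function name `agglomerative` is hypothetical, only the single, complete and average linkages are covered, and the naive search is suitable only for small data sets such as the 28 points here.

```python
from itertools import combinations


def agglomerative(dist, k, linkage="average"):
    """Merge the two closest clusters until k remain; return one label per object.

    dist is a square matrix of pairwise distances (assumed precomputed).
    """
    combine = {
        "single": min,                            # closest pair of members
        "complete": max,                          # farthest pair of members
        "average": lambda ds: sum(ds) / len(ds),  # mean over all member pairs
    }[linkage]
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: combine(
                [dist[i][j] for i in clusters[p[0]] for j in clusters[p[1]]]
            ),
        )
        clusters[a].extend(clusters.pop(b))       # a < b, so index a stays valid
    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels
```

Running this for each linkage and for each cluster count from 3 to 6, and scoring every result with a silhouette coefficient, would mirror the structure of the experiment, although the numbers need not match those obtained with R.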
Table 3. Silhouette coefficients of the clustering results
The clustering is best with the average distance (average) and K = 5 clusters. Tables 4 to 7 give the best clustering result for each number of clusters K from 3 to 6.
Table 4. Clustering result for K = 3, method = complete
Table 5. Clustering result for K = 4, method = complete
Table 6. Clustering result for K = 5, method = average
Table 7. Clustering result for K = 6, method = ward
3. Results of the comparison methods
For comparison, Tables 8 and 9 give the results of determining the administrative adjacency relation and the geographical adjacency relation, respectively.
For the administrative adjacency relation, the 28 monitoring points are divided into three regions (northern, eastern and central) according to the administrative region to which they belong.
Table 8. Results of the administrative adjacency analysis
For the geographical neighbor nodes, 5 geographical center points are specified across the whole coverage area on the principle of uniform distribution, and all monitoring points are then divided into 5 different adjacent regions according to geographical distance.
Table 9. Results of the geographical adjacency analysis
4. Analysis of the experimental results
The experimental results show that when adjacency is determined by cluster analysis, the silhouette coefficients of the results of the various algorithms are all greater than 0.5, which indicates that both the within-cluster compactness and the between-cluster separation are reasonable.
With 3 clusters, the three clusters A, B and C contain 11, 9 and 8 monitoring points respectively, so the cluster sizes are fairly balanced. With 4 clusters, cluster A of Table 4 is split into two clusters, with monitoring points 13 and 14 forming a separate cluster, while the other two clusters remain unchanged. With 5 clusters, monitoring points 9 and 26 are split off into a separate cluster and the other clusters remain essentially unchanged. With 6 clusters, clusters B and C of Table 6 are divided into three clusters and the other clusters remain unchanged. The cluster names A to F are only labels that distinguish the clusters within one result and carry no judgment of quality. In the clustering results the cluster sizes are fairly balanced, the distinction between clusters becomes finer as the number of clusters increases, and the composition of each cluster stays logically consistent.
Comparing the cluster analysis results with the physical adjacency results, class A of the 3-class logical adjacency overlaps substantially with the northern region of the administrative adjacency, and class A of the 5-class logical adjacency also overlaps substantially with class A of the geographical adjacency. The reason is that the monitoring points in the northern region of the administrative adjacency and in class A of the geographical adjacency lie in the Bashang grassland and the Taihang Mountain area, where the level of industrialization is generally low, so the air quality at these monitoring points is comparatively good. A logical adjacency relation therefore also exists among these monitoring points, which explains the large overlap.
For the other adjacent regions, the experimental results differ considerably from the results of the physical adjacency analysis. The silhouette coefficients of the two physical adjacency results are both around 0.1, which indicates that their grouping is not reasonable; this agrees with the earlier analysis.
Cluster analysis divides unlabeled samples into clusters on the basis of similarity, according to the intrinsic characteristics of the data, and objectively reflects the regularities implicit in the data themselves. The present invention extracts the monitoring data sequences of a specific parameter and uses a hierarchical clustering algorithm to cluster a number of air quality monitoring points. The experimental results show that the adjacency relations determined from the cluster analysis are stable and can be explained reasonably in light of actual conditions. Compared with the conventional practice of determining adjacency from administrative region or geographical location, they agree better with objective reality and provide a more scientific and reasonable basis for validity checking and other processing of IoT monitoring data.

Claims (5)

1. A method for determining the neighbouring relations of Internet of Things monitoring points, characterized in that the method first reads the historical monitoring data of each monitoring point within a set time window to obtain a monitoring data sequence set; then clusters the monitoring data sequences in the set using multiple clustering algorithms, each clustering algorithm producing multiple clustering results by varying the number of clusters; then calculates the silhouette coefficient of each clustering result and takes the clustering result with the largest silhouette coefficient as the optimal result; and finally determines the neighbouring relations of the Internet of Things monitoring points according to the optimal result.
2. The method for determining the neighbouring relations of Internet of Things monitoring points according to claim 1, characterized in that the method comprises the following steps:
a. Extract the monitoring data
First set the time window, then read the historical monitoring data of each monitoring point within the set time window. Assuming there are K monitoring points, let $D_i$ denote the monitoring data sequence read from the i-th monitoring point, giving the monitoring data sequence set $D = \{D_1, D_2, \ldots, D_K\}$;
b. Determine the number of clusters
Set the range of the number of clusters in the clustering result to $n_1 \sim n_2$, where $n_1$ and $n_2$ are natural numbers and $n_1 < n_2$;
c. Perform cluster analysis
(1) Specify the set of clustering algorithms;
(2) Set the number of clusters to $n_1$;
(3) Cluster the monitoring data sequences in the monitoring data sequence set with each clustering algorithm in the specified set in turn;
(4) Increase the number of clusters by 1 and repeat step (3) until the number of clusters is $n_2$;
(5) Calculate the silhouette coefficient of each clustering result;
d. Determine the neighbouring relations
Choose the clustering result with the largest silhouette coefficient as the optimal result; monitoring points assigned to the same cluster in the optimal result are adjacent monitoring points of one another.
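Steps a to d of claim 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claims do not enumerate the clustering algorithm set, so three linkage variants of agglomerative (hierarchical) clustering stand in for it; the Euclidean distance is assumed; all function names (`euclidean`, `agglomerative`, `silhouette`, `determine_neighbours`) are hypothetical; and $n_1 \ge 2$ is assumed, because the silhouette coefficient needs at least two clusters.

```python
import math
from itertools import combinations
from statistics import mean


def euclidean(p, q):
    """Distance between two equal-length sequences (Euclidean form is an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def agglomerative(seqs, k, linkage):
    """Merge the two closest clusters until k remain; linkage is min, max or mean."""
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage([euclidean(seqs[i], seqs[j])
                                    for i in clusters[ab[0]]
                                    for j in clusters[ab[1]]]),
        )
        clusters[a] += clusters.pop(b)  # a < b, so index a stays valid after the pop
    labels = [0] * len(seqs)
    for lab, members in enumerate(clusters):
        for i in members:
            labels[i] = lab
    return labels


def silhouette(seqs, labels):
    """Mean silhouette coefficient over all sequences (requires >= 2 clusters)."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in groups[lab] if j != i]
        if not own:
            continue  # singleton cluster: contributes 0 (a common convention, assumed here)
        a = mean(euclidean(seqs[i], seqs[j]) for j in own)
        b = min(mean(euclidean(seqs[i], seqs[j]) for j in members)
                for other, members in groups.items() if other != lab)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(labels)


def determine_neighbours(seqs, algorithms, n1, n2):
    """Try every algorithm for every cluster count in n1..n2; keep the best silhouette."""
    best = None
    for k in range(n1, n2 + 1):                  # steps (2) and (4)
        for name, algo in algorithms.items():    # step (3)
            labels = algo(seqs, k)
            score = silhouette(seqs, labels)     # step (5)
            if best is None or score > best[0]:
                best = (score, name, k, labels)
    return best                                  # step d: same label = adjacent points


# Hypothetical algorithm set: three linkage criteria for hierarchical clustering.
ALGORITHMS = {
    "single": lambda s, k: agglomerative(s, k, min),
    "complete": lambda s, k: agglomerative(s, k, max),
    "average": lambda s, k: agglomerative(s, k, mean),
}
```

Calling `determine_neighbours(sequences, ALGORITHMS, 2, 4)` returns a tuple of the best score, the name of the algorithm that produced it, the number of clusters, and one cluster label per monitoring point. Monitoring points that share a label are treated as adjacent.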
3. The method for determining the neighbouring relations of Internet of Things monitoring points according to claim 1 or 2, characterized in that, when the monitoring data sequences in the monitoring data sequence set are clustered, the distance between monitoring data sequences is calculated as follows:
For monitoring data sequences $D_i$ and $D_j$ in the monitoring data sequence set $D = \{D_1, D_2, \ldots, D_K\}$, the distance between the two is defined as:
[formula not reproduced in the source text] where n is the length of the monitoring data sequences, $D_{im}$ is the m-th dimension of monitoring data sequence $D_i$, and $D_{jm}$ is the m-th dimension of monitoring data sequence $D_j$.
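As an illustration of a distance defined over the symbols n, $D_{im}$ and $D_{jm}$, the sketch below uses the Euclidean distance. That choice is an assumption made for the example, because the formula itself is not legible in this text; the function names `sequence_distance` and `distance_matrix` are likewise hypothetical.

```python
import math


def sequence_distance(d_i, d_j):
    """Distance between two monitoring data sequences of equal length n.

    Assumption: Euclidean distance, sqrt(sum over m of (D_im - D_jm)^2).
    """
    if len(d_i) != len(d_j):
        raise ValueError("both sequences must have the same length n")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(d_i, d_j)))


def distance_matrix(sequences):
    """Symmetric K x K matrix of pairwise distances for the sequence set D."""
    k = len(sequences)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = sequence_distance(sequences[i], sequences[j])
    return dist
```

Precomputing the matrix once is convenient because both the clustering step and the silhouette calculation only need pairwise distances.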
4. The method for determining the neighbouring relations of Internet of Things monitoring points according to claim 3, characterized in that the silhouette coefficient of a clustering result is calculated as follows:
The silhouette coefficient of the i-th object in the data set is:
$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)}$$
where $a_i$ is the average distance from the i-th object to the other objects in its own cluster, and $b_i$ is the minimum of the average distances from the i-th object to each of the other clusters;
The average of the silhouette coefficients of all objects in the data set is the silhouette coefficient of the clustering result.
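The calculation in claim 4 can be sketched as follows, taking a precomputed pairwise distance matrix and one cluster label per object. The function name `silhouette_coefficient` is hypothetical, and two details are assumptions that the claim does not state: an object that is alone in its cluster is given a coefficient of 0, and so is an object for which both $a_i$ and $b_i$ are 0.

```python
def silhouette_coefficient(dist, labels):
    """Average silhouette coefficient of a clustering result.

    dist   -- K x K matrix, dist[i][j] is the distance between objects i and j
    labels -- list of K cluster identifiers, one per object
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster (assumed convention)
            continue
        # a_i: average distance to the other objects in the same cluster
        a_i = sum(dist[i][j] for j in own) / len(own)
        # b_i: smallest average distance to the objects of any other cluster
        b_i = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a_i, b_i)
        scores.append((b_i - a_i) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

For four points on a line at 0, 1, 10 and 11, grouping them as {0, 1} and {10, 11} yields a coefficient close to 1, while grouping them as {0, 10} and {1, 11} yields a negative value, which matches the intuition that a larger coefficient indicates a more reasonable division into clusters.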
5. The method for determining the neighbouring relations of Internet of Things monitoring points according to claim 4, characterized in that, when the range of the number of clusters in the clustering result is set, the average of $n_1$ and $n_2$ is the number closest to [expression not reproduced in the source text], where K is the number of monitoring points.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811407765.1A CN109639463A (en) 2018-11-23 2018-11-23 A kind of determination method of Internet of Things monitoring point neighbouring relations

Publications (1)

Publication Number Publication Date
CN109639463A true CN109639463A (en) 2019-04-16

Family

ID=66069442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811407765.1A Pending CN109639463A (en) 2018-11-23 2018-11-23 A kind of determination method of Internet of Things monitoring point neighbouring relations

Country Status (1)

Country Link
CN (1) CN109639463A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183796A (en) * 2015-08-24 2015-12-23 同济大学 Distributed link prediction method based on clustering
CN107909111A (en) * 2017-11-24 2018-04-13 中国地质大学(武汉) A kind of multilevel scheme clustering method of settlement place polygon
US20180113929A1 (en) * 2016-10-26 2018-04-26 Salesforce.Com, Inc. Data Clustering and Visualization with Determined Group Number

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AINA MUSDHOLIFAH等: "Triangular Kernel Nearest Neighbor Based Clustering for Pattern Extraction in Spatio-Temporal Database", 《2010 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109975230A (en) * 2019-05-16 2019-07-05 北京印刷学院 Pollutant on-line detecting system and method
CN109975230B (en) * 2019-05-16 2021-09-17 北京印刷学院 On-line detection system and method for concentration of atmospheric pollutants

Similar Documents

Publication Publication Date Title
Niu et al. Measuring urban poverty using multi-source data and a random forest algorithm: A case study in Guangzhou
Hecht et al. Automatic identification of building types based on topographic databases–a comparison of different data sources
CN111651545A (en) Urban marginal area extraction method based on multi-source data fusion
CN114997534B (en) Similar rainfall forecasting method and equipment based on visual features
CN108268901B (en) Method for discovering environmental monitoring abnormal data based on dynamic time bending distance
CN112820415B (en) GIS-based chronic disease spatial-temporal evolution feature analysis and environmental health risk monitoring system and method
CN111984701B (en) Country aggregation evolution prediction method, device, equipment and storage medium
CN110516754A (en) A kind of hyperspectral image classification method based on multiple dimensioned super pixel segmentation
CN108898244B (en) Digital signage position recommendation method coupled with multi-source elements
CN109639463A (en) A kind of determination method of Internet of Things monitoring point neighbouring relations
CN113901348A (en) Oncomelania snail distribution influence factor identification and prediction method based on mathematical model
CN114707785A (en) Rural residential point multi-scale spatial feature analysis method based on deep learning
CN113240209A (en) Urban industry cluster development path prediction method based on graph neural network
CN108268646A (en) A kind of method that quality examination is carried out to encryption automatic weather station observed temperature numerical value
CN108647189B (en) Method and device for identifying user crowd attributes
CN116662840A (en) Low-voltage station user phase identification method based on machine learning
CN115457386A (en) Village land informatization generation method
CN112925784B (en) Multi-scale spatialization method for real population data
Yan et al. A new approach for identifying urban employment centers using mobile phone data: A case study of Shanghai
CN112488236B (en) Integrated unsupervised student behavior clustering method
Liu et al. China's oases have expanded by nearly 40% over the past 20 years
CN111538801A (en) Method and system for evaluating geometric accuracy of linear elements in multi-source vector space data
Dmowska et al. Quantification and visualization of US racial geography using the National Racial Geography Dataset 2020
CN110347760B (en) Data analysis method for lost crowd space-time positioning service
CN118350788B (en) Territorial space planning processing system based on GIS

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190416
