CN108446802A

CN108446802A - Red tide early-warning method based on graph model construction

Info

Publication number: CN108446802A
Application number: CN201810243914.9A
Authority: CN
Inventors: 黄冬梅; 赵丹枫; 张烨宜; 林俊辰; 李亿红
Original assignee: Shanghai Maritime University
Current assignee: Shanghai Maritime University
Priority date: 2018-03-23
Filing date: 2018-03-23
Publication date: 2018-08-24
Anticipated expiration: 2038-03-23
Also published as: CN108446802B

Abstract

The invention relates to a red tide early-warning method based on graph model construction, comprising the following steps: step S1, data preprocessing; step S11, weight calculation; step S12, environmental factor dimensionality reduction; step S2, executing the DWFCM clustering algorithm; step S21, improved cluster centre selection; step S22, Euclidean distance weighting; step S23, objective function improvement; step S3, building the red tide graph data model. Its advantages are as follows: red tide data can be classified by development stage and stored in the graph model stage by stage; forecasters can quickly determine the stage a red tide is in and propose targeted control measures for that stage, reducing economic and ecological losses; forecasters can query the data they need quickly and accurately, which gives them fast, accurate and comprehensive data support for comparing current data with historical data and predicting the development stage of a red tide.

Description

A red tide early-warning method based on graph model construction

Technical field

The invention relates to the technical field of red tide data graph model construction, and specifically to a red tide early-warning method based on graph model construction.

Background art

Red tide is one of the important factors affecting the marine environment. Red tide disasters occur frequently worldwide and do great harm to ecological and economic systems. In 2003, 119 red tides were found in China's sea areas, covering a cumulative area of about 14,550 square kilometres and causing direct economic losses of 42.81 million yuan. In recent years, with the development of coastal industry and agriculture and the growth of the population, industrial wastewater and urban sewage have been continuously discharged into the sea, raising nutrient levels in parts of the seawater, promoting the rapid growth of red tide organisms, and making red tides more frequent and the losses increasingly severe.

Related techniques have been studied in the prior art, mainly red tide forecasting and early-warning methods and graph model construction methods.

1. Red tide forecasting and early-warning methods

Existing techniques for studying red tide monitoring data fall into two categories: traditional red tide data classification methods and machine learning classification methods. Traditional methods are divided into empirical, statistical and numerical methods. Their models are simple and can forecast red tides to some extent, but their accuracy and precision are not high enough for large-scale red tide outbreaks.

In recent years, with the rise of artificial intelligence, applying machine learning techniques to red tide research has become mainstream.

Wang Xingqiang et al. combined the COSA algorithm, originally used for text clustering, with the FCM algorithm, introduced similarity-relation preprocessing, and applied the improved algorithm to red tide monitoring, achieving good accuracy and real-time performance.

Zhang Chenghui et al., addressing the shortcomings of the traditional FCM algorithm, proposed an improved FCM algorithm that uses the similarity between samples and cluster centres to determine each sample's influence coefficient on the cluster centres, making clustering faster and the results more stable and accurate.

Nitin Muttil et al. modelled coastal red tides using a genetic algorithm and achieved good results.

Su Xinhong et al. used a BP neural network model to establish the nonlinear relationship between red tide and five meteorological factors: temperature, precipitation, wind speed, air pressure and sunshine. The red tide case data and the corresponding meteorological indices were fed into the model by geographical location for learning, training and prediction, providing a new approach to red tide forecasting.

The above methods contribute to red tide forecasting and early warning, but they still have shortcomings. First, red tides are influenced by many environmental factors, and the stages are hard to distinguish and inherently fuzzy. Second, after classification the red tide monitoring data are not stored effectively and cannot be provided directly to forecasters.

2. Graph model construction methods

Graph data models are now widely used in many fields, such as social networks, geographic information systems and bioinformatics.

Ou Xiaoping et al. proposed the graph-based music data model GraMM and the query language GraMQL. The music database is modelled as a single large graph in which each song is a vertex, and relationships such as similarity or cover versions are edges between song vertices. A music data query searches the large graph for subgraphs that satisfy the given conditions.

Tang Dequan et al. proposed a study of crime patterns based on a graph data mining algorithm and its application. Vertices represent offenders and vertex information represents the case category. If two offenders match on main information such as motive, location, target and means, an edge connects them, indicating that the two may be linked or belong to the same gang. Frequent subgraphs are then mined from the graph; they provide an effective decision basis for discovering patterns of criminal activity and trends in public security.

Wu Ye et al. proposed MSGCM, a multi-source geospatial data association model for efficient retrieval. It extracts the spatial information, semantic description, content description and association relationships of multi-source geospatial data to build a feature element graph, fuses multi-source geospatial objects into a unified space based on association patterns, computes the association strength between objects to build a graph-like association model, and provides three query methods: keyword-based, location-based and object-based.

The above graph model construction methods each target data from their own field and do not apply to all data. Red tide monitoring data have the following characteristics. First, a red tide results from the interaction of many impact factors with complex relationships among them, which suits storage in a graph model. Second, the impact factors are numerous and the data are high-dimensional. Finally, to give forecasters fast and accurate decision support, the monitoring data need to be stored by stage. Because of these characteristics, other storage methods cannot be applied directly to red tide monitoring data, and a graph model tailored to the characteristics of these data must be developed.

To reduce the impact of red tides, agencies around the world have established red tide monitoring systems in their own coastal waters for forecasting and early warning. Most monitoring systems cover wide areas and many parameters, producing large volumes of historical monitoring data that can support and inform forecasters. The key to successful red tide research is therefore to build an effective storage model for these historical data, improve the accuracy and speed of forecasters' queries, accurately forecast the stage a red tide is in (initial, development, maintenance or extinction), and take control measures suited to the biological characteristics of each stage.

Red tide monitoring data are currently stored in relational databases, organized by time, monitoring region or monitoring method. The table structure is stable and data consistency is maintained. Storing red tide monitoring data in a relational database, however, has the following problems. First, a red tide results from the joint action of many impact factors, but a relational database can only record the data; it cannot capture the relationships among impact factors or among the development stages of a red tide. Second, years of continuous monitoring produce a huge volume of data, and searching for relevant data takes a great deal of time.

Chinese patent document CN201010242351.5, filed on 2 August 2010 and entitled "A simple red tide early-warning method", comprises the following steps. The short-term change in illumination in the sea area is investigated first. If it is suitable for the growth and reproduction of red tide organisms in the sea area, the cell counts of the red tide organisms present are monitored. If one red tide species has reached a certain population dominance, the physiological response of that suspect species to the on-site water temperature is assessed, and the risk level of a red tide occurring in the sea area is then determined. If no red tide species is present, or none has reached population dominance, the water temperature need not be monitored and the risk level can be determined directly. If the short-term change in illumination is unsuitable for the growth of red tide organisms, neither the cell counts nor the on-site water temperature need be monitored, and the risk level can be determined directly.

The red tide early-warning method of the above patent document judges the near-term risk level of a red tide in a sea area from three factors: the cell counts of red tide organisms present in the sea area, the physiological response of suspect red tide organisms to the on-site water temperature, and the likely short-term change in illumination. The method is suitable for on-site monitoring in local sea areas. It requires few, clearly defined factors, needs no large instruments to obtain the data, and involves no complicated operations or judgements. In terms of early-warning performance, the method is sensitive and its results are reliable; if it were promoted in each sea area, applied in earnest by marine economic entities, and accompanied by a corresponding red tide response mechanism in production, it would greatly strengthen their resilience to risk and their market competitiveness. However, the document does not disclose a technical solution that classifies red tide data by development stage with an algorithm and stores the classified data in a graph model stage by stage, so that data with complex associations are stored more naturally and are easy to look up; that lets forecasters quickly determine the stage a red tide is in and propose targeted control measures for that stage, reducing economic and ecological losses; and that lets forecasters query the data they need quickly and accurately, giving them fast, accurate and comprehensive data support for comparing current data with historical data and predicting the development stage of a red tide.

In summary, there is a need for a red tide early-warning method based on graph model construction that classifies red tide data by development stage and stores them in a graph model stage by stage; that lets forecasters quickly determine the stage a red tide is in and propose targeted control measures for that stage, reducing economic and ecological losses; and that lets forecasters query the data they need quickly and accurately, giving them fast, accurate and comprehensive data support for comparing current data with historical data and predicting the development stage of a red tide. No such early-warning method has yet been reported.

Summary of the invention

The purpose of the invention is to address the deficiencies of the prior art by providing a red tide early-warning method based on graph model construction that classifies red tide data by development stage and stores them in a graph model stage by stage; that lets forecasters quickly determine the stage a red tide is in and propose targeted control measures for that stage, reducing economic and ecological losses; and that lets forecasters query the data they need quickly and accurately, giving them fast, accurate and comprehensive data support for comparing current data with historical data and predicting the development stage of a red tide.

To achieve the above purpose, the invention adopts the following technical solution:

A red tide early-warning method based on graph model construction, comprising the following steps:

Step S1, data preprocessing;

Step S11, weight calculation;

Step S12, environmental factor dimensionality reduction;

Step S2, executing the DWFCM clustering algorithm;

Step S21, improved cluster centre selection;

Step S22, Euclidean distance weighting;

Step S23, objective function improvement;

Step S3, building the red tide graph data model.

As a preferred technical solution, in step S1 the sample data are normalized so that they are mapped onto the interval [0, 1]. The normalization function is:

$$x_i' = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$$

where min(x_i) and max(x_i) are the minimum and maximum of the sample data, respectively.

As a preferred technical solution, the weight calculation in step S11 is as follows:

Assume an input sample matrix X with m samples and n environmental factors:

$$X = (x_{ij})_{m \times n}$$

The contribution of each sample to the j-th (1 ≤ j ≤ n) environmental factor is:

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{m} x_{ij}}$$

The total contribution of all m samples to environmental factor j is denoted E_j:

$$E_j = -K \sum_{i=1}^{m} p_{ij} \ln p_{ij}$$

where the constant K = 1/ln(m), which ensures that 0 ≤ E_j ≤ 1, i.e. E_j is at most 1. When the contributions of the samples under an attribute become uniform, E_j tends to 1; when the sample data under an attribute are all equal, the attribute plays no role in the decision and its weight is zero.

As a preferred technical solution, in step S12 the feature weight of each environmental factor in a red tide occurrence is calculated and used to reduce the dimensionality of the high-dimensional environmental factors: the samples corresponding to the two top-ranked environmental factors are selected, and the subsequent classification of red tide monitoring data uses the reduced samples.

As a preferred technical solution, the improved cluster centre selection in step S21 is as follows:

Step S211: let the data set X = {x_i, i = 1, ..., n} be the sample set, and set the minimum distance threshold α between classes;

Step S212: input: data set X; output: c cluster centres;

Step S213: calculate the Euclidean distance between every two samples to generate the distance matrix D, assign the two closest data samples to one class, and take the midpoint of the two samples as the cluster centre of the first class;

Step S214: with the minimum distance threshold α set, use the distance matrix D to find the samples whose distances to both samples of the first class are greater than α, assign the two of these samples with the smallest weighted distance to one class, and take the midpoint of the two samples as the cluster centre of the second class;

Step S215: similarly, among the remaining samples find those whose distances to all samples already selected are greater than α, assign the two of these samples with the shortest distance to one class, and take the midpoint of the two samples as its cluster centre;

Step S216: repeat step S215 until c cluster centres have been found; if no remaining sample satisfies the distance threshold α, α can be reduced appropriately.

As a preferred technical solution, in step S22 each environmental factor is assigned a weight, and the weights are combined with the Euclidean distance to obtain a weighted Euclidean distance that gives each sample a more decisive membership degree. Specifically:

Step S221: define the correlation coefficient r_ij of samples x_i and x_j, i = 1, 2, ..., n, j = 1, 2, ..., m, with r_ij = r_ji;

Step S222: define the weighted Euclidean distance d_ij: d²_ij = ||(x_i − x_j) + r_ij (y_i − y_j)||²;

Step S223: define the weighted distance matrix D = (d_ij), the matrix of weighted Euclidean distances between any two samples in X, where d_ij is the weighted Euclidean distance between samples x_i and x_j.

As a preferred technical solution, the objective function in step S23, which considers the distance between the sample data and the cluster centres, is:

where d_ij is the weighted Euclidean distance between v_i and v_j;

The objective function must satisfy the following constraints:

μ_ik∈ [0,1],

Using the method of Lagrange multipliers, the partial derivatives of J_m(U, V, η) with respect to the membership degrees and the cluster centres are taken and set to zero, which gives the update formulas for the membership degrees and the cluster centres:

As a preferred technical solution, the algorithm steps after the objective function is improved are as follows:

Step S231: input: data set X, number of cluster centres c, termination condition ε, fuzzy weighting exponent m, minimum weighted distance threshold α, iteration count k = 1; output: c clusters;

Step S232: calculate the weighted distance matrix D and find c cluster centres according to the minimum weighted distance threshold α and the number of cluster centres c;

Step S233: reset the iteration count k = 1 and take the result of step S232 as the initial cluster centres v_i, i = 1, 2, ..., c;

Step S234: calculate the membership degrees μ_ij(k) from v_i(k) using the membership update formula;

Step S235: from μ_ij(k), use the cluster centre update formula to continue iterating the objective function and obtain the cluster centres v_i(k+1);

Step S236: if ‖v_i(k+1) − v_i(k)‖ ≥ ε, set k = k + 1 and return to step S234; otherwise, end the loop and obtain the clustering result.
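The loop in the steps above has the same shape as the standard fuzzy c-means iteration. The improved update formulas of the invention are not reproduced in this text, so the following minimal Python sketch substitutes the textbook FCM updates and the plain Euclidean distance. The function name `fcm`, its default parameters and the sample points are illustrative assumptions, not part of the patent.

```python
import math

def fcm(X, centers, m=2.0, eps=1e-4, max_iter=100):
    """Textbook fuzzy c-means loop (assumption: standard updates, not DWFCM's).

    X       : list of samples, each a tuple of floats
    centers : initial cluster centres (for example from the seed-pair selection)
    m       : fuzzy weighting exponent, must be > 1
    eps     : stop when no centre moves by eps or more
    Returns (U, V): memberships U[i][k] of sample k in cluster i, and centres V.
    """
    V = [tuple(v) for v in centers]
    p = 2.0 / (m - 1.0)
    U = []
    for _ in range(max_iter):
        # Distances from every centre to every sample; the tiny offset
        # avoids division by zero when a sample coincides with a centre.
        d = [[math.dist(x, v) + 1e-12 for x in X] for v in V]
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        U = [[1.0 / sum((d[i][k] / d[j][k]) ** p for j in range(len(V)))
              for k in range(len(X))] for i in range(len(V))]
        # Centre update: mean of the samples weighted by u_ik^m.
        new_V = []
        for i in range(len(V)):
            w = [u ** m for u in U[i]]
            total = sum(w)
            new_V.append(tuple(sum(wk * x[t] for wk, x in zip(w, X)) / total
                               for t in range(len(X[0]))))
        shift = max(math.dist(a, b) for a, b in zip(V, new_V))
        V = new_V
        if shift < eps:  # termination test, as in step S236
            break
    return U, V

# Hypothetical two-factor samples forming two obvious groups.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
U, V = fcm(samples, [(0.0, 0.5), (10.0, 10.5)])
```

Swapping `math.dist` for a weighted distance and replacing the two update expressions would be the places to plug in the invention's own formulas.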

As a preferred technical solution, step S3 comprises the following steps:

Step S31: classify the red tide data into 5 classes: the no-red-tide stage, the initial stage, the development stage, the maintenance stage and the extinction stage;

Step S32: store the classified red tide data in the graph model, building points and edges with a big graph and small graphs. Each point on the big graph consists of one small graph and represents the red tide monitoring data set of one stage, with the stage information stored in its label; there are 5 stages in total. Each edge on the big graph represents the degree of association between the red tide monitoring data of two stages. Each point on a small graph represents one red tide data record, and the edges are built from the time and location associations between records;

Step S33: define the red tide data graph model as an undirected graph G = (V_G, E, M_v), where V_G = {v_i, i = 1, 2, ..., 5} is the set of points (the red tide has |V| stages), E = {e_ij, i, j = 1, 2, ..., |V|; i ≠ j; e_ij = e_ji} is the set of edges, and M_v = {s_c} is the set of labels on V indicating the red tide stage, with c = 5;

Step S34: define the degree of association between red tide stages: the association between stages is constructed by formula and stored in the edges E of the big graph;

Step S35: define the red tide data graph of each stage: V_G = (v, e, m_v), where v = {v'_i, i = 1, 2, ..., n} are the red tide data records of the current stage, e = {e'_ij} is the set of edges between v'_i and v'_j, and m_v is the label on v giving the time, location and attribute information of the record;

Step S36: define the degree of association between red tide data records: it is built from time and location information and stored in the edges e of the small graph.
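The two-level structure described above (a big graph whose five vertices are stage-labelled small graphs of records) can be sketched in Python as follows. The association formulas of the invention are not reproduced in this text, so the rule used here to link two records (same site and at most `max_days` apart) is a hypothetical stand-in, as are the class names `Record` and `StageGraph`, the site name and the attribute values.

```python
from dataclasses import dataclass, field
from datetime import date

STAGES = ["not occurred", "initial", "development", "maintenance", "extinction"]

@dataclass
class Record:
    """One red tide data record: a vertex of a small graph."""
    day: date
    site: str
    attrs: dict

@dataclass
class StageGraph:
    """Small graph holding all records of one stage (one big-graph vertex)."""
    label: str
    records: list = field(default_factory=list)
    edges: set = field(default_factory=set)  # pairs of record indices

    def add(self, rec, max_days=3):
        # Hypothetical association rule: link records from the same site
        # that are at most max_days apart.
        idx = len(self.records)
        for j, other in enumerate(self.records):
            if other.site == rec.site and abs((other.day - rec.day).days) <= max_days:
                self.edges.add((j, idx))
        self.records.append(rec)
        return idx

    def query(self, day, site):
        """Return all records of this stage for a given day and site."""
        return [r for r in self.records if r.day == day and r.site == site]

# Big graph: one labelled vertex per stage. The edges between stages are
# left out because their association formula is not available here.
big_graph = {s: StageGraph(s) for s in STAGES}
big_graph["initial"].add(Record(date(2018, 5, 1), "site-A", {"chl_a": 12.5}))
big_graph["initial"].add(Record(date(2018, 5, 2), "site-A", {"chl_a": 14.0}))
hits = big_graph["initial"].query(date(2018, 5, 2), "site-A")
```

Keeping the stage label on the big-graph vertex means a query for one day and one sea area only has to search the small graph of the stage of interest.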

The advantages of the invention are as follows:

1. The red tide early-warning method based on graph model construction classifies red tide data by development stage and stores them in the graph model stage by stage. Forecasters can quickly determine the stage a red tide is in and propose targeted control measures for that stage, reducing economic and ecological losses. Forecasters can also query the data they need quickly and accurately, which gives them fast, accurate and comprehensive data support for comparing current data with historical data and predicting the development stage of a red tide.

2. Given a known set of red tide data stored in the graph model used by the invention, the stage of the current red tide can be determined quickly and appropriate measures taken. When forecasters want all historical red tide data for a given day and sea area, accurate data can be retrieved quickly.

3. The invention classifies the data with the DWFCM algorithm designed for red tide monitoring data and stores the classified data in the graph model, which makes red tide data easy to look up and enables fast, accurate forecasting.

4. The invention first preprocesses the red tide monitoring data by calculating impact factor weights and reducing dimensionality. It then improves the cluster centre selection, weights the Euclidean distance and improves the objective function to obtain the DWFCM algorithm, which is suited to classifying red tide data. Finally, the classified red tide monitoring data are stored in the graph model, with edges establishing the relationships between the data of each stage. Experiments show that the classification algorithm used by the invention achieves higher running speed and accuracy when classifying red tide monitoring data, and that the graph model improves query efficiency, providing efficient decision support for red tide forecasting and early warning.

5. By calculating the feature weight of each environmental factor in a red tide occurrence, the high-dimensional environmental factors are reduced in dimensionality: the samples corresponding to the two top-ranked environmental factors are selected and used for the subsequent classification of red tide monitoring data, which shortens the computation time.

6. Improving the cluster centre selection stabilizes the number of objective function iterations and increases the running speed of the algorithm.

7. The invention assigns a weight to each environmental factor and combines the weights with the Euclidean distance to obtain a weighted Euclidean distance that gives each sample a more decisive membership degree.

8. The invention improves the objective function. The improved objective function considers not only the distance between samples and cluster centres but also the interrelation between cluster centres: it computes the minimum weighted Euclidean distance between samples and cluster centres and the maximum weighted Euclidean distance between cluster centres. The closer the within-class distances and the farther the between-class distances, the more accurate the clustering result.

9. A graph data model represents and stores information through vertices, edges and their attributes. Its flexibility allows data with complex associations to be stored more naturally, provides efficient decision support to red tide forecasters, and offers a new storage model for red tide monitoring data.

10. The invention classifies the data by development stage with the algorithm, takes the classified data of each stage as a vertex on the big graph, and builds the edges between the data of each stage. Finally, each data record within a big-graph vertex is built as a vertex on a small graph, and the similarity relationships on the small-graph edges are built from time and geographical location, so that the red tide monitoring data are stored using the graph model.

Brief description of the drawings

Fig. 1 is a schematic diagram of the overall framework of the red tide early-warning method based on graph model construction according to the invention.

Fig. 2 shows the red tide data graph model.

Detailed description

The specific embodiments of the invention are described in detail below with reference to the accompanying drawings.

Referring to Fig. 1, which is a schematic diagram of the overall framework of the red tide early-warning method based on graph model construction according to the invention: first, the red tide data are preprocessed. Because a red tide is formed by the joint action of many environmental factors, including all of them in the calculation would be computationally expensive. The weight of each environmental factor is therefore calculated first, the factors are reduced according to their weights, and the two factors with the largest weights are selected for the subsequent calculation. Second, the cluster centres, Euclidean distance and objective function of FCM are improved to obtain the DWFCM algorithm, which classifies the red tide data into 5 classes by development stage. Finally, the classified red tide data are stored in the graph model by building points and edges.

Step S1, data preprocessing;

Step S11, weight calculation;

Step S12, environmental factor dimensionality reduction;

Step S2, executing the DWFCM clustering algorithm;

Step S21, improved cluster centre selection;

Step S22, Euclidean distance weighting;

Step S23, objective function improvement;

Step S3, building the red tide graph data model.

1. Data preprocessing

First, because the sample data may have different indicators and units, the sample data are normalized so that they are mapped onto the interval [0, 1]. The normalization function is:

$$x_i' = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$$
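As a minimal illustration of the normalization step, the Python sketch below applies min-max scaling to one column of readings. The function name `min_max_normalize`, the treatment of a constant column and the sample values are assumptions made for this example only.

```python
def min_max_normalize(values):
    """Map a list of numbers onto [0, 1] with min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

water_temp = [18.0, 21.0, 24.0, 30.0]  # hypothetical readings, in degrees Celsius
print(min_max_normalize(water_temp))   # [0.0, 0.25, 0.5, 1.0]
```

Each environmental factor would be scaled separately, so that factors measured in different units become comparable before the weights are calculated.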

1.1 Weight calculation

Assume an input sample matrix X with m samples and n environmental factors.

The contribution of each sample to the j-th environmental factor, and the total contribution E_j of all samples to environmental factor j, are:

$$p_{ij} = \frac{x_{ij}}{\sum_{i=1}^{m} x_{ij}}, \qquad E_j = -K \sum_{i=1}^{m} p_{ij} \ln p_{ij}$$

where the constant K = 1/ln(m), which ensures that 0 ≤ E_j ≤ 1, i.e. E_j is at most 1.

The formula shows that when the contributions of the samples under an attribute become uniform, E_j tends to 1. In particular, when the sample data under an attribute are all equal, the attribute plays no role in the decision, i.e. its weight is zero.

d_j denotes the degree of consistency of the sample contributions under the j-th environmental factor, d_j = 1 − E_j.

The weight of each attribute is:

$$w_j = \frac{d_j}{\sum_{j=1}^{n} d_j}$$
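The weight calculation follows the pattern of the entropy weight method. The Python sketch below is one way to compute it under the assumptions that all values are non-negative (for example after normalization), that m > 1, and that a term with zero contribution adds nothing to E_j. The function name `entropy_weights`, the intermediate symbol `p` and the fallback to equal weights are choices made for this example.

```python
import math

def entropy_weights(X):
    """Entropy-style weights for n factors from m samples (m > 1).

    X is a list of m rows, each with n non-negative factor values.
    """
    m, n = len(X), len(X[0])
    K = 1.0 / math.log(m)              # constant that keeps E_j within [0, 1]
    d = []
    for j in range(n):
        column = [row[j] for row in X]
        total = sum(column)
        E_j = 0.0
        for x in column:
            p = x / total if total > 0 else 0.0
            if p > 0:                   # assumption: 0 * ln(0) is treated as 0
                E_j -= K * p * math.log(p)
        d.append(1.0 - E_j)             # d_j = 1 - E_j
    s = sum(d)
    if s == 0:
        # Assumption: if no factor discriminates, fall back to equal weights.
        return [1.0 / n] * n
    return [d_j / s for d_j in d]

# Hypothetical data: factor 0 is constant, factor 1 varies between samples.
weights = entropy_weights([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
```

In this example the constant factor receives a weight of essentially zero and the varying factor receives essentially all of the weight, which matches the statement that an attribute with identical sample data plays no role in the decision.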

1.2 Environmental factor dimensionality reduction

A red tide is formed by the joint action of many environmental factors, and each factor influences its occurrence to a different degree. Using all environmental factors for classification would lead to long computation times. The invention calculates the feature weight of each environmental factor in a red tide occurrence and reduces the dimensionality of the high-dimensional environmental factors accordingly: the samples corresponding to the two top-ranked environmental factors are selected, and the subsequent classification of red tide monitoring data uses the reduced samples.
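The selection of the two top-ranked factors can be sketched in a few lines of Python. The function name `top_k_factors`, the example weights and the sample values are hypothetical.

```python
def top_k_factors(X, weights, k=2):
    """Keep only the k columns of X whose factor weights are largest."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    keep = sorted(order[:k])            # preserve the original column order
    reduced = [[row[j] for j in keep] for row in X]
    return keep, reduced

# Hypothetical weights for three environmental factors and two samples.
keep, reduced = top_k_factors([[1, 2, 3], [4, 5, 6]], [0.1, 0.5, 0.4])
print(keep)     # [1, 2]
print(reduced)  # [[2, 3], [5, 6]]
```

The returned column indices make it possible to map the reduced samples back to the names of the retained environmental factors.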

2. DWFCM

2.1 Improved cluster centre selection

The FCM algorithm is an iterative procedure that minimizes the objective function J_m. Its clustering quality depends on the initial cluster centres: choosing them at random makes the number of objective function iterations unstable and makes the algorithm prone to converging to a local minimum, and the large number of repeated Euclidean distance calculations slows the algorithm down. To address these problems, the invention first improves the selection of the initial cluster centres.

Let X = {x_i, i = 1, ..., n} be the sample set and set the minimum distance threshold α between classes. The algorithm for selecting the initial cluster centres is as follows.

Algorithm 1:

Input: data set X

Output: c cluster centres

Step 1: Calculate the Euclidean distance between every two samples to generate the distance matrix D. Assign the two closest data samples to one class and take the midpoint of the two samples as the cluster centre of the first class;

Step 2: With the minimum distance threshold α set, use the distance matrix D to find the samples whose distances to both samples of the first class are greater than α, assign the two of these samples with the smallest weighted distance to one class, and take the midpoint of the two samples as the cluster centre of the second class;

Step 3: Similarly, among the remaining samples find those whose distances to all samples already selected are greater than α, assign the two of these samples with the shortest distance to one class, and take the midpoint of the two samples as its cluster centre;

Step 4: Repeat Step 3 until c cluster centres have been found. If no remaining sample satisfies the distance threshold α, α can be reduced appropriately.
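Algorithm 1 can be sketched in Python as follows. This sketch uses the plain Euclidean distance throughout and recomputes distances on demand instead of storing the matrix D. The function name `init_centers`, the shrink factor of 0.9 used to reduce α and the sample points are assumptions made for this example.

```python
import math
from itertools import combinations

def init_centers(X, c, alpha, shrink=0.9):
    """Pick c initial cluster centres as midpoints of well-separated close pairs."""
    if len(X) < 2 * c:
        raise ValueError("need at least 2*c samples")

    def dist(a, b):
        return math.dist(X[a], X[b])

    chosen = []    # indices of samples already used in a seed pair
    centers = []
    while len(centers) < c:
        # Candidates lie farther than alpha from every sample chosen so far
        # (for the first class nothing is chosen yet, so all samples qualify).
        cand = [i for i in range(len(X)) if i not in chosen
                and all(dist(i, j) > alpha for j in chosen)]
        if len(cand) < 2:
            if alpha < 1e-12:
                raise ValueError("cannot find enough separated pairs")
            alpha *= shrink            # assumption: reduce alpha by 10% and retry
            continue
        a, b = min(combinations(cand, 2), key=lambda pair: dist(*pair))
        chosen += [a, b]
        centers.append(tuple((u + v) / 2 for u, v in zip(X[a], X[b])))
    return centers

# Hypothetical two-factor samples forming three tight pairs.
pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (20, 2)]
print(init_centers(pts, c=3, alpha=5))
# [(0.0, 0.5), (10.0, 10.5), (20.0, 1.0)]
```

Because the seeds are chosen deterministically from the data, repeated runs start from the same centres, which is the property the text relies on to stabilize the number of iterations.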

2.2 Euclidean distances weight

In the FCM algorithm, the class to which a sample x_k most likely belongs is judged by the size of its membership degree, and the membership degree is computed by measuring the distance between the sample and each cluster. Judging membership by distance is therefore an important step in the clustering process. Red tide monitoring data contain some edge samples, that is, samples lying between two stages whose membership degrees are evenly distributed, so the stage of such a sample cannot be determined directly from its membership degree. The FCM algorithm based on the traditional Euclidean distance assumes that every attribute plays the same role during clustering. In an actual red tide event, however, each environmental factor has a different influence weight on the occurrence of the red tide: some environmental factors play an important role in the clustering process, while others are secondary or negligible. To address this, the present invention assigns a weight to each environmental factor and combines the weights with the Euclidean distance to obtain a weighted Euclidean distance, giving each sample a more decisive membership degree.

First, the weight with which each environmental factor influences the occurrence of red tide during one red tide event is calculated and substituted into the Euclidean distance.

Definition 1: Correlation coefficient r_ij. The correlation coefficient of samples x_i and x_j, i = 1, 2, ..., n, j = 1, 2, ..., m, with r_ij = r_ji.

Definition 2: Weighted Euclidean distance d_ij. $d_{ij}^2 = \|(x_i - x_j) + r_{ij}(y_i - y_j)\|^2$

Definition 3: Weighted distance matrix D. D is the matrix of weighted Euclidean distances between any two samples in X, where d_ij is the weighted Euclidean distance between samples x_i and x_j.
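One common way to bring per-factor weights into the Euclidean distance is d_ij² = Σ_k w_k (x_ik - x_jk)². The sketch below builds a distance matrix D in that form. It is an illustrative substitute, not the formula of Definition 2, and the weight vector `w` and the function name `weighted_distance_matrix` are assumptions.

```python
import numpy as np

def weighted_distance_matrix(X, w):
    """Return the n x n matrix of feature-weighted Euclidean distances.

    X: (n, p) samples; w: (p,) non-negative weight per environmental factor.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    # diff[i, j, k] = X[i, k] - X[j, k] via broadcasting.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((w * diff ** 2).sum(axis=2))
```

With all weights equal to 1 the result is the ordinary Euclidean distance matrix; setting a weight to 0 removes that factor from the distance entirely.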

3. Objective function improvement

The original FCM algorithm considers only the distance between the sample data and the cluster centres and ignores the distance between the cluster centres themselves. The smaller the intra-class distance and the larger the inter-class distance, the more accurate the clustering result.

The objective function that takes into account the distances between sample data and cluster centres is:

where d_ij is the weighted Euclidean distance between v_i and v_j. The objective function must satisfy the following constraints:

The improved objective function considers not only the distance between samples and cluster centres but also the relationship among the cluster centres. It seeks the minimum weighted Euclidean distance between samples and cluster centres and the maximum weighted Euclidean distance among cluster centres, so that samples within a class are as close as possible and the classes are as far apart as possible, which yields a more accurate classification.

The algorithm steps after improving the objective function are as follows.

Algorithm 2:

Input: data set X, number of cluster centres c, termination threshold ε, fuzzy weighting exponent m, minimum weighted distance threshold α, iteration counter k = 1.

Output: c clusters.

Step 1: According to the minimum weighted distance threshold α and the number of cluster centres c, compute the weighted distance matrix D and find c cluster centres;

Step 2: Reset the iteration counter k = 1 and use the result of Step 1 as the initial cluster centres v_i, i = 1, 2, ..., c;

Step 3: From v_i(k), compute the membership degrees μ_ij(k) using Formula 10;

Step 4: From μ_ij(k), update the objective function and the cluster centres v_i(k+1) using Formula 9;

Step 5: If ‖v_i(k+1) - v_i(k)‖ ≥ ε, set k = k + 1 and return to Step 3; otherwise, end the loop and output the clustering result.
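The loop in Steps 3 to 5 can be sketched as follows. The sketch substitutes the standard FCM update rules and the plain Euclidean distance for Formulas 9 and 10, so it illustrates the iteration and stopping test rather than the improved objective function. The function name `fcm`, the default values and the `max_iter` guard are assumptions.

```python
import numpy as np

def fcm(X, V0, m=2.0, eps=1e-5, max_iter=100):
    """Standard fuzzy c-means starting from initial centres V0.

    Returns (U, V, k): memberships of shape (c, n), centres of shape (c, p),
    and the number of iterations performed.
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V0, dtype=float).copy()
    for k in range(1, max_iter + 1):
        # Distance from every centre to every sample, shape (c, n).
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)            # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)),
        # normalised so each sample's memberships sum to 1.
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0)
        # Centre update: mean of the samples weighted by u_ij^m.
        Um = U ** m
        V_new = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Stop when no centre moved by eps or more.
        shift = np.linalg.norm(V_new - V, axis=1).max()
        V = V_new
        if shift < eps:
            break
    return U, V, k
```

The returned memberships are those computed from the centres of the last completed iteration; each sample can be assigned to a stage with `U.argmax(axis=0)`.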

4. Red tide data graph model construction

Through the operations of Sections 1, 2 and 3, the red tide data are classified by stage of occurrence. In the present invention the red tide data are divided into five classes: the no-red-tide stage, the initial stage, the development stage, the maintenance stage and the extinction stage.

Red tide monitoring data that are related to one another are well suited to storage in a graph model. In the present invention, each vertex of the large graph is itself a small graph and represents the set of red tide monitoring data of one stage; the stage information is stored in the vertex label, and there are five stages in total. An edge of the large graph represents the degree of association between the monitoring data of two stages. Each vertex of a small graph represents one red tide data record, and its edges are built from the association between records in time and location. The constructed graph model is shown in Fig. 2.

Definition 4: Red tide data graph model. An undirected graph G = (V_G, E, M_v). V_G = {v_i, i = 1, 2, ..., 5} is the set of vertices, and the red tide has |V| stages in total; E = {e_ij, i, j = 1, 2, ..., |V|; i ≠ j; e_ij = e_ji} is the set of edges; M_v = {s_c} is the label of v and indicates the stage of the red tide. In the present invention, c = 5.

Definition 5: Degree of association between red tide stages. The association between stages is constructed by Formula 6 and stored in the edges E of the large graph.

Definition 6: Red tide data graph of each stage. V_G = (v, e, m_v), where v = {v'_i, i = 1, 2, ..., n} are the red tide data records of the current stage, e = {e'_ij} are the edges between v'_i and v'_j, and m_v is the label on v, representing the time, location and attribute information of one red tide data record.

Definition 7: Degree of association between red tide data records. It is constructed from time and location information and stored in the edges e of the small graph.
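The two-level structure of Definitions 4 to 7 can be sketched with plain Python data classes. This is only an illustration under stated assumptions: the class and field names, the time unit (days) and the thresholds `max_dt` and `max_deg` used to link two records are hypothetical, and the stage-to-stage edge weights are left as `None` placeholders rather than computed.

```python
from dataclasses import dataclass, field
from itertools import combinations

STAGES = ["no red tide", "initial", "development", "maintenance", "extinction"]

@dataclass
class Record:
    """One red tide data record: a vertex of a small graph (label m_v)."""
    time: float                      # assumed unit: days
    lon: float
    lat: float
    attrs: dict = field(default_factory=dict)

@dataclass
class StageGraph:
    """Small graph holding the records of one stage: a vertex of the large graph."""
    label: str
    records: list = field(default_factory=list)
    edges: set = field(default_factory=set)   # undirected pairs (i, j) with i < j

    def add(self, rec, max_dt=1.0, max_deg=0.1):
        """Insert a record and link it to records close in time and location."""
        idx = len(self.records)
        for j, other in enumerate(self.records):
            if (abs(rec.time - other.time) <= max_dt
                    and abs(rec.lon - other.lon) <= max_deg
                    and abs(rec.lat - other.lat) <= max_deg):
                self.edges.add((j, idx))
        self.records.append(rec)
        return idx

def build_big_graph():
    """Large graph: one StageGraph per stage and one undirected edge per stage pair."""
    vertices = {s: StageGraph(s) for s in STAGES}
    edges = {frozenset(pair): None for pair in combinations(STAGES, 2)}
    return vertices, edges
```

Using a `frozenset` as the edge key makes e_ij and e_ji the same entry, which matches the undirected graph of Definition 4.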

The present invention was tested on monitoring data from the Yangtze River estuary covering the period from 2012 to May 2014. FCM, PCM and the improved DWFCM algorithm of the present invention were each used to classify the data, which were then stored in the graph model built by the present invention.

Table 1. Comparison of FCM, PCM and DWFCM classification results

As Table 1 shows, the DWFCM method used by the present invention achieves higher accuracy with fewer iterations, classifying red tide data more accurately and quickly.

Stored data are valuable only when applied in practice. The red tide data were stored both in a relational model and in the graph model used by the present invention, the same queries were run against both, and the results were compared.

(1) Location keyword query: Location information comprises longitude and latitude. In actual data acquisition, because of measurement error and practical conditions, the actual position may deviate from the station position. In the relational model, a query by location keyword may fail to find the relevant data; in the graph model, because the associations between records are established in advance, information for neighbouring positions can be retrieved from a given position.

(2) Time keyword query: Red tide data form a time series. In the relational model, a query for one specific time returns only a single record. In the graph model, all data within the associated time range can be retrieved, which helps red tide forecasters produce a complete forecast.

(3) Environmental factor keyword query: In the relational model, an environmental factor keyword retrieves all data matching that factor, which takes a great deal of time. Red tide is produced by the combined action of multiple environmental factors, and a single environmental factor cannot give forecasters effective decision support. In the graph model, by contrast, data for the related environmental factors can be retrieved through the associations and are organized by red tide development stage, with higher query efficiency.
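The location query in (1) can be illustrated with a small sketch: the record nearest to the requested position is located first, and the records linked to it by pre-built edges are returned with it, so an inexact position still retrieves the neighbouring data. The tuple layout `(lon, lat, payload)`, the edge representation and the function name `neighbours_of` are assumptions made for illustration.

```python
import math

def neighbours_of(position, records, edges):
    """Return the record nearest to `position` plus all records linked to it.

    records: list of (lon, lat, payload) tuples
    edges:   set of undirected index pairs (i, j)
    """
    # Nearest stored record to the requested (lon, lat).
    start = min(range(len(records)),
                key=lambda i: math.dist(position, records[i][:2]))
    # Follow every edge that touches the nearest record.
    linked = {j for i, j in edges if i == start} | {i for i, j in edges if j == start}
    return [records[k] for k in sorted({start} | linked)]
```

The same one-hop traversal applies to the time query in (2) if edges are read as links between records close in time.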

The red tide early warning method based on graph model construction of the present invention classifies red tide data by development stage and stores them in the graph model by stage. Red tide forecasters can quickly determine the stage a red tide is in and propose control measures targeted at the current stage, reducing economic and ecological losses. The method helps forecasters find the data they need accurately and quickly, provides them with fast, accurate and comprehensive data support, and allows current data to be compared against historical data to predict the development stage of a red tide. Given a set of red tide data stored in the graph model used by the present invention, the stage of the current red tide can be determined quickly and appropriate measures taken. When forecasters want to query all historical red tide data for a given day or a given sea area, accurate data can be retrieved quickly.

A graph data model represents and stores information through vertices, edges and their attributes. Its flexibility allows data with complex associations to be stored more naturally, gives red tide forecasting and early warning personnel efficient decision support, and provides a new storage model for red tide monitoring data. The present invention classifies the data by development stage using the algorithm, takes the data of each stage as one vertex of the large graph, and builds the edges between the stages. Each record within a vertex of the large graph is then made a vertex of a small graph, and the edges of the small graph are built from similarity in time and geographic location, so that the red tide monitoring data are stored in a graph model.

The present invention uses the DWFCM algorithm designed for red tide monitoring data to classify the data and stores the classified data in the graph model, which facilitates the lookup of red tide data and achieves fast and accurate forecasting. The red tide monitoring data are first preprocessed by impact factor weight calculation and dimensionality reduction. The DWFCM algorithm suited to red tide data classification is then obtained by improving cluster centre selection, weighting the Euclidean distance and improving the objective function. Finally, the classified red tide monitoring data are stored in the graph model, with edges establishing the relationships between the data of each stage. Experiments show that the classification algorithm used by the present invention runs faster and is more accurate when classifying red tide monitoring data, and that the graph model improves query efficiency and provides efficient decision support for red tide forecasting and early warning.

The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make improvements and additions without departing from the method of the present invention, and such improvements and additions shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A red tide early warning method based on graph model construction, characterized in that the method comprises the following steps:

Step S1, data preprocessing;

Step S11, weight calculation;

Step S12, environmental factor dimensionality reduction;

Step S2, executing the DWFCM clustering algorithm;

Step S21, improving cluster centre selection;

Step S22, weighting the Euclidean distance;

Step S23, improving the objective function;

Step S3, building the red tide graph data model.

2. The red tide early warning method based on graph model construction according to claim 1, characterized in that in step S1 the sample data are normalized so that the data are mapped into the interval [0, 1], with the following normalization function:
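A common function that maps each attribute into [0, 1] is min-max scaling, x' = (x - min) / (max - min). The sketch below applies it column by column. Treating the normalization function of this claim as min-max scaling is an assumption, as are the function name `min_max_normalize` and the handling of constant columns.

```python
import numpy as np

def min_max_normalize(X):
    """Scale every column of X linearly into the interval [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    # A constant column has zero span; map it to 0 instead of dividing by zero.
    span[span == 0] = 1.0
    return (X - lo) / span
```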

3. The red tide early warning method based on graph model construction according to claim 1, characterized in that the weight calculation in step S11 is specifically as follows:

where the constant ensures that 0 ≤ E_j ≤ 1, that is, E_j is at most 1. When the contributions of the sample data under an attribute tend to be equal, E_j tends to 1; when the sample data under the attribute are all equal, the attribute's role in the decision is disregarded and its weight is zero.
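The behaviour described for E_j matches the standard entropy weight method, in which the entropy of attribute j is scaled by the constant 1/ln(n) and the attribute weight is proportional to 1 - E_j. The sketch below implements that standard method. Identifying this claim's calculation with it, the value 1/ln(n) of the constant, and the function name `entropy_weights` are assumptions.

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method for non-negative data X of shape (n samples, p attributes).

    Returns (E, w): the entropy E_j of each attribute and its normalised weight.
    Assumes every column has a positive sum and at least one attribute is not constant.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Contribution of sample i under attribute j.
    P = X / X.sum(axis=0)
    # Use the convention 0 * log(0) = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log(P), 0.0)
    # Scaling by 1 / ln(n) keeps 0 <= E_j <= 1.
    E = -(P * logP).sum(axis=0) / np.log(n)
    d = 1.0 - E
    return E, d / d.sum()
```

An attribute whose values are identical across all samples gets E_j = 1 and therefore weight zero, which is the case singled out in the claim.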

4. The red tide early warning method based on graph model construction according to claim 1, characterized in that in step S12 the feature weight of each environmental factor during one red tide event is calculated, the high-dimensional environmental factors are reduced in dimension, the samples corresponding to the two highest-ranked environmental factors are selected, and the samples after dimensionality reduction are used for the subsequent classification of red tide monitoring data.
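Selecting the two highest-ranked environmental factors amounts to keeping the two columns with the largest weights. A minimal sketch, assuming the weights are already available as a vector; the function name `select_top_factors` and the parameter `k` are hypothetical:

```python
import numpy as np

def select_top_factors(X, weights, k=2):
    """Keep the k columns of X with the largest weights.

    Returns the reduced data and the kept column indices in their original order.
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Indices of the k largest weights, then restored to column order.
    top = np.argsort(weights)[::-1][:k]
    keep = np.sort(top)
    return X[:, keep], keep
```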

5. The red tide early warning method based on graph model construction according to claim 1, characterized in that the improvement of cluster centre selection in step S12 is specifically:

Step S121, let the data set X = {x_i, i = 1, ..., n} be the sample set, and set the minimum distance threshold α between classes;

Step S122, input: data set X; output: c cluster centres;

Step S123, compute the Euclidean distance between every pair of samples to generate the distance matrix D, assign the two closest samples to one class, and take the midpoint of the two samples as the first cluster centre;

Step S123, set the minimum distance threshold α, use the distance matrix D to find the samples whose distance to both samples of the first class is greater than α, assign the two of these samples with the smallest weighted distance to one class, and take the midpoint of the two samples as the second cluster centre;

Step S124, similarly, among the remaining samples find those whose distance to every sample already selected is greater than α, assign the two closest of these samples to one class, and take their midpoint as its cluster centre;

Step S125, repeat step S124 until c cluster centres have been found; if no sample satisfying the distance condition for α remains, α may be reduced appropriately.

6. The red tide early warning method based on graph model construction according to claim 1, characterized in that in step S22 a weight is assigned to each environmental factor and the weights are combined with the Euclidean distance to obtain a weighted Euclidean distance, giving each sample a more decisive membership degree, specifically as follows:

Step S221, define the correlation coefficient r_ij, the correlation coefficient of samples x_i and x_j, i = 1, 2, ..., n, j = 1, 2, ..., m, with r_ij = r_ji;

Step S222, define the weighted Euclidean distance d_ij, $d_{ij}^2 = \|(x_i - x_j) + r_{ij}(y_i - y_j)\|^2$;

Step S223, define the weighted distance matrix D, the matrix of weighted Euclidean distances between any two samples in X, where d_ij is the weighted Euclidean distance between samples x_i and x_j.

7. The red tide early warning method based on graph model construction according to claim 1, characterized in that in step S3 the objective function that takes into account the distances between sample data and cluster centres is:

where 0 ≤ β ≤ 1, a, b ∈ c, and d_ij is the weighted Euclidean distance between v_i and v_j;

The objective function must satisfy the following constraints:

μ_ik ∈ [0, 1],

Using the method of Lagrange multipliers, the partial derivatives of J_m(U, V, η) with respect to the membership degrees and cluster centres are taken and set equal to zero, giving the following update formulas for the membership degrees and cluster centres:
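For orientation, the same Lagrange multiplier argument applied to the standard FCM objective (without the inter-centre term of the improved function) runs as follows. This is the textbook derivation under the assumed constraint that each sample's memberships sum to 1; it is shown as an illustration of the technique and is not the update formula of this claim.

$$
J_m(U, V, \eta) = \sum_{i=1}^{c} \sum_{k=1}^{n} \mu_{ik}^{m} \, \|x_k - v_i\|^2 + \sum_{k=1}^{n} \eta_k \Bigl( 1 - \sum_{i=1}^{c} \mu_{ik} \Bigr)
$$

$$
\frac{\partial J_m}{\partial \mu_{ik}} = m \, \mu_{ik}^{m-1} \|x_k - v_i\|^2 - \eta_k = 0
\quad \Longrightarrow \quad
\mu_{ik} = \left( \frac{\eta_k}{m \, \|x_k - v_i\|^2} \right)^{\frac{1}{m-1}}
$$

Substituting into $\sum_{i=1}^{c} \mu_{ik} = 1$ eliminates $\eta_k$:

$$
\mu_{ik} = \frac{1}{\sum_{l=1}^{c} \left( \dfrac{\|x_k - v_i\|}{\|x_k - v_l\|} \right)^{\frac{2}{m-1}}}
$$

$$
\frac{\partial J_m}{\partial v_i} = -2 \sum_{k=1}^{n} \mu_{ik}^{m} (x_k - v_i) = 0
\quad \Longrightarrow \quad
v_i = \frac{\sum_{k=1}^{n} \mu_{ik}^{m} \, x_k}{\sum_{k=1}^{n} \mu_{ik}^{m}}
$$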

8. The red tide early warning method based on graph model construction according to claim 7, characterized in that the algorithm steps after improving the objective function are as follows:

Step S231, input: data set X, number of cluster centres c, termination threshold ε, fuzzy weighting exponent m, minimum weighted distance threshold α, iteration counter k = 1; output: c clusters;

Step S232, according to the minimum weighted distance threshold α and the number of cluster centres c, compute the weighted distance matrix D and find c cluster centres;

Step S233, reset the iteration counter k = 1 and use the result of step S232 as the initial cluster centres v_i, i = 1, 2, ..., c;

Step S235, from μ_ij(k), update the objective function and the cluster centres v_i(k+1) using the formula;

Step S236, if ‖v_i(k+1) - v_i(k)‖ ≥ ε, set k = k + 1 and return to step three; otherwise, end the loop and output the clustering result.

9. The red tide early warning method based on graph model construction according to claim 7, characterized in that step S4 specifically comprises the following steps:

Step S41, dividing the red tide data into five classes: the no-red-tide stage, the initial stage, the development stage, the maintenance stage and the extinction stage;

Step S42, storing the classified red tide data in the graph model, with vertices and edges built as a large graph and small graphs: each vertex of the large graph is a small graph and represents the set of red tide monitoring data of one stage, the stage information is stored in the label, and there are five stages in total; an edge of the large graph represents the degree of association between the monitoring data of two stages; each vertex of a small graph represents one red tide data record, and its edges are built from the association between records in time and location;

Step S43, defining the red tide data graph model as an undirected graph G = (V_G, E, M_v), where V_G = {v_i, i = 1, 2, ..., 5} is the set of vertices, the red tide has |V| stages in total, E = {e_ij, i, j = 1, 2, ..., |V|; i ≠ j; e_ij = e_ji} is the set of edges, and M_v = {s_c} is the label of V and indicates the stage of the red tide, with c = 5;

Step S44, defining the red tide data graph of each stage: V_G = (v, e, m_v), where v = {v'_i, i = 1, 2, ..., n} are the red tide data records of the current stage, e = {e'_ij} are the edges between v'_i and v'_j, and m_v is the label on v, representing the time, location and attribute information of one red tide data record;

Step S42, defining the degree of association between red tide data records: it is constructed from time and location information and stored in the edges e of the small graph.