CN114153890A - Marine environment noise and hydrological meteorological association relation mining method based on Apriori - Google Patents
- Publication number
- CN114153890A
- Authority
- CN
- China
- Prior art keywords
- marine environment
- noise
- data
- marine
- environment noise
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
An Apriori-based method for mining association relationships between marine environmental noise and hydrometeorology, belonging to the field of data mining of marine environmental noise and hydrometeorological data. The method first acquires marine environmental noise data and hydrometeorological data and preprocesses them, then performs FFT analysis on the preprocessed noise to obtain the environmental noise spectrum level. The noise spectrum level, wind speed, and temperature are then graded into levels to obtain a marine environment transaction set T. After the transaction set T is discretized, association rules are mined with the Apriori algorithm to discover strong associations between marine environmental noise and hydrometeorology. The advantages of the invention are: the correlation between marine environmental noise and hydrometeorology is mined with the Apriori algorithm; the influence of a given hydrometeorological element on changes in the noise spectrum level can be mined for a specific purpose; and a confidence is given for the influence of that element on the noise spectrum level in a specific frequency band, which improves the accuracy of predicting the marine environmental noise spectrum level.
Description
Technical Field
The invention belongs to the field of data mining of marine environmental noise and hydrometeorological data. It relates to an Apriori-based method for mining association relationships between marine environmental noise and hydrometeorological data such as wind speed and temperature, and can be applied directly to the control of marine environmental noise pollution.
Background
Marine environmental noise pollution is widespread and difficult to control, and it is receiving increasing attention from countries and organizations. Its main influencing factors include shipping, wind, rainfall, hydrology, and marine life. These factors are the main sources of marine environmental noise in different frequency bands, and differences in the spatial and temporal distribution of the noise sources cause environmental noise levels to vary greatly. Because it is invisible, marine noise pollution has long been ignored, yet it seriously affects marine life and human underwater acoustic communication. It is therefore necessary to systematically study methods for mining the correlation between marine environmental noise and hydrometeorological elements such as wind speed and temperature in a typical sea area, in order to support the control of marine environmental noise pollution.
Previous research on the correlation between marine environmental noise and hydrometeorology has mostly started from the mechanisms by which individual hydrometeorological elements (such as wind and water temperature) affect the noise, and then combined measured data with conventional statistics, correlation analysis, and similar methods to obtain the frequency bands and ranges over which hydrometeorology influences the noise. This approach requires strong domain knowledge, and when the influence of a single element is studied it is difficult to exclude the influence of other factors. In addition, conventional statistics and correlation analysis can give only a rough range for the influence of a hydrometeorological element on the noise spectrum level in a specific frequency band, which limits the accuracy of spectrum level prediction.
Data mining is a technique for extracting potentially valuable knowledge hidden in massive, random, uncertain data. Association rule mining is one of its main branches, and its purpose is to discover associations between data items. The Apriori algorithm is a classical association rule mining algorithm. Its basic idea is to search level by level for the frequent item sets that meet a user-defined minimum support, and then to generate association rules from those frequent item sets. The algorithm is conceptually simple and easy to implement, and it is well suited to discovering knowledge hidden in large volumes of marine environment data.
The core problem addressed by the invention is how to mine the correlation between marine environmental noise and hydrometeorology with the Apriori algorithm. Using a large amount of acquired marine environmental noise and hydrometeorological data, the method first finds the frequent item sets whose support exceeds the user-defined minimum support, then derives from them the association rules whose confidence exceeds the user-defined minimum confidence, and finally obtains the association relationships between marine environmental noise and hydrometeorological elements.
Disclosure of Invention
The invention aims to provide an Apriori-based method for mining association relationships between marine environmental noise and hydrometeorology, so as to mine the associations that influence changes in the marine environmental noise spectrum level and solve the problems described in the Background.
The technical scheme of the invention is as follows:
1. Continuously acquire a large amount of marine environmental noise data and synchronously acquire hydrometeorological data such as wind speed and temperature. Cut the acquired noise time-domain signal into segments according to a fixed time rule, perform FFT analysis on the preprocessed noise signal with a frequency resolution of 1 Hz, and calculate the marine environmental noise spectrum level.
2. Use fuzzy C-means clustering to grade the marine environmental noise spectrum level, wind speed, and temperature data. That is, divide the data into several intervals and replace all data falling within an interval with a single symbol. For example, wind speed between 0.1 and 3.3 m/s is divided into 3 intervals, denoted FS_1 (wind speed low), FS_2 (wind speed medium), and FS_3 (wind speed high). Then obtain a marine environment transaction set T from the grading results (any record in the marine environment database is called a transaction).
3. Map the marine environment transaction set T obtained in step 2. For each marine environment transaction, check whether it contains each element of the item set; mark 1 if it does and 0 if it does not. This yields a 0-1 matrix covering all marine environment data items.
4. Apply the Apriori algorithm to the 0-1 matrix of marine environment data obtained in step 3 to mine the association rules between marine environmental noise and hydrometeorology.
5. Screen and analyze the resulting association rules against the user-defined minimum support and minimum confidence thresholds to find the strong associations between marine environmental noise and hydrometeorology.
The advantages of the invention are as follows. Using continuously observed marine environment data, the method mines the correlation between marine environmental noise and hydrometeorology with the Apriori algorithm. The influence of a given hydrometeorological element on changes in the noise spectrum level can be mined for a specific purpose, and a confidence is given for the influence of that element on the noise spectrum level in a specific frequency band, which improves the accuracy of predicting the marine environmental noise spectrum level.
Drawings
FIG. 1 is a flow chart of mining the association between marine environmental noise and hydrometeorology.
FIG. 2 is a schematic diagram of the frequent item sets in the marine environment transaction set.
Detailed Description
The embodiments of the invention are described in detail below with reference to the technical solution and FIG. 1.
1. Continuously acquire a large amount of marine environmental noise signals and synchronously acquire hydrometeorological data such as wind speed and temperature. Preprocess the noise signals and discard those with obvious interference. Then cut the preprocessed noise time-domain signal into segments according to a fixed time rule, apply a windowed sliding FFT (fast Fourier transform) to each segment, and calculate the marine environmental noise power spectrum, which is taken as the marine environmental noise spectrum level for that time segment. Let $p_l(t_i)$ be the marine environmental noise signal of the $l$-th data segment at time $t_i$. The Fourier transform of $p_l(t_i)$ is

$$P_{lk} = P_l(f_k) = \mathrm{FFT}\big(p_l(t_i)\big) \tag{1}$$

where $f_k = k f_s / N$, $k = 1, 2, \dots, N$, $f_s$ is the sampling frequency, and $N$ is the number of data points in each segment.
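Step 1 can be sketched in Python with NumPy as follows. This is a minimal illustration, not the patent's implementation: the function name `segment_spectrum_levels`, the Hann window, non-overlapping segments, and the decibel scaling are all assumptions, since the source says only that a windowed sliding FFT is applied to each segment.

```python
import numpy as np

def segment_spectrum_levels(signal, fs, segment_seconds=1.0):
    """Cut a time-domain signal into equal segments and return (freqs, levels_db).

    Hypothetical helper: the window type and dB scaling are assumptions.
    """
    n = int(round(fs * segment_seconds))            # N: data points per segment
    x = np.asarray(signal, dtype=float)
    n_segments = len(x) // n                        # drop any incomplete tail
    segments = x[: n_segments * n].reshape(n_segments, n)

    window = np.hanning(n)                          # assumed window function
    spectrum = np.fft.rfft(segments * window, axis=1)   # P_l(f_k), one row per segment

    # One-sided power spectral density, normalised by the window energy.
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    if n % 2 == 0:
        psd[:, 1:-1] *= 2.0                         # keep DC and Nyquist bins unscaled
    else:
        psd[:, 1:] *= 2.0

    freqs = np.fft.rfftfreq(n, d=1.0 / fs)          # f_k = k * fs / N
    levels_db = 10.0 * np.log10(psd + 1e-30)        # small offset avoids log(0)
    return freqs, levels_db
```

With `segment_seconds=1.0` the bin spacing is `fs / N = 1` Hz, which matches the 1 Hz frequency resolution stated in the technical scheme. Each row of `levels_db` is the spectrum level of one time segment.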
2. Using fuzzy C-means clustering, grade the marine environmental noise spectrum level, wind speed, temperature, and maximum wind speed data separately. Specifically, each variable is divided into 3 intervals, and all data falling within an interval are replaced by a single symbol. The marine environmental noise spectrum level is described by three levels: PINPU_1 (spectrum level low), PINPU_2 (spectrum level medium), and PINPU_3 (spectrum level high). The wind speed is described by three levels: FS_1 (wind speed low), FS_2 (wind speed medium), and FS_3 (wind speed high). The temperature is described by three levels: WENDU_1 (temperature low), WENDU_2 (temperature medium), and WENDU_3 (temperature high). The maximum wind speed is described by three levels: ZDFS_1 (maximum wind speed low), ZDFS_2 (maximum wind speed medium), and ZDFS_3 (maximum wind speed high). A transaction list for the marine environment database is then obtained from the grading results (any record in the marine environment database is called a transaction). Each transaction is denoted T, as shown in Table 1.
TABLE 1 Marine environment transaction set
Transaction sequence number | Marine environment transaction item list |
---|---|
T1 | FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1 |
T2 | FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1 |
T3 | FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1 |
T4 | FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1 |
T5 | FS_2,ZDFS_2,FX_1,WENDU_1,PINPU_1 |
T6 | FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1 |
T7 | FS_1,ZDFS_1,FX_2,WENDU_3,PINPU_1 |
T8 | FS_2,ZDFS_2,FX_2,WENDU_1,PINPU_1 |
T9 | FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1 |
... | ... |
T72 | FS_3,ZDFS_3,FX_1,WENDU_3,PINPU_1 |
... | ... |
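The grading in step 2 can be sketched as a one-dimensional fuzzy C-means followed by a hard assignment to the nearest cluster centre. This is only an illustration under stated assumptions: the function names `fuzzy_c_means_1d` and `to_symbols`, the fuzzifier `m = 2`, the evenly spaced initial centres, and the use of maximum membership to pick a level are not specified in the source.

```python
import numpy as np

def fuzzy_c_means_1d(values, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Plain fuzzy C-means on scalar data; returns (sorted centres, level labels)."""
    x = np.asarray(values, dtype=float)
    # Assumed deterministic start: centres spread evenly over the data range.
    centers = np.linspace(x.min(), x.max(), n_clusters)
    for _ in range(n_iter):
        dist = np.abs(x[:, None] - centers[None, :]) + 1e-12   # shape (n, c)
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)                # membership degrees
        um = u ** m
        new_centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        converged = np.max(np.abs(new_centers - centers)) < tol
        centers = new_centers
        if converged:
            break
    centers = np.sort(centers)
    # Highest membership is equivalent to the nearest centre.
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1) + 1
    return centers, labels

def to_symbols(values, prefix, n_clusters=3):
    """Replace each value by a level symbol such as FS_1, FS_2, FS_3."""
    _, labels = fuzzy_c_means_1d(values, n_clusters)
    return [f"{prefix}_{k}" for k in labels]
```

Calling `to_symbols(wind_speed, "FS")`, `to_symbols(temperature, "WENDU")`, and so on, and then zipping the resulting lists record by record, would give transactions in the form shown in Table 1.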
3. Discretization is an important step before association rule mining: if the values in a data set are continuous, they must be discretized before mining. Marine environment data are mainly continuous, so discretization is required. To do so, the transaction set T in the marine environment database is first mapped: for each marine environment transaction, check whether it contains each element of the item set; mark 1 if it does and 0 otherwise. This yields a 0-1 matrix covering all marine environment transaction items.
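The mapping in step 3 amounts to one-hot encoding each transaction over the full item set. A minimal sketch follows, where the function name `to_binary_matrix` and the alphabetical column order are assumptions:

```python
import numpy as np

def to_binary_matrix(transactions):
    """Map a list of transactions (iterables of item symbols) to a 0-1 matrix.

    Rows are transactions, columns are items in sorted order.
    """
    items = sorted({item for t in transactions for item in t})
    column = {item: j for j, item in enumerate(items)}
    matrix = np.zeros((len(transactions), len(items)), dtype=np.uint8)
    for i, t in enumerate(transactions):
        for item in t:
            matrix[i, column[item]] = 1      # 1 if the transaction contains the item
    return items, matrix

# Rows T1, T2 and T5 of Table 1.
example = [
    ["FS_1", "ZDFS_2", "FX_1", "WENDU_1", "PINPU_1"],
    ["FS_1", "ZDFS_1", "FX_1", "WENDU_1", "PINPU_1"],
    ["FS_2", "ZDFS_2", "FX_1", "WENDU_1", "PINPU_1"],
]
items, matrix = to_binary_matrix(example)
```

In this representation the column means, `matrix.mean(axis=0)`, are the supports of the individual items, which is what the first Apriori pass needs.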
4. Using the user-defined minimum support and minimum confidence thresholds, mine the association relationships between the discretized marine environmental noise spectrum level and the hydrometeorological data with the Apriori association rule mining method, as shown in FIG. 2. The specific flow is as follows:
(1) Set the minimum support min_sup to 0.2 and the minimum confidence min_conf to 0.5. Support and confidence measure the strength of the association between marine environmental noise and hydrometeorology. The support is calculated as

$$\mathrm{support}(A) = P(A) \tag{2}$$

Equation (2) represents the probability that item set A occurs in the marine environment transaction set. The confidence is calculated as

$$\mathrm{confidence}(A \Rightarrow B) = P(B \mid A) \tag{3}$$

Equation (3) represents the conditional probability that item set B occurs given that item set A occurs, where $A = \{a_1, a_2, a_3, \dots, a_n\}$ and $B = \{b_1, b_2, b_3, \dots, b_n\}$ are sets of marine environmental noise and hydrometeorological elements.
(2) Scan all marine environmental noise and hydrometeorological data, generate the set of candidate 1-item sets C1, and count each candidate. Because the minimum support is set to 0.2, the items FX_3 and WENDU_3 are deleted from C1. This determines the set of frequent 1-item sets L1, which consists of the candidate 1-item sets whose support is greater than or equal to the minimum support.
(3) To find the set of frequent 2-item sets L2, the algorithm combines the elements of L1 pairwise to produce the set of candidate 2-item sets C2, and then obtains L2 according to the minimum support.
(4) The elements of L2 are combined with each other to produce the set of candidate 3-item sets C3, from which L3 is obtained according to the minimum support.
(5) The algorithm combines the elements of L3 with each other to produce the set of candidate 4-item sets C4: L3 ⋈ L3 = {{FS_2, ZDFS_2, WENDU_1, PINPU_3}}. By the Apriori property (every subset of a frequent item set must itself be frequent), this item set is deleted because its subset {ZDFS_2, WENDU_1, PINPU_3} is not frequent. C4 is therefore empty, so the algorithm terminates, having found all the frequent item sets. For the frequent set L3, the non-empty subsets generated are {FS_2}, {ZDFS_2}, {WENDU_1}, {PINPU_3}, {FS_2, ZDFS_2}, {FS_2, WENDU_1}, {FS_2, PINPU_3}, {ZDFS_2, WENDU_1}, {ZDFS_2, PINPU_3}, {WENDU_1, PINPU_3}.
(6) From the frequent item sets of the marine environment data obtained in step (5), calculate the confidence for each non-empty subset and find the association rules that meet the minimum confidence threshold, namely rules of the form $A \Rightarrow B$, where A is the rule antecedent and B is the rule consequent. The resulting strong association rules and their confidences are shown in Table 2.
TABLE 2 Association rules between marine environmental noise and hydrometeorology, with confidence
Rule antecedent | Rule consequent | Confidence |
---|---|---|
{FS_2,PINPU_3} | ZDFS_2 | 94.11% |
{FS_2,WENDU_1} | ZDFS_2 | 93.75% |
{ZDFS_2,PINPU_3} | FS_2 | 88.89% |
{ZDFS_2,WENDU_1} | FS_2 | 75% |
{FS_2,ZDFS_3} | PINPU_3 | 72.73% |
{FS_2,ZDFS_3} | WENDU_1 | 68.18% |
FS_2 | {ZDFS_2,PINPU_3} | 66.67% |
FS_2 | {ZDFS_2,WENDU_1} | 62.5% |
ZDFS_2 | {FS_2,PINPU_3} | 50% |
According to the association rules in Table 2, FS_2 and ZDFS_3 have a strong influence on PINPU_3. That is, when the wind speed is between 1.5 and 2.1 (FS_2) and the maximum wind speed is between 2.9 and 4.9 (ZDFS_3), the probability that the marine environmental noise spectrum level falls in the range 71.6117 to 76.0281 (the PINPU_3 level) is 72.73%. Rules that exceed the confidence threshold can therefore be used to predict the marine environmental noise spectrum level from the range of the wind speed.
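The flow in sub-steps (1) to (6) can be sketched in plain Python as below. This is a compact illustration rather than the patent's implementation: the function names are hypothetical, transactions are held as sets of symbols instead of rows of the 0-1 matrix, and candidates are formed by taking unions of frequent item sets of the previous level before pruning with the Apriori property.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset` (equation 2)."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def apriori(transactions, min_sup):
    """Return {frequent item set: support}, searching level by level."""
    transactions = [frozenset(t) for t in transactions]
    frequent = {}
    level = []
    for item in sorted({i for t in transactions for i in t}):   # candidate 1-item sets C1
        s = support(frozenset([item]), transactions)
        if s >= min_sup:
            frequent[frozenset([item])] = s
            level.append(frozenset([item]))
    k = 2
    while level:
        previous = set(level)
        # Candidate k-item sets Ck: unions of two frequent (k-1)-item sets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = []
        for c in candidates:
            # Apriori property: every (k-1)-subset must already be frequent.
            if all(frozenset(s) in previous for s in combinations(c, k - 1)):
                s = support(c, transactions)
                if s >= min_sup:
                    frequent[c] = s
                    level.append(c)
        k += 1
    return frequent

def association_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) triples, strongest first."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                conf = sup / frequent[a]        # equation 3: P(B | A)
                if conf >= min_conf:
                    rules.append((a, itemset - a, conf))
    return sorted(rules, key=lambda rule: -rule[2])
```

A call such as `association_rules(apriori(transactions, 0.2), 0.5)` applies the thresholds used in this embodiment. The lookup `frequent[a]` is safe because every subset of a frequent item set is itself frequent and has therefore been recorded at an earlier level.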
Claims (1)
1. An Apriori-based method for mining association relationships between marine environmental noise and hydrometeorology, characterized by comprising the following steps:
(1) continuously acquiring a large amount of marine environmental noise data and synchronously acquiring hydrometeorological data such as wind speed and temperature, cutting the acquired marine environmental noise time-domain signal into a plurality of segments according to a fixed time rule, performing FFT analysis on the preprocessed marine environmental noise signal, and calculating the marine environmental noise spectrum level;
(2) grading the marine environmental noise spectrum level, wind speed, and temperature data by using a fuzzy C-means clustering method, namely dividing the data into a plurality of intervals and replacing all data falling within an interval with a single symbol, and then obtaining a marine environment transaction set T from the grading results of the marine environment data;
(3) mapping the marine environment transaction set T obtained in step (2), wherein the mapping comprises judging whether each marine environment transaction contains each element of the item set, marking 1 if it does and 0 if it does not, so that a 0-1 matrix covering all marine environment data items is obtained;
(4) mining the 0-1 matrix of marine environment data obtained in step (3) with the Apriori algorithm to obtain the association rules between marine environmental noise and hydrometeorology;
(5) screening and analyzing the obtained association rules according to the user-defined minimum support and minimum confidence thresholds to find the strong association relationships between marine environmental noise and hydrometeorology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010977512.9A CN114153890A (en) | 2020-09-07 | 2020-09-07 | Marine environment noise and hydrological meteorological association relation mining method based on Apriori |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114153890A true CN114153890A (en) | 2022-03-08 |
Family
ID=80462215
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010977512.9A Pending CN114153890A (en) | 2020-09-07 | 2020-09-07 | Marine environment noise and hydrological meteorological association relation mining method based on Apriori |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114153890A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023202008A1 (en) * | 2022-04-19 | 2023-10-26 | 中国科学院声学研究所 | Marine environment noise forecasting method, computer device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20220308 |