CN114153890A - Marine environment noise and hydrological meteorological association relation mining method based on Apriori - Google Patents

Publication number
CN114153890A
Authority
CN
China
Prior art keywords
marine environment
noise
data
marine
environment noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010977512.9A
Other languages
Chinese (zh)
Inventor
曹琳
彭圆
车树伟
张学刚
牟林
胡文帅
张钊辉
胡晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
760 RESEARCH INSTITUTE OF CSIC
China State Shipbuilding Corp Ltd
Original Assignee
760 RESEARCH INSTITUTE OF CSIC
China State Shipbuilding Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 760 RESEARCH INSTITUTE OF CSIC, China State Shipbuilding Corp Ltd
Priority to CN202010977512.9A
Publication of CN114153890A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Abstract

An Apriori-based method for mining association relationships between marine environment noise and hydrometeorology, belonging to the field of data mining of marine environment noise and hydrometeorological data. The method first acquires marine environment noise and hydrometeorological data, preprocesses them, and performs FFT analysis on the preprocessed marine environment noise to obtain the environment noise spectrum level. The noise spectrum level, the wind speed and the temperature are then characterized by grade to obtain a marine environment transaction set T. After the transaction set T is discretized, association rules are mined with the Apriori algorithm to discover strong association relationships between marine environment noise and hydrometeorology. The advantages of the invention are as follows: the association between marine environment noise and hydrometeorology is mined with the Apriori algorithm; the influence of a given hydrometeorological element on changes in the marine environment noise spectrum level can be mined for a specific purpose; the confidence with which that element influences the noise spectrum level in a specific frequency band is given; and the accuracy of marine environment noise spectrum level prediction is improved.

Description

Marine environment noise and hydrological meteorological association relation mining method based on Apriori
Technical Field
The invention belongs to the field of data mining of marine environment noise and hydrometeorological data, and relates to an Apriori-based method for mining association relationships between marine environment noise and hydrometeorological data such as wind speed and temperature. The method can be applied directly to the control of marine environment noise pollution.
Background
Noise pollution of the marine environment is widespread and difficult to control, and it is receiving increasing attention from countries and organizations. The main influencing factors include shipping, wind, rainfall, hydrology and marine life. These factors are the main sources of marine environment noise in different frequency bands, and differences in the spatial and temporal distribution of the noise sources cause the environment noise level to vary greatly. Because it is invisible, marine noise pollution was long ignored, yet it seriously affects marine life and human underwater acoustic communication. It is therefore necessary to systematically study methods for mining the association between marine environment noise and hydrometeorological elements such as wind speed and temperature in typical sea areas, in order to support the control of marine environment noise pollution.
Previous studies of the relationship between marine environment noise and hydrometeorology mostly started from the mechanism by which individual hydrometeorological elements (such as wind and water temperature) influence the noise, and then combined measured data with conventional statistics, correlation analysis and similar methods to obtain the frequency band and range over which hydrometeorology affects the noise. This approach requires strong domain knowledge, and when the influence of a single hydrometeorological element is studied, the influence of other factors is not easy to exclude. In addition, conventional statistics and correlation analysis can only give a rough range for the influence of a hydrometeorological element on the noise spectrum level in a specific frequency band, which limits the accuracy of marine environment noise spectrum level prediction.
Data mining is a technique for extracting potentially valuable knowledge hidden in massive, random and uncertain data. Association rule mining is one of its main branches, and its purpose is to discover associations between data items. The Apriori algorithm is a classical association rule mining algorithm. Its basic idea is to search level by level for frequent item sets that meet the minimum support set by the user, and then to generate association rules from those frequent item sets. The Apriori algorithm is conceptually simple and easy to implement, and is well suited to discovering knowledge hidden in large amounts of marine environment data.
The core problem addressed by the invention is how to mine the association between marine environment noise and hydrometeorology with the Apriori algorithm. Using a large amount of acquired marine environment noise and hydrometeorological data, the method first finds the frequent item sets whose support is greater than the minimum support set by the user, then finds among them the association rules whose confidence is greater than the minimum confidence set by the user, and finally obtains the association relationship between marine environment noise and the hydrometeorological elements.
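The two thresholds mentioned above can be illustrated with a short calculation. The following is a minimal sketch rather than the applicants' implementation: it assumes the nine transaction rows that are listed in Table 1 of the Detailed Description, and the function names `support` and `confidence` are illustrative. Because only nine rows are used, the numbers do not reproduce the results reported for the full data set.

```python
# Nine transaction rows copied from Table 1 (T1 to T9); each row is one transaction.
ROWS = [
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_2,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_2,WENDU_3,PINPU_1",
    "FS_2,ZDFS_2,FX_2,WENDU_1,PINPU_1",
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
]
transactions = [frozenset(row.split(",")) for row in ROWS]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Conditional frequency of `consequent` among transactions containing `antecedent`."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


sup_fs2 = support({"FS_2"}, transactions)                        # 2 of 9 rows
conf_forward = confidence({"FS_2"}, {"ZDFS_2"}, transactions)    # 2 of 2 rows
conf_backward = confidence({"ZDFS_2"}, {"FS_2"}, transactions)   # 2 of 5 rows
```

On these nine rows, FS_2 occurs in two transactions and both of them also contain ZDFS_2, so the rule FS_2 => ZDFS_2 has confidence 1, while the reverse rule has confidence 2/5 and would be rejected by a minimum confidence of 0.5.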
Disclosure of Invention
The invention aims to provide an Apriori-based method for mining the association between marine environment noise and hydrometeorology, so as to discover the association relationships that influence changes in the marine environment noise spectrum level and thereby solve the problems described in the Background.
The technical scheme of the invention is as follows:
1. Continuously acquire a large amount of marine environment noise data and synchronously acquire hydrometeorological data such as wind speed and temperature. Preprocess the noise signal, cut the marine environment noise time-domain signal into a number of segments according to a fixed time rule, and perform FFT analysis on the preprocessed signal with a frequency resolution of 1 Hz to calculate the marine environment noise spectrum level.
2. Use the fuzzy C-means clustering method to characterize the marine environment noise spectrum level, the wind speed and the temperature data by grade. That is, divide the data into several intervals and replace every value falling in a given interval with one specific symbol. For example, wind speeds between 0.1 and 3.3 m/s are divided into 3 intervals, represented by FS_1 (wind speed_low), FS_2 (wind speed_medium) and FS_3 (wind speed_high). Then obtain the marine environment transaction set T from the graded marine environment data (any record in the marine environment database is called a transaction).
3. Map the marine environment transaction set T obtained in step 2. The mapping checks whether each marine environment transaction contains each element of the item set: if it does, the corresponding entry is marked 1, and otherwise 0. This yields a 0-1 matrix covering all marine environment data items.
4. Mine the 0-1 matrix of marine environment data obtained in step 3 with the Apriori algorithm to obtain the association rules between marine environment noise and hydrometeorology.
5. Screen and analyze the resulting association rules according to the minimum support and minimum confidence thresholds set by the user, to find the strong association relationships between marine environment noise and hydrometeorology.
The invention has the following advantages: using continuously observed marine environment data, the association between marine environment noise and hydrometeorology is mined with the Apriori algorithm; the influence of a given hydrometeorological element on changes in the marine environment noise spectrum level can be mined for a specific purpose; the confidence with which that element influences the noise spectrum level in a specific frequency band is given; and the accuracy of marine environment noise spectrum level prediction is improved.
Drawings
FIG. 1 is a flow chart of mining the association between marine environment noise and hydrometeorology.
FIG. 2 shows a schematic diagram of a frequent itemset in a marine environment transaction set.
Detailed Description
The following describes the embodiments of the present invention in detail with reference to the technical scheme and FIG. 1.
1. Continuously acquire a large amount of marine environment noise signals and synchronously acquire hydrometeorological data such as wind speed and temperature. Preprocess the marine environment noise signals and discard signals with obvious interference. Then cut the preprocessed marine environment noise time-domain signal into a number of segments according to a fixed time rule, apply windowed sliding FFT (fast Fourier transform) processing to each segment, and calculate the marine environment noise power spectrum, which is taken as the marine environment noise spectrum level of that time segment. Let $p_l(t_i)$ be the marine environment noise signal of the $l$-th data segment at time $t_i$. The Fourier transform of $p_l(t_i)$ is
$P_{lk} = P_l(f_k) = \mathrm{FFT}(p_l(t_i))$ (1)
where $f_k = k f_s / N$, $k = 1, 2, \ldots, N$, $f_s$ is the sampling frequency, and $N$ is the number of data points in each segment.
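Step 1 can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the applicants' processing chain: the Hann window, the one-sided power spectral density scaling, the 1 µPa reference pressure (`p_ref`), and the function names are all assumptions that the text does not specify. A segment length of one second is used so that the frequency resolution is 1 Hz, as stated in the technical scheme.

```python
import numpy as np


def split_segments(signal, fs, seconds=1.0):
    """Cut a time-domain signal into consecutive segments of equal length."""
    seg_len = int(round(fs * seconds))
    n_seg = len(signal) // seg_len
    trimmed = np.asarray(signal[: n_seg * seg_len], dtype=float)
    return trimmed.reshape(n_seg, seg_len)


def spectrum_level_db(segment, fs, p_ref=1e-6):
    """One-sided power spectral density of one segment in dB re p_ref^2/Hz.

    Assumes `segment` is sound pressure in Pa and p_ref is 1 micropascal.
    """
    segment = np.asarray(segment, dtype=float)
    n = segment.size
    window = np.hanning(n)                      # assumed window choice
    spectrum = np.fft.rfft(segment * window)    # equation (1) for non-negative frequencies
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    # Double every bin except DC (and Nyquist when n is even) for a one-sided spectrum.
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    level = 10.0 * np.log10(np.maximum(psd, np.finfo(float).tiny) / p_ref ** 2)
    return freqs, level


# Synthetic check signal: a 100 Hz tone sampled at 1 kHz for 3 seconds.
fs = 1000.0
t = np.arange(int(3 * fs)) / fs
signal = np.sin(2 * np.pi * 100.0 * t)
segments = split_segments(signal, fs)
freqs, level = spectrum_level_db(segments[0], fs)
```

With one-second segments the frequency bins are spaced 1 Hz apart, and the spectrum level of the synthetic tone peaks in the 100 Hz bin.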
2. The marine environment noise spectrum level, wind speed and temperature data sets are each characterized by grade using the fuzzy C-means clustering method. Specifically, the data are divided into 3 intervals, and every value falling in a given interval is replaced with one specific symbol. The marine environment noise spectrum level is described by three levels: PINPU_1 (spectrum level_low), PINPU_2 (spectrum level_medium) and PINPU_3 (spectrum level_high). The wind speed is described by three levels: FS_1 (wind speed_low), FS_2 (wind speed_medium) and FS_3 (wind speed_high). The temperature is described by three levels: WENDU_1 (temperature_low), WENDU_2 (temperature_medium) and WENDU_3 (temperature_high). The maximum wind speed is described by three levels: ZDFS_1 (maximum wind speed_low), ZDFS_2 (maximum wind speed_medium) and ZDFS_3 (maximum wind speed_high). From the graded marine environment data, a transaction list of the marine environment database is obtained (any record in the marine environment database is called a transaction). Each transaction is denoted by T, as shown in Table 1.
TABLE 1 Marine environment transaction set
Transaction   Marine environment transaction items
T1            FS_1, ZDFS_2, FX_1, WENDU_1, PINPU_1
T2            FS_1, ZDFS_1, FX_1, WENDU_1, PINPU_1
T3            FS_1, ZDFS_1, FX_1, WENDU_1, PINPU_1
T4            FS_1, ZDFS_1, FX_1, WENDU_1, PINPU_1
T5            FS_2, ZDFS_2, FX_1, WENDU_1, PINPU_1
T6            FS_1, ZDFS_2, FX_1, WENDU_1, PINPU_1
T7            FS_1, ZDFS_1, FX_2, WENDU_3, PINPU_1
T8            FS_2, ZDFS_2, FX_2, WENDU_1, PINPU_1
T9            FS_1, ZDFS_2, FX_1, WENDU_1, PINPU_1
...           ...
T72           FS_3, ZDFS_3, FX_1, WENDU_3, PINPU_1
...           ...
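The graded characterization of step 2 can be sketched as a one-dimensional fuzzy C-means clustering followed by symbol assignment. This is a hedged sketch: the fuzzifier `m = 2`, the quantile-based initial centers, the stopping tolerance and the sample wind speeds are assumptions chosen for illustration, and the patent does not state how its clustering was configured.

```python
import numpy as np


def fuzzy_c_means_1d(x, n_clusters=3, m=2.0, n_iter=100, tol=1e-6):
    """Cluster 1-D data with fuzzy C-means; return grade index (0 = lowest) and sorted centers."""
    x = np.asarray(x, dtype=float)
    # Deterministic start (assumption): centers placed at evenly spaced quantiles.
    centers = np.quantile(x, np.linspace(0.0, 1.0, n_clusters + 2)[1:-1])

    def memberships(c):
        dist = np.abs(x[:, None] - c[None, :]) + 1e-12   # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        return inv / inv.sum(axis=1, keepdims=True)      # each row sums to 1

    for _ in range(n_iter):
        u_m = memberships(centers) ** m
        new_centers = (u_m * x[:, None]).sum(axis=0) / u_m.sum(axis=0)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break

    u = memberships(centers)
    order = np.argsort(centers)                 # rank clusters from low to high
    rank = np.empty(n_clusters, dtype=int)
    rank[order] = np.arange(n_clusters)
    return rank[np.argmax(u, axis=1)], centers[order]


def to_symbols(x, prefix):
    """Replace each value with a grade symbol such as FS_1, FS_2, FS_3."""
    grades, _ = fuzzy_c_means_1d(x)
    return [f"{prefix}_{g + 1}" for g in grades]


# Hypothetical wind speeds in m/s, inside the 0.1 to 3.3 m/s range given in the text.
wind_speed = [0.2, 0.3, 0.4, 1.6, 1.7, 1.8, 3.0, 3.1, 3.2]
fs_symbols = to_symbols(wind_speed, "FS")
```

Each value is assigned to the cluster in which it has the highest membership, and clusters are numbered by increasing center so that the suffix 1 always denotes the low grade. Applying the same function with the prefixes ZDFS, WENDU and PINPU would give the remaining columns of a transaction.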
3. Discretization is an important step before association rule mining: if the values in a data set are continuous, they must be discretized before mining. Marine environment data are mainly continuous, so discretization is required. The transaction set T in the marine environment database is mapped by checking whether each marine environment transaction contains each element of the item set. If the transaction contains the element, the corresponding entry is marked 1, and otherwise 0. This yields a 0-1 matrix covering all marine environment transaction items.
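The mapping in step 3 amounts to one-hot encoding each transaction against the full item list. A minimal sketch, assuming transactions T1, T2 and T7 from Table 1 and an alphabetical column order (the patent does not specify a column order, and the function name is illustrative):

```python
def to_binary_matrix(transactions):
    """Return the sorted item list and a 0-1 matrix with one row per transaction."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


# Transactions T1, T2 and T7 from Table 1.
transactions = [
    {"FS_1", "ZDFS_2", "FX_1", "WENDU_1", "PINPU_1"},
    {"FS_1", "ZDFS_1", "FX_1", "WENDU_1", "PINPU_1"},
    {"FS_1", "ZDFS_1", "FX_2", "WENDU_3", "PINPU_1"},
]
items, matrix = to_binary_matrix(transactions)
```

Each column corresponds to one item and each row to one transaction, so the support count of an item is simply its column sum.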
4. According to the minimum support and minimum confidence thresholds set by the user, the association between the discretized marine environment noise spectrum level and the hydrometeorological data is mined with the Apriori association rule mining method, as shown in FIG. 2. The specific flow is as follows:
(1) Set the minimum support min_sup = 0.2 and the minimum confidence min_conf = 0.5. Support and confidence are used to measure the degree of association between marine environment noise and hydrometeorology. The support is calculated as
$\mathrm{Support}(A) = P(A)$ (2)
Equation (2) represents the probability that data set A occurs in the marine environment transaction set. The confidence is calculated as
$\mathrm{Confidence}(A \Rightarrow B) = P(B \mid A)$ (3)
Equation (3) represents the conditional probability that data set B occurs given that data set A occurs, where $A = \{a_1, a_2, a_3, \ldots, a_n\}$ and $B = \{b_1, b_2, b_3, \ldots, b_n\}$ are element sets of marine environment noise and hydrometeorology.
(2) Scan all marine environment noise and hydrometeorological data, generate the candidate 1-item sets C1 and obtain the count of each candidate. Because the minimum support is set to 0.2, the items FX_3 and WENDU_3 are deleted from the candidate set C1. This determines the frequent 1-item sets L1, which consist of the candidate 1-item sets whose support is equal to or greater than the minimum support.
(3) To find the frequent 2-item sets L2, the algorithm combines the elements of L1 pairwise to produce the candidate 2-item sets C2, and then obtains L2 according to the minimum support.
(4) The elements of L2 are combined with each other to produce the candidate 3-item sets C3, from which L3 is obtained according to the minimum support.
(5) The algorithm combines the elements of L3 with each other to produce the candidate 4-item sets C4. L3 ⋈ L3 = {{FS_2, ZDFS_2, WENDU_1, PINPU_3}}. By the Apriori property, this item set is deleted because its subset {ZDFS_2, WENDU_1, PINPU_3} is not frequent. C4 is therefore empty, so the algorithm terminates, having found all frequent item sets. For the frequent set L3, the non-empty subsets generated are {FS_2}, {ZDFS_2}, {WENDU_1}, {PINPU_3}, {FS_2, ZDFS_2}, {FS_2, WENDU_1}, {FS_2, PINPU_3}, {ZDFS_2, WENDU_1}, {ZDFS_2, PINPU_3}, {WENDU_1, PINPU_3}.
(6) From the frequent item sets of marine environment data obtained in step (5), calculate the confidence for each non-empty subset and find the association rules that meet the minimum confidence threshold, namely
$\mathrm{Confidence}(A \Rightarrow B) = P(B \mid A) \ge \mathrm{min\_conf}$
Here A is the rule antecedent (front piece) and B is the rule consequent (back piece). The strong association rules and their confidences are shown in Table 2.
TABLE 2 Association rules between marine environment noise and hydrometeorology, with confidence
Rule antecedent (front piece)   Rule consequent (back piece)   Confidence
{FS_2, PINPU_3}                 ZDFS_2                         94.11%
{FS_2, WENDU_1}                 ZDFS_2                         93.75%
{ZDFS_2, PINPU_3}               FS_2                           88.89%
{ZDFS_2, WENDU_1}               FS_2                           75%
{FS_2, ZDFS_3}                  PINPU_3                        72.73%
{FS_2, ZDFS_3}                  WENDU_1                        68.18%
FS_2                            {ZDFS_2, PINPU_3}              66.67%
FS_2                            {ZDFS_2, WENDU_1}              62.5%
ZDFS_2                          {FS_2, PINPU_3}                50%
According to the association rules in Table 2, FS_2 and ZDFS_3 have a large influence on PINPU_3. That is, when the wind speed is between 1.5 and 2.1 (FS_2) and the maximum wind speed is between 2.9 and 4.9 (ZDFS_3), the marine environment noise spectrum level falls in the range 71.6117 to 76.0281 (level PINPU_3) with a probability of 72.73%. With these thresholds, the marine environment noise spectrum level can be predicted from the range in which the wind speed varies.
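The level-by-level search of steps (2) to (5) and the rule screening of step (6) can be sketched as follows. This is a minimal illustration, not the applicants' code: it assumes only the nine transaction rows T1 to T9 shown in Table 1, uses the thresholds min_sup = 0.2 and min_conf = 0.5 from step (1), and the function names are illustrative. Because the full data set is not available here, the output does not reproduce Table 2 (for instance, every one of these nine rows contains PINPU_1).

```python
from itertools import combinations

# Rows T1 to T9 of Table 1.
ROWS = [
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_1,WENDU_1,PINPU_1",
    "FS_2,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
    "FS_1,ZDFS_1,FX_2,WENDU_3,PINPU_1",
    "FS_2,ZDFS_2,FX_2,WENDU_1,PINPU_1",
    "FS_1,ZDFS_2,FX_1,WENDU_1,PINPU_1",
]
transactions = [frozenset(row.split(",")) for row in ROWS]


def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_sup):
    """Level-wise search: candidates C_k are built from L_(k-1), then pruned."""
    items = sorted({i for t in transactions for i in t})
    level = {frozenset([i]) for i in items}          # candidate 1-item sets C1
    frequent = {}
    k = 1
    while level:
        kept = {c: support(c, transactions) for c in level}
        kept = {c: s for c, s in kept.items() if s >= min_sup}   # L_k
        frequent.update(kept)
        k += 1
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        # Apriori property: drop a candidate if any (k-1)-subset is not frequent.
        level = {c for c in joined
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))}
    return frequent


def strong_rules(frequent, min_conf):
    """Rules A => B with confidence support(A and B) / support(A) >= min_conf."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = sup / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return sorted(rules, key=lambda rule: -rule[2])


frequent = frequent_itemsets(transactions, min_sup=0.2)
rules = strong_rules(frequent, min_conf=0.5)
```

On these nine rows WENDU_3 occurs only once and is pruned at the first level, and the rule FS_2 => ZDFS_2 is kept with confidence 1 while ZDFS_2 => FS_2 (confidence 2/5) is screened out. Looking up `frequent[antecedent]` is safe because every subset of a frequent item set is itself frequent.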

Claims (1)

1. An Apriori-based method for mining association relationships between marine environment noise and hydrometeorology, characterized by comprising the following steps:
(1) continuously acquiring a large amount of marine environment noise data and synchronously acquiring hydrometeorological data such as wind speed and temperature, cutting the acquired marine environment noise time-domain signal into a number of segments according to a fixed time rule, performing FFT analysis on the preprocessed marine environment noise signal, and calculating the marine environment noise spectrum level;
(2) characterizing the marine environment noise spectrum level, the wind speed and the temperature data by grade using a fuzzy C-means clustering method, namely dividing the data into several intervals and replacing every value falling in a given interval with one specific symbol, and then obtaining a marine environment transaction set T from the graded marine environment data;
(3) mapping the marine environment transaction set T obtained in step (2), wherein the mapping checks whether each marine environment transaction contains each element of the item set, marking the corresponding entry 1 if it does and 0 if it does not, so as to obtain a 0-1 matrix covering all marine environment data items;
(4) mining the 0-1 matrix of marine environment data obtained in step (3) with the Apriori algorithm to obtain the association rules between marine environment noise and hydrometeorology;
(5) screening and analyzing the obtained association rules according to the minimum support and minimum confidence thresholds set by the user, to find the strong association relationships between marine environment noise and hydrometeorology.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010977512.9A CN114153890A (en) 2020-09-07 2020-09-07 Marine environment noise and hydrological meteorological association relation mining method based on Apriori

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010977512.9A CN114153890A (en) 2020-09-07 2020-09-07 Marine environment noise and hydrological meteorological association relation mining method based on Apriori

Publications (1)

Publication Number Publication Date
CN114153890A true CN114153890A (en) 2022-03-08

Family

ID=80462215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010977512.9A Pending CN114153890A (en) 2020-09-07 2020-09-07 Marine environment noise and hydrological meteorological association relation mining method based on Apriori

Country Status (1)

Country Link
CN (1) CN114153890A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023202008A1 (en) * 2022-04-19 2023-10-26 中国科学院声学研究所 Marine environment noise forecasting method, computer device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220308