CN109709604B - Method for selecting seismic event correlation detection algorithm with low error cost


Info

Publication number
CN109709604B
CN109709604B CN201811506637.2A CN201811506637A CN109709604B CN 109709604 B CN109709604 B CN 109709604B CN 201811506637 A CN201811506637 A CN 201811506637A CN 109709604 B CN109709604 B CN 109709604B
Authority
CN
China
Prior art keywords
algorithm
event
detection
errors
ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811506637.2A
Other languages
Chinese (zh)
Other versions
CN109709604A (en)
Inventor
李健
王晓明
商杰
邱宏茂
刘哲函
王娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctbt Beijing National Data Centre
Original Assignee
Ctbt Beijing National Data Centre
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctbt Beijing National Data Centre
Priority to CN201811506637.2A
Publication of CN109709604A
Application granted
Publication of CN109709604B
Legal status: Active
Anticipated expiration

Abstract

The invention provides a method for selecting a seismic event correlation detection algorithm with lower error cost, which belongs to the field of detection algorithm performance evaluation.

Description

Method for selecting seismic event correlation detection algorithm with low error cost
Technical Field
The invention belongs to the field of performance evaluation of detection algorithms, and can be used for performance evaluation of automatic detection algorithms of seismic events.
Background
Seismic event detection is the process of inverting the signals recorded by monitoring stations, together with their characteristics, to form events. It generally comprises station signal detection, seismic phase identification, and the association and location of multiple seismic phases. Seismic event detection can be viewed as a binary classification problem, and any detection system makes two kinds of error: missed detections and false detections. Reducing missed detections makes the system more sensitive but can increase false detections. Conversely, reducing false detections lowers system sensitivity and increases the risk of missed detections. Various detection algorithms have been proposed for seismic event detection, including the GA algorithm based on global grid points and the NET-VISA method based on a Bayesian probability model, so a way of selecting the seismic event correlation detection algorithm with low error cost is needed.
Judging which algorithm is better and more suitable for seismic event detection requires comparing and evaluating the performance of the detection algorithms.
Many studies of detection algorithm optimization use ROC curves for performance evaluation. An ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) as the sensitivity of the algorithm varies. Calculating the false positive rate, however, requires the number of true negatives. For a seismic event detection algorithm, that means counting every false event that could be formed from the possible combinations of detections at all stations each day, which is difficult to calculate.
In fields such as data retrieval, precision and recall are generally used to evaluate algorithm performance. Precision is the proportion of returned data that is relevant to the query, and recall is the proportion of the information of interest to the user that is retrieved. Algorithm performance is then evaluated by drawing a P-R curve, finding the break-even point, calculating the harmonic mean, and so on. Because calculating the recall ratio and the precision ratio does not require the number of true negatives, they can serve as performance evaluation indices for a seismic event detection algorithm. However, the seismic event detection process is complex and has many parameters, so the P-R curve of an algorithm is difficult to draw, and the usual P-R curve, break-even point and harmonic mean methods are not suitable for measuring its performance. Designing an evaluation method and indices that reflect algorithm performance more reasonably, intuitively and reliably is therefore an urgent need in research on optimizing seismic event detection algorithms.
Disclosure of Invention
The invention aims to provide a method for selecting an automatic seismic event correlation detection algorithm with better performance. Seismic events are detected with several automatic seismic event correlation detection algorithms, the seismic event detection results of each method are judged and evaluated, the recall ratio and precision ratio of each method are determined, and the detection results are compared quantitatively against cost function curves in the two-dimensional space of recall ratio and precision ratio to obtain the seismic event correlation detection method with lower error cost.
The technical scheme of the invention is as follows: a method for selecting a seismic event correlation detection algorithm with low error cost, comprising the following steps:
1) acquiring signals acquired by a plurality of seismograph sensors in various regions of the world;
2) outputting signals acquired by a seismograph sensor to a seismic event detection system, wherein the seismic event detection system adopts an automatic seismic event detection algorithm to detect the signals, identify seismic phases, associate a plurality of seismic phases and position seismic events to obtain seismic event information;
3) the seismic event detection system applies each of several other automatic seismic event detection algorithms to obtain the seismic event information corresponding to each algorithm;
4) for the event bulletins produced by the various automatic seismic event detection algorithms, the daily recall ratio and the daily precision ratio of each detection algorithm are calculated based on the defined event matching rule, taking the reference event bulletin as the standard;
5) establishing a precision ratio-recall ratio two-dimensional coordinate space, referred to as the P-R space for short, with the precision ratio as the vertical axis and the recall ratio as the horizontal axis, and plotting the calculated daily recall ratio and daily precision ratio in the P-R space;
6) drawing cost function curves in the P-R space to allow visual comparison of the results of the various algorithms;
the specific contents are as follows:
6.1 the cost function relationship is established as follows:
$C_\alpha = \alpha(1-R) + \dfrac{R(1-P)}{P}$
wherein $C_\alpha$ is the cost metric value of the cost function, with the precision ratio P and the recall ratio R as variables; the weight value α of the weight parameter of the function is determined;
6.2 drawing cost function curves in the P-R space: for each of several fixed values of the cost metric $C_\alpha$, drawing the curve of the corresponding precision ratio P against the recall ratio R;
6.3 locating the daily recall ratio and daily precision ratio results of each algorithm relative to the cost function curves in the P-R space; the smaller the cost metric $C_\alpha$ corresponding to that position, the better the algorithm performance.
Preferably, if the cost metric value $C_\alpha$ corresponding to the cost function curve of an algorithm is smaller and the weight value is chosen greater than 1, the algorithm makes fewer type I errors; if the cost metric value $C_\alpha$ corresponding to the cost function curve of an algorithm is smaller and the weight value is chosen less than 1, the algorithm makes fewer type II errors. Type I errors are missed detection events, and type II errors are false detection events.
Preferably, different weight values are selected for type I and type II errors: a weight value greater than 1 emphasizes having fewer type I errors, and a weight value less than 1 emphasizes having fewer type II errors; this determines the weight value of the weight parameter of the cost function. Type I errors are missed detection events, and type II errors are false detection events.
Preferably, the defined event matching criteria are: arrival time difference < 30 s and position difference < 2 degrees.
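The matching criteria above can be sketched in code. The following is a minimal illustration, not the patented implementation: the `Event` record, its field names, the haversine distance and the greedy one-to-one pairing are all assumptions introduced here, since the text only states the two thresholds.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event record; the field names are assumptions."""
    time_s: float   # time used for matching, in seconds since some epoch
    lat_deg: float  # latitude in degrees
    lon_deg: float  # longitude in degrees


def angular_distance_deg(a: Event, b: Event) -> float:
    """Great-circle separation of two events in degrees (haversine formula)."""
    lat1, lat2 = math.radians(a.lat_deg), math.radians(b.lat_deg)
    dlat = lat2 - lat1
    dlon = math.radians(b.lon_deg - a.lon_deg)
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def is_match(detected: Event, reference: Event,
             max_dt_s: float = 30.0, max_dist_deg: float = 2.0) -> bool:
    """Apply the two thresholds named in the text: < 30 s and < 2 degrees."""
    return (abs(detected.time_s - reference.time_s) < max_dt_s
            and angular_distance_deg(detected, reference) < max_dist_deg)


def count_matches(detected: list, reference: list) -> tuple:
    """Return (TP, FP, FN) using greedy one-to-one matching (an assumption)."""
    unmatched = list(reference)
    tp = 0
    for d in detected:
        for r in unmatched:
            if is_match(d, r):
                unmatched.remove(r)  # each reference event is matched at most once
                tp += 1
                break
    return tp, len(detected) - tp, len(unmatched)
```

Greedy pairing is the simplest choice; when several detections compete for one reference event, an optimal assignment could give different counts.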
The invention has the beneficial effects that:
The method solves the performance evaluation problem for seismic data processing algorithms and is simple, intuitive, comprehensive and reliable. The detection cost of an algorithm is measured with weighted cost function curves, which can reflect different emphases on missed detection events and false detection events. A weighted cost function is defined for evaluating seismic detection algorithms, cost function curves are drawn in the P-R two-dimensional space, and the cost curve values at the daily recall ratio and daily precision ratio of the detection results visually reflect the performance of the seismic detection algorithm.
Drawings
FIG. 1 is a graph illustrating a cost function curve in a P-R two-dimensional space;
FIG. 2 is a diagram illustrating a cost function curve comparison according to an embodiment of the present invention;
Detailed Description
The invention is further illustrated by the following figures and examples.
A method for selecting a seismic event correlation detection algorithm with low error cost comprises the following steps:
1) acquiring signals acquired by a plurality of seismograph sensors in various regions of the world;
2) outputting signals acquired by a seismograph sensor to a seismic event detection system, wherein the seismic event detection system adopts an automatic seismic event detection algorithm to detect the signals, identify seismic phases, associate a plurality of seismic phases and position seismic events to obtain seismic event information;
3) the seismic event detection system applies each of several other automatic seismic event detection algorithms to obtain the seismic event information corresponding to each algorithm;
4) for the event bulletins produced by the various automatic seismic event detection algorithms, the daily recall ratio and the daily precision ratio of each detection algorithm are calculated based on the defined event matching rule, taking the reference event bulletin as the standard;
5) establishing a precision ratio-recall ratio two-dimensional coordinate space, referred to as the P-R space for short, with the precision ratio as the vertical axis and the recall ratio as the horizontal axis, and plotting the calculated daily recall ratio and daily precision ratio in the P-R space;
6) drawing cost function curves in the P-R space to allow visual comparison of the results of the various algorithms.
The specific implementation is as follows:
6.1 A cost function is designed to evaluate the detection performance of the algorithm. The cost function is defined as the type I errors (missed events) and type II errors (falsely detected events) that may be introduced for each true event detected, that is, the ratio of the sum of the weighted number of type I error events and the number of type II error events to the total number of reference events.
The formula is as follows:
$C_\alpha = \dfrac{\alpha \times (\text{number of type I error events}) + (\text{number of type II error events})}{\text{total number of reference events}} = \alpha(1-R) + \dfrac{R(1-P)}{P}$
The weight value α of the weight parameter of the formula is determined. (In the formula, different weights are selected for type I and type II errors: a weight greater than 1 emphasizes having fewer type I errors, and a weight less than 1 emphasizes having fewer type II errors; this determines the weight parameter of the cost function.)
With $C_\alpha$ as the function value and the precision ratio P and the recall ratio R as variables, the functional relationship between $C_\alpha$ and P, R is established, where $C_\alpha$ is referred to as the cost function;
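The step from the count form of the cost function to the form in P and R can be written out as follows. This is a sketch of the algebra only; it assumes that type I errors are counted as false negatives (FN), type II errors as false positives (FP), and that the total number of reference events is TP + FN, consistent with the definitions given later in the description.

```latex
\begin{aligned}
C_\alpha &= \frac{\alpha\,\mathrm{FN} + \mathrm{FP}}{\mathrm{TP}+\mathrm{FN}}, \qquad
R = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}, \quad
P = \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}} \\
\frac{\mathrm{FN}}{\mathrm{TP}+\mathrm{FN}} &= 1 - R, \qquad
\frac{\mathrm{FP}}{\mathrm{TP}+\mathrm{FN}}
  = \frac{\mathrm{FP}}{\mathrm{TP}}\cdot\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}
  = \frac{1-P}{P}\,R \\
C_\alpha &= \alpha(1-R) + \frac{R(1-P)}{P}
\end{aligned}
```

The second line uses FP/TP = 1/P − 1, which follows directly from the definition of P.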
6.2 Cost function curves are drawn in the P-R space as contour lines: for each of several fixed values of the cost metric, the curve relating P and R is plotted;
6.3 The daily recall ratio and precision ratio results of each algorithm are located relative to the cost function curves in the P-R space. If the cost value $C_\alpha$ of the cost function curve of an algorithm is smaller and the weight chosen in step 6.1 is greater than 1, the algorithm makes fewer type I errors; if the cost value $C_\alpha$ of the cost function curve of an algorithm is smaller and the weight chosen in step 6.1 is less than 1, the algorithm makes fewer type II errors.
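One way to obtain the points of a single contour line is to solve the cost function for P at fixed R, which gives P = R / (C − α(1 − R) + R). The sketch below illustrates this; the function names and the sampling grid are assumptions, and plotting is left out so that the block needs no third-party library.

```python
from typing import List, Optional, Tuple


def precision_on_contour(recall: float, cost: float, alpha: float) -> Optional[float]:
    """Solve C = alpha*(1-R) + R*(1-P)/P for P at a given R.

    Returns None where no precision in (0, 1] yields the requested cost.
    """
    if not 0.0 < recall <= 1.0:
        return None
    denom = cost - alpha * (1.0 - recall) + recall
    if denom < recall:  # the solution would require P > 1
        return None
    return recall / denom


def contour(cost: float, alpha: float, steps: int = 100) -> List[Tuple[float, float]]:
    """Sample (R, P) points along one iso-cost curve in the P-R space."""
    points = []
    for i in range(1, steps + 1):
        r = i / steps
        p = precision_on_contour(r, cost, alpha)
        if p is not None:
            points.append((r, p))
    return points
```

Each curve exists only where α(1 − R) ≤ C, so with a large α the low-cost contours are confined to the high-recall side of the P-R space.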
The dots in FIG. 1 represent the results of a hypothetical method one, and the pentagons represent the results of a hypothetical method two; whichever method lies on the curve with the smaller cost value performs better.
Steps 4), 5), 6) are specifically defined as follows:
4) For the event bulletin generated by a seismic event detection algorithm, the daily recall ratio and daily precision ratio of the algorithm are calculated with one day as the unit of time, taking the reference event bulletin as the standard and applying the defined event matching criteria (for example, arrival time difference < 30 s and position difference < 2 degrees). Precision is defined as the probability that a sample predicted as positive is in fact positive; recall is defined as the probability that a positive sample is correctly predicted. The daily precision and daily recall are calculated as follows:
$P = \dfrac{TP}{TP + FP}$
$R = \dfrac{TP}{TP + FN}$
where TP (true positives) is the number of true events detected each day, FP (false positives) is the number of false events detected each day, and FN (false negatives) is the number of true events missed each day.
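The daily precision and recall follow directly from the three counts, with P = TP / (TP + FP) and R = TP / (TP + FN). A minimal sketch is shown below; returning `None` for a day on which a ratio is undefined (no detections, or no reference events) is an assumption made here, as the text does not say how such days are handled.

```python
from typing import Optional, Tuple


def daily_precision_recall(tp: int, fp: int, fn: int) -> Tuple[Optional[float], Optional[float]]:
    """Return (precision, recall) for one day; None where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else None  # share of detections that are real
    recall = tp / (tp + fn) if (tp + fn) > 0 else None     # share of reference events found
    return precision, recall
```

For a day with 8 matched events, 2 false events and 2 missed events, both ratios equal 0.8.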
5) A precision-recall two-dimensional space, namely the P-R space, is constructed with precision as the vertical axis and recall as the horizontal axis, and the calculated daily recall and precision are plotted in the P-R space.
6) A cost function is designed to evaluate the detection performance of the algorithm. The cost function is defined as the type I errors (missed events) and type II errors (falsely detected events) that may be introduced for each true event detected. Different weights are assigned to type I and type II errors to reflect the emphasis of the system. The cost function is defined as:
$C_\alpha = \dfrac{\alpha \times (\text{number of type I error events}) + (\text{number of type II error events})}{\text{total number of reference events}} = \alpha(1-R) + \dfrac{R(1-P)}{P}$
The cost function is plotted in the P-R diagram according to this formula. Each curve represents one cost value $C_\alpha$, and α is the weight value: α > 1 indicates that type I errors are more important than type II errors, and α < 1 the reverse.
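The two forms of the cost function, one in event counts and one in P and R, can be checked against each other numerically. The sketch below is illustrative only; the function and parameter names are assumptions introduced here.

```python
def cost_from_counts(n_missed: int, n_false: int, n_reference: int, alpha: float) -> float:
    """Weighted error cost per reference event: (alpha * type I + type II) / reference total."""
    if n_reference <= 0:
        raise ValueError("need at least one reference event")
    return (alpha * n_missed + n_false) / n_reference


def cost_from_pr(precision: float, recall: float, alpha: float) -> float:
    """The same cost written in terms of precision P and recall R."""
    if precision <= 0.0:
        raise ValueError("precision must be positive")
    return alpha * (1.0 - recall) + recall * (1.0 - precision) / precision
```

For a day with 10 reference events, 2 missed events and 2 false events (so P = R = 0.8), both forms give 2.2 with α = 10 and 0.3 with α = 0.5, which shows how the weight shifts the penalty toward missed detections.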
The principle is that a smaller cost value is better: the daily recall ratio and daily precision ratio results of each algorithm are located relative to the cost function curves in the P-R space, and the smaller the corresponding cost value $C_\alpha$, the better the algorithm performance.
Intuitively, as the figure shows, the closer a result is to the upper right corner of the P-R graph, the smaller the corresponding $C_\alpha$.
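The "upper right corner" intuition can be checked by differentiating the cost function. This is a short worked derivation added for illustration, not part of the original disclosure:

```latex
\frac{\partial C_\alpha}{\partial P} = -\frac{R}{P^{2}} < 0 \quad (R > 0), \qquad
\frac{\partial C_\alpha}{\partial R} = -\alpha + \frac{1-P}{P} < 0
\iff P > \frac{1}{1+\alpha}
```

The cost therefore always falls as precision rises, and it falls as recall rises whenever P exceeds 1/(1 + α). For α = 10 this threshold is 1/11, about 0.09, so moving toward the upper right lowers the cost over almost the entire P-R space.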
The basic idea of the invention is to calculate the recall ratio and precision ratio index of the algorithm, draw a cost function curve in a P-R two-dimensional space diagram in a contour line mode, and quantitatively and intuitively realize the performance evaluation of the algorithm by comparing the position of the result of the algorithm on the cost function curve.
For example, two seismic event correlation algorithms are evaluated with the method: the NET-VISA method based on a Bayesian model and the GA method based on global grid point association. Both methods process offline one year of data from the International Monitoring System (IMS) of the Comprehensive Nuclear-Test-Ban Treaty, and the two algorithms are compared and evaluated with the Reviewed Event Bulletin (REB), produced by manual review at the International Data Center (IDC), as the reference events.
First, the daily precision and recall of each algorithm over the year are calculated with the defined event matching criteria (arrival time difference < 30 s; position difference < 2 degrees). With the precision ratio as the ordinate and the recall ratio as the abscissa, the results are plotted in the P-R two-dimensional space, as shown in FIG. 2.
In FIG. 2, the five-pointed stars represent the daily recall ratio and precision ratio of the NET-VISA algorithm, the dots represent the daily recall ratio and precision ratio of the GA algorithm, and the dot size represents the number of daily reference events. Quantitative comparison in the figure shows that the NET-VISA algorithm generally has lower cost curve values than GA.
The weight value of the weighted cost function is α = 10, so the evaluation focuses on a low missed-detection rate. The cost function curves are drawn in the P-R graph as contour lines, and the recall and precision results of each algorithm are compared by the cost curve values they fall on. FIG. 2 shows intuitively that the NET-VISA results have lower cost values (they generally lie to the upper right of the GA results), so NET-VISA performs better than GA overall. The method thus gives a comprehensive, intuitive and reliable evaluation of seismic event correlation algorithms.
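The comparison in this embodiment is made visually on FIG. 2. As a purely numerical companion, the sketch below averages the daily cost of two algorithms at α = 10. Averaging over days is an assumption introduced here, not a step of the described method, and the daily (precision, recall) pairs are made-up illustrative values, not results for NET-VISA or GA.

```python
from statistics import mean

ALPHA = 10.0  # weight stated for the embodiment; emphasizes missed detections


def cost(precision: float, recall: float, alpha: float = ALPHA) -> float:
    """Cost function C = alpha*(1-R) + R*(1-P)/P for one day."""
    return alpha * (1.0 - recall) + recall * (1.0 - precision) / precision


def mean_daily_cost(daily_pr, alpha: float = ALPHA) -> float:
    """Average the daily cost over a list of (precision, recall) pairs (an assumed summary)."""
    return mean(cost(p, r, alpha) for p, r in daily_pr)


# Hypothetical daily (precision, recall) pairs for two unnamed algorithms.
algo_a = [(0.50, 0.90), (0.40, 0.80)]
algo_b = [(0.60, 0.70), (0.50, 0.60)]

# The algorithm with the lower mean daily cost is preferred under this summary.
better = "A" if mean_daily_cost(algo_a) < mean_daily_cost(algo_b) else "B"
```

With these made-up values, algorithm A has lower precision but higher recall, and it comes out ahead because α = 10 weights each missed event ten times as heavily as a false one.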

Claims (4)

1. A method for selecting a seismic event correlation detection algorithm with a low error cost, characterized by comprising the following steps:
1) acquiring signals acquired by a plurality of seismograph sensors in various regions of the world;
2) outputting signals acquired by a seismograph sensor to a seismic event detection system, wherein the seismic event detection system adopts an automatic seismic event detection algorithm to detect the signals, identify seismic phases, associate a plurality of seismic phases and position seismic events to obtain seismic event information;
3) the seismic event detection system applies each of several other automatic seismic event detection algorithms to obtain the seismic event information corresponding to each algorithm;
4) for the event bulletins produced by the various automatic seismic event detection algorithms, the daily recall ratio and the daily precision ratio of each detection algorithm are calculated based on the defined event matching rule, taking the reference event bulletin as the standard;
5) establishing a precision ratio-recall ratio two-dimensional coordinate space, referred to as the P-R space for short, with the precision ratio as the vertical axis and the recall ratio as the horizontal axis, and plotting the calculated daily recall ratio and daily precision ratio in the P-R space;
6) drawing a cost function curve in a P-R space to realize the visual comparison of various algorithm results;
the concrete contents are as follows:
6.1 the cost function relationship is established as follows:
$C_\alpha = \alpha(1-R) + \dfrac{R(1-P)}{P}$
wherein $C_\alpha$ is the cost metric value of the cost function, with the precision ratio P and the recall ratio R as variables; the weight value α of the weight parameter of the function is determined;
6.2 drawing cost function curves in the P-R space: for each of several fixed values of the cost metric $C_\alpha$, drawing the curve of the corresponding precision ratio P against the recall ratio R;
6.3 locating the daily recall ratio and daily precision ratio results of each algorithm relative to the cost function curves in the P-R space; the smaller the cost metric $C_\alpha$ corresponding to that position, the better the algorithm performance;
wherein, the precision ratio is defined as the probability of correct prediction in the sample predicted as a positive example; recall is defined as the probability that the positive example sample is predicted to be correct.
2. The method of selecting a seismic event correlation detection algorithm with a low error cost of claim 1, wherein: if the cost metric value $C_\alpha$ corresponding to the cost function curve of an algorithm is smaller and the weight value is chosen greater than 1, the algorithm makes fewer type I errors; if the cost metric value $C_\alpha$ corresponding to the cost function curve of an algorithm is smaller and the weight value is chosen less than 1, the algorithm makes fewer type II errors; type I errors are missed detection events, and type II errors are false detection events.
3. The method of selecting a seismic event correlation detection algorithm with a low error cost of claim 1, wherein: different weight values are selected for type I and type II errors, a weight value greater than 1 emphasizing fewer type I errors and a weight value less than 1 emphasizing fewer type II errors, thereby determining the weight value of the weight parameter of the cost function; type I errors are missed detection events, and type II errors are false detection events.
4. The method of selecting a seismic event correlation detection algorithm with a low error cost of claim 1, wherein: the defined event matching criteria are: arrival time difference < 30 s and position difference < 2 degrees.
CN201811506637.2A 2018-12-10 2018-12-10 Method for selecting seismic event correlation detection algorithm with low error cost Active CN109709604B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811506637.2A CN109709604B (en) 2018-12-10 2018-12-10 Method for selecting seismic event correlation detection algorithm with low error cost

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811506637.2A CN109709604B (en) 2018-12-10 2018-12-10 Method for selecting seismic event correlation detection algorithm with low error cost

Publications (2)

Publication Number Publication Date
CN109709604A CN109709604A (en) 2019-05-03
CN109709604B CN109709604B (en) 2020-11-06

Family

ID=66256275

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811506637.2A Active CN109709604B (en) 2018-12-10 2018-12-10 Method for selecting seismic event correlation detection algorithm with low error cost

Country Status (1)

Country Link
CN (1) CN109709604B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant