CN113219499A - Position time series abnormity detection method and device and computer storage medium - Google Patents

Position time series abnormity detection method and device and computer storage medium Download PDF

Info

Publication number
CN113219499A
CN113219499A CN202110373143.7A CN202110373143A CN113219499A CN 113219499 A CN113219499 A CN 113219499A CN 202110373143 A CN202110373143 A CN 202110373143A CN 113219499 A CN113219499 A CN 113219499A
Authority
CN
China
Prior art keywords
data
position time
gnss position
local
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110373143.7A
Other languages
Chinese (zh)
Inventor
舒颖
闵阳
曹成度
滕焕乐
吴石军
李威
贺小星
马龙
郑跃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Railway Siyuan Survey and Design Group Co Ltd
Original Assignee
China Railway Siyuan Survey and Design Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Railway Siyuan Survey and Design Group Co Ltd filed Critical China Railway Siyuan Survey and Design Group Co Ltd
Priority to CN202110373143.7A priority Critical patent/CN113219499A/en
Publication of CN113219499A publication Critical patent/CN113219499A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/01Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/03Cooperating elements; Interaction or communication between different cooperating elements or between cooperating elements and receivers
    • G01S19/07Cooperating elements; Interaction or communication between different cooperating elements or between cooperating elements and receivers providing data for correcting measured positioning data, e.g. DGPS [differential GPS] or ionosphere corrections
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/01Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/13Receivers
    • G01S19/24Acquisition or tracking or demodulation of signals transmitted by the system
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/01Satellite radio beacon positioning systems transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/13Receivers
    • G01S19/24Acquisition or tracking or demodulation of signals transmitted by the system
    • G01S19/25Acquisition or tracking or demodulation of signals transmitted by the system involving aiding data received from a cooperating element, e.g. assisted GPS
    • G01S19/256Acquisition or tracking or demodulation of signals transmitted by the system involving aiding data received from a cooperating element, e.g. assisted GPS relating to timing, e.g. time of week, code phase, timing offset
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/40Correcting position, velocity or attitude

Abstract

The invention discloses an anomaly detection method, an anomaly detection device and a computer storage medium for a Global Navigation Satellite System (GNSS) position time sequence, wherein the method comprises the steps of performing gross error detection processing on an original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information; performing interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence; performing local anomaly detection processing on the second GNSS position time sequence based on preset parameters, and determining a local anomaly factor of first data in the second GNSS position time sequence; the first data is any data in the second GNSS position time series; determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data.

Description

Position time series abnormity detection method and device and computer storage medium
Technical Field
The present invention relates to the technical field of Global Navigation Satellite System (GNSS) data verification, and in particular, to a method and an apparatus for detecting an anomaly in a GNSS position time sequence, and a computer storage medium.
Background
The GNSS position time sequence contains abundant physical information, such as long-term linear change trend of tectonic stress in an observation station affected area, and nonlinear change caused by geophysical effect and other external environments. Under various errors such as ionospheric delay, multipath effect, environmental load, orbit anomaly, etc., the GNSS observation station makes the GNSS position time series exhibit nonlinear variation and also has an unavoidable abnormal value (data in the GNSS position time series). The abnormal values enable the GNSS position time sequence to show large variation fluctuation in the whole or local data, the variation trend of the GNSS position time sequence cannot be accurately reflected, and large negative influence is generated on data analysis and application in the GNSS position time sequence. Therefore, the detection of the outlier of the GNSS position time series is an important step in data processing, and most of the research at present is to detect the outlier of the global data of the GNSS position time series, but the detection effect of the outlier of the local data of the GNSS position time series is not ideal.
Disclosure of Invention
In view of the above, the main objective of the present invention is to provide a method, an apparatus, and a computer storage medium for detecting an anomaly of a GNSS position time sequence, which can well detect a sudden change of local data of the GNSS position time sequence, obtain a more accurate GNSS position time sequence, and facilitate obtaining a higher-precision modeling, so as to better study a potential change rule of the GNSS position time sequence, and have significant economic and social benefits.
In order to achieve the purpose, the technical scheme of the invention is realized as follows:
in a first aspect, the present invention provides a method for anomaly detection of a GNSS position time series of a global navigation satellite system, the method comprising:
performing gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
performing interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence;
performing local anomaly detection processing on the second GNSS position time sequence based on preset parameters, and determining a local anomaly factor of first data in the second GNSS position time sequence; the first data is any data in the second GNSS position time series;
determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data.
In the foregoing solution, the performing a coarse detection process on an original GNSS position time sequence to obtain a first GNSS position time sequence includes:
determining gross error abnormal data; the gross error abnormal data is data which is not in a set range in the time sequence of the original GNSS position;
and removing the gross error abnormal data from the original GNSS position time sequence to obtain the first GNSS position time sequence.
In the foregoing solution, the determining gross error abnormal data includes:
determining a first quartile and a third quartile in the raw GNSS position time series; determining a quartile distance corresponding to the original GNSS position time sequence according to the first quartile and the third quartile;
determining the gross error abnormal data based on the first quartile, the third quartile and the quartile distance;
the first quartile is data in the 25 th percent after data in the original GNSS position time sequence are sequenced from small to large; the third quartile is the data in 75% after the data in the original GNSS position time sequence are sequenced from small to large; the quartile distance is a difference between the third quartile and the first quartile.
In the foregoing solution, the determining the gross error abnormal data based on the first quartile, the third quartile and the quartile range includes:
determining a first critical value based on the first quartile and the quartile range, and determining a second critical value based on the third quartile; the first critical value is not greater than the second critical value;
determining the gross error anomaly data based on the first critical value and the second critical value; wherein the gross error anomaly data is data in the raw GNSS location time series that is not between the first threshold and the second threshold.
In the foregoing solution, the performing local anomaly detection processing on the second GNSS position time series based on preset parameters to determine a local anomaly factor of the first data in the second GNSS position time series includes:
determining a kth distance neighborhood of the first data based on a preset parameter; k is the preset parameter;
determining a plurality of second data within the k-th distance neighborhood; the second data is any data in the second GNSS position time series;
determining local reachable density corresponding to the first data and determining local reachable density corresponding to each second data;
and determining a local abnormal factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each second data.
In the foregoing solution, the determining the local reachable density corresponding to the first data includes:
determining the reachable distance of the first data and each second data;
determining the quotient of the sum of each of the reachable distances and the number of second data in the k-th distance neighborhood; and the reciprocal of the quotient of the sum of each reachable distance and the number of second data in the k-th distance neighborhood is the local reachable density corresponding to the first data.
In the foregoing solution, the determining a local anomaly factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each of the second data includes:
determining the sum of the ratio of the local reachable density corresponding to each second data to the local reachable density corresponding to the first data;
obtaining the ratio of the sum of the local reachable density ratios to the number of second data in the k-th distance neighborhood; the ratio is a local anomaly factor of the first data.
In the foregoing solution, the determining local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data includes:
when the original GNSS position time sequence meets a preset condition, determining first data corresponding to N local abnormal factors; the first data corresponding to the N local anomaly factors are local anomaly data corresponding to the second GNSS position time sequence; wherein each local anomaly factor of the N local anomaly factors is not less than the local anomaly factor corresponding to other data in the second GNSS location time series; n is a positive integer and is less than the number of data contained in the second GNSS location time series;
alternatively, the determining local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data comprises:
when the original GNSS position time sequence does not meet the preset condition, determining the number of local abnormal factors to be acquired according to a set percentage;
determining first data corresponding to the local abnormal factor; the number of the first data corresponds to the number of the local anomaly factors; the number of first data is local anomaly data in the second GNSS position time series; each local anomaly factor is not smaller than the local anomaly factor corresponding to other data in the second GNSS position time sequence; the set percentage is not greater than 5%.
In the above aspect, the method further includes:
removing the local abnormal data in the second GNSS position time sequence to obtain a third GNSS position time sequence;
and carrying out interpolation processing on the third GNSS position time sequence to obtain a target GNSS position time sequence.
In a second aspect, the present invention further provides an apparatus for anomaly detection of GNSS position time series of global navigation satellite systems, the apparatus comprising: a first obtaining unit, a second obtaining unit, a first determining unit, and a second determining unit, wherein,
the first obtaining unit is configured to perform gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
the second obtaining unit is configured to perform interpolation processing on the first GNSS position time series to obtain a second GNSS position time series;
the first determining unit is configured to perform local anomaly detection processing on the second GNSS position time series based on preset parameters, and determine a local anomaly factor of first data in the second GNSS position time series; the first data is any data in the second GNSS position time series;
the second determining unit is configured to determine local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data.
In the foregoing aspect, the first obtaining unit includes: the device comprises a first obtaining module and a second obtaining module, wherein the first obtaining module is used for determining gross error abnormal data; the gross error abnormal data is data which is not in a set range in the time sequence of the original GNSS position; the second obtaining module is configured to remove the gross error abnormal data from the original GNSS position time sequence to obtain the first GNSS position time sequence.
In the above scheme, the first obtaining module is specifically configured to determine a first quartile and a third quartile in the time series of the original GNSS position; determining a quartile distance corresponding to the original GNSS position time sequence according to the first quartile and the third quartile; determining the gross error abnormal data based on the first quartile, the third quartile and the quartile distance; the first quartile is data in the 25 th percent after data in the original GNSS position time sequence are sequenced from small to large; the third quartile is the data in 75% after the data in the original GNSS position time sequence are sequenced from small to large; the quartile distance is a difference between the third quartile and the first quartile.
In the above solution, the first obtaining module is further configured to determine a first critical value based on the first quartile and the quartile distance, and determine a second critical value based on the third quartile; the first critical value is not greater than the second critical value;
determining the gross error anomaly data based on the first critical value and the second critical value; wherein the gross error anomaly data is data in the raw GNSS location time series that is not between the first threshold and the second threshold.
In the above scheme, the first determining unit includes a first determining module, a second determining module, a third determining module, and a fourth determining module, where the first determining module is configured to determine a kth distance neighborhood of the first data based on a preset parameter; k is the preset parameter;
the second determining module is configured to determine a plurality of second data within the kth distance neighborhood; the second data is any data in the second GNSS position time series;
the third determining module is configured to determine local reachable densities corresponding to the first data and determine local reachable densities corresponding to each of the second data;
the fourth determining module is configured to determine a local anomaly factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each of the second data.
In the foregoing solution, the third determining module is specifically configured to determine an reachable distance between the first data and each of the second data; determining the quotient of the sum of each of the reachable distances and the number of second data in the k-th distance neighborhood; and the reciprocal of the quotient of the sum of each reachable distance and the number of second data in the k-th distance neighborhood is the local reachable density corresponding to the first data.
In the foregoing solution, the fourth determining module is specifically configured to: determining the sum of the ratio of the local reachable density corresponding to each second data to the local reachable density corresponding to the first data; obtaining the ratio of the sum of the local reachable density ratios to the number of second data in the k-th distance neighborhood; the ratio is a local anomaly factor of the first data.
In the above scheme, the second determining unit is configured to determine first data corresponding to N local anomaly factors when the time sequence of the original GNSS position meets a preset condition; the first data corresponding to the N local anomaly factors are local anomaly data corresponding to the second GNSS position time sequence; wherein each local anomaly factor of the N local anomaly factors is not less than the local anomaly factor corresponding to other data in the second GNSS location time series; n is a positive integer and is less than the number of data contained in the second GNSS location time series; or, the second determining unit is configured to determine, according to a set percentage, the number of local anomaly factors that need to be acquired when the original GNSS position time sequence does not meet a preset condition; determining first data corresponding to the local abnormal factor; the number of the first data corresponds to the number of the local anomaly factors; the number of first data is local anomaly data in the second GNSS position time series; each local anomaly factor is not smaller than the local anomaly factor corresponding to other data in the second GNSS position time sequence; the set percentage is not greater than 5%.
In the above solution, the apparatus further comprises: a culling unit and an interpolation unit, wherein,
the removing unit is configured to remove the local abnormal data in the second GNSS position time sequence to obtain a third GNSS position time sequence;
and the interpolation unit is used for carrying out interpolation processing on the third GNSS position time sequence to obtain a target GNSS position time sequence.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored; the computer program, when executed by a processor, implements the steps of any of the methods described above.
In a fourth aspect, an embodiment of the present invention provides an abnormality detection apparatus, including: a processor and a memory for storing a computer program operable on the processor, wherein the processor is operable to perform the steps of any of the above methods when executing the computer program.
The embodiment of the invention provides a method and a device for detecting an anomaly of a GNSS position time sequence and a computer storage medium. Wherein the method comprises the following steps: performing gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information; performing interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence; performing local anomaly detection processing on the second GNSS position time sequence based on preset parameters, and determining a local anomaly factor of first data in the second GNSS position time sequence; the first data is any data in the second GNSS position time series: determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data. According to the method, the local abnormal data in the original GNSS position time sequence are obtained through the primary gross error detection processing and the secondary local abnormal detection processing, the local abnormal data are further removed from the position time sequence, and the accurate values which are estimated according to a certain algorithm are supplemented, so that the more accurate GNSS position time sequence is obtained, the modeling with higher precision is facilitated to be obtained, the potential change rule of the GNSS position time sequence is conveniently and better researched, and the method has remarkable economic and social benefits.
Drawings
FIG. 1 is a flowchart illustrating an anomaly detection method for a GNSS location time sequence of a global navigation satellite system according to an embodiment of the present invention;
fig. 2 is a schematic diagram illustrating a result of performing abnormal value detection by using IQR on a time sequence of positions in a GNSS observation station in a reception elevation direction according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a result obtained by performing gross error exception processing on a position time sequence in a GNSS observation station in a receiving elevation direction by using IQR and then performing interpolation by using KNN according to an embodiment of the present invention;
fig. 4 is a schematic diagram illustrating a result of an abnormal value detection experiment performed by using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 3 according to an embodiment of the present invention;
fig. 5 is a schematic diagram illustrating a result of an abnormal value detection experiment performed by using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 5 according to an embodiment of the present invention;
fig. 6 is a schematic diagram illustrating a result of an abnormal value detection experiment performed by using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 7 according to an embodiment of the present invention;
fig. 7 is a schematic diagram illustrating a result of an abnormal value detection experiment performed by using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 10 according to an embodiment of the present invention;
fig. 8 is a schematic diagram illustrating a result of performing an outlier detection experiment using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 20 according to an embodiment of the present invention;
fig. 9 is a schematic diagram illustrating a result of performing an outlier detection experiment using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 30 according to an embodiment of the present invention;
fig. 10 is a graph illustrating a relationship between local anomaly factors and local reachable densities corresponding to data in a time series of actual measured GNSS elevation positions according to an embodiment of the present invention;
fig. 11 is a diagram illustrating a result obtained by performing outlier detection using LOF on a time series of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 10 and outputting a local outlier by 1% in the top according to the embodiment of the present invention;
fig. 12 is a schematic diagram illustrating a result obtained by performing outlier detection using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 10 and outputting a local outlier by 2% in the top case according to the embodiment of the present invention;
FIG. 13 is a graph comparing the original data and the data processed by the IQR-LOF according to the embodiment of the present invention;
FIG. 14 is an enlarged fragmentary view of FIG. 13 within the dashed box;
FIG. 15 is a flowchart illustrating a complete anomaly detection method according to an embodiment of the present invention;
FIG. 16 is a schematic structural diagram of an anomaly detection apparatus for a GNSS location time sequence of a global navigation satellite system according to an embodiment of the present invention;
fig. 17 is a schematic diagram of a hardware structure of an abnormality detection apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the following describes specific technical solutions of the present invention in further detail with reference to the accompanying drawings in the embodiments of the present invention. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
For the detection of abnormal values of a GNSS position time sequence, classical four-quadrant distances (IQR) (Inter Quartile Range) and a 3 sigma criterion are still common abnormal value detection methods at present, wherein the 3 sigma criterion is suitable for data with less gross error and accurate prior models, but the estimation of standard deviation is biased as the data of the abnormal values is increased or the degree of outlier is larger; the IQR method is more robust to outlier detection relative to the 3 σ criterion, with the quartile range and median being less affected by gross errors. In recent years, there have been many scholars who propose new methods or make improvements over the conventional methods. Klos A and the like analyze the influence of abnormal value detection on a noise model by using different standard deviation criteria, and propose to use an absolute centering error method to remove gross errors; the Mingfeng et al provides a L1 norm and IQR combined detection method, and the detection accuracy of the modeled residual error is higher; wuhao et al propose a wavelet improved 3 sigma gross error detection method, which can accurately extract the deformation trend and improve the applicability of 3 sigma in coordinate data. The caidao army et al uses multi-channel singular spectrum analysis to perform accurate gross error detection on a time series of coordinates, and the method does not require prior knowledge about the time series. The research promotes the development of coordinate time series data abnormal value detection, but some problems still exist, and most of the traditional methods are used for carrying out abnormal value detection on overall data, but the detection effect on the mutation of local data is not ideal.
Combining the advantages and disadvantages of the above outlier detection method and combining the data characteristics of the GNSS position time series, the embodiment of the present invention establishes an outlier detection method based on Local Outlier Factor (LOF), which: firstly, the effectiveness of the method is verified through simulation data, a classical IQR method is introduced to improve the applicability of the LOF in a GNSS position time sequence aiming at the complexity and nonlinearity of actually measured data, and the data processed by the IQR-LOF method can obtain data with higher precision and more stability.
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart illustrating an anomaly detection method for a GNSS position time series of a global navigation satellite system according to an embodiment of the present invention. As shown in fig. 1, the method includes:
s101: performing gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
it should be noted that the raw GNSS position time series may also be referred to as GNSS coordinate time series, which refers to data transmitted by a GNSS observation station receiving a satellite, where the data at least includes position information (or coordinate information) and time information. The GNSS observation station is influenced by various factors such as external environment, system interior and the like when receiving the GNSS position time sequence, so that the original GNSS position time sequence has the characteristics of instability, nonlinearity and the like, and data in the original GNSS position time sequence contains various noises and errors, and the original GNSS position time sequence needs to be processed to a certain extent so as to obtain a cleaner original GNSS position time sequence, so that the potential change rule of the GNSS position time sequence can be better researched, and the GNSS observation station has remarkable economic benefit and social benefit.
Specifically, in some embodiments, for S101, the method includes:
determining gross error abnormal data; the gross error abnormal data is data which is not in a set range in the time sequence of the original GNSS position;
and removing the gross error abnormal data from the original GNSS position time sequence to obtain the first GNSS position time sequence.
Here, the raw GNSS position time series may be subjected to anomaly detection processing using an Inter Quartile Range (IQR) to obtain gross anomaly data in correspondence with the raw GNSS position time series,
specifically, obtaining gross error abnormal data in the time sequence of the original GNSS position by using a quartile range includes:
determining a first quartile and a third quartile in the raw GNSS position time series; determining a quartile distance corresponding to the original GNSS position time sequence according to the first quartile and the third quartile;
determining the gross error abnormal data based on the first quartile, the third quartile and the quartile distance;
the first quartile is data in the 25 th percent after data in the original GNSS position time sequence are sequenced from small to large; the third quartile is the data in 75% after the data in the original GNSS position time sequence are sequenced from small to large; the quartile distance is a difference between the third quartile and the first quartile.
It should be noted that the determining the first quartile and the third quartile in the raw GNSS position time series includes: arranging the data in the original GNSS position time sequence according to a sequence from small to big; the 25 th% of the data in the sequence is the first quartile; the 50 th% of the data is the second quartile; the 75% of the data is a third quartile, wherein the second quartile is the median of the data in the sequence; the difference between the third Quartile and the first Quartile is also called as an interquartile Range (IQR), which can be expressed as:
IQR=Q3-Q1 (1)
wherein Q is3Is a third quartile representing data at 75% of the GNSS position time series arranged in order from small to large; q1Is a first quartile representing data at 25% in a time series of GNSS positions arranged in order from small to large; wherein (Q)1,Q3) The in-range data covers the middle 50% of the data contained in the raw GNSS position time series, so that the median of the data contained in the raw GNSS position time series and the data IQR are less affected by gross errors.
In some embodiments, said determining said gross error anomaly data based on said first quartile, said third quartile, and said quartile range comprises:
determining a first critical value based on the first quartile and the quartile range, and determining a second critical value based on the third quartile; the first critical value is not greater than the second critical value;
determining the gross error anomaly data based on the first critical value and the second critical value; wherein the gross error anomaly data is data in the raw GNSS location time series that is not between the first threshold and the second threshold.
It is to be noted that the data in the original GNSS position time series falls within the data range composed of the first critical value and the second critical value and is a normal value; the data in the original GNSS position time sequence is not an abnormal value in a data range formed by a first critical value and a second critical value, wherein the normal value represents that the data outlier degree in the original GNSS position time sequence meets a set condition; the abnormal value indicates that the degree of outlier of the data in the raw GNSS position time series does not satisfy a predetermined condition, wherein the predetermined condition may be a predetermined condition set by a tester, for example, the data in the raw GNSS position time series falling within a data range formed by the first threshold and the second threshold may be an alternative embodiment of the predetermined condition.
Specifically, the first threshold may be the first quartile minus 1.5 times the IQR, which may beTo be represented as Q1-1.5 IQR; the second critical value may be the third quartile plus 1.5 times IQR, which may be expressed as Q3+1.5 IQR. In this case, the gross error anomaly data corresponding to the raw GNSS position time series is absent (Q) from the raw GNSS position time series1-1.5*IQR,Q3+1.5 IQR).
For example, as shown in fig. 2, fig. 2 is a schematic diagram illustrating a result of performing outlier detection by using IQR on a time sequence of positions in an elevation direction received by a GNSS observation station according to an embodiment of the present invention. In the figure, the ordinate symbol "up" represents a vertical distance in the elevation direction, and the unit is millimeter (mm). FIG. 2 shows the case where IQR is used and the condition is set such that the data falls in (Q)1-1.5*IQR,Q3+1.5 IQR), then the data in the time series of positions in the elevation direction fall on (Q)1-1.5*IQR,Q3+1.5 IQR) is the normal value, i.e., the data point within the dashed box line; data in the positional time series in the elevation direction is absent (Q)1-1.5*IQR,Q3+1.5 IQR) are outliers, i.e.: gross error outliers, i.e., data points outside the dashed outline.
S102: performing interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence;
it should be noted that the first GNSS position time series obtained in the foregoing step is an initial GNSS position time series after the gross error abnormal data is removed, and it is necessary to interpolate the first GNSS position time series after the abnormal value is removed, so that an adverse effect of the removed gross error abnormal data on the analysis of the original GNSS position time series can be reduced. In practical applications, there are many interpolation methods that can be used, for example, a piecewise cubic spline method, a lagrange interpolation method, a K-Nearest Neighbor (KNN, K-Nearest Neighbor) interpolation method, and the like, where as an implementable manner, the KNN interpolation method can avoid that the subsequent steps in the interpolation data of the step are judged to be abnormal values, and the interpolation method has the further advantage that the method can be applied to continuously missing data, and can fill all missing values at one time, and can better conform to the trend characteristics of the original data.
Specifically, the KNN interpolation process is as follows: for a first GNSS position time sequence needing interpolation, K observations closest to the first GNSS position time sequence are calculated based on Euclidean distance, then filling values are calculated by data of K adjacent observations in a distance inverse weighting mode, and finally the missing values are replaced by the filling values. The value of K may be selected according to actual requirements, for example, in some embodiments, the value of K may be 5. Here, the expression of the euclidean distance is:
Figure BDA0003010137960000131
wherein D is two observation data in the position time sequence (the coordinates are respectively (x)1,y1)、(x2,y2) Euclidean distance between them, where the point with the shortest euclidean distance with respect to an observation is its nearest neighbor. Assuming that the position time series adopted in the embodiment of the present invention is a two-dimensional time series, the nearest neighboring points may be determined according to the serial numbers of the observation days, if the data of the serial number 188 of the observation days is missing, the data of the serial numbers 187 and 189 are two nearest neighboring points of the serial number 188, at this time, the data filling value in the position time series of the serial number 188 in the elevation direction is the average value of the data in the position time series of the serial numbers 187 and 189 in the elevation direction, the data of the serial numbers 186 to 190 are 4 nearest neighboring points of the serial number 188, the data filling value in the position time series of the serial number 188 in the elevation direction is the average value of the data in the position time series of the serial numbers 186, 187, 189, 190 in the elevation direction, and so on.
For example, as shown in fig. 3, fig. 3 is a schematic diagram of a result obtained by performing coarse error exception processing on a time sequence of positions in a GNSS observation station in a receiving elevation direction by using IQR and then performing interpolation by using KNN according to an embodiment of the present invention. The diagram shows the first GNSS position time series interpolated using KNN interpolation with K being 5. The positions of the triangular symbols in fig. 3 are the interpolation values in the position time series.
S103: performing local anomaly detection processing on the second GNSS position time sequence based on preset parameters, and determining a local anomaly factor of first data in the second GNSS position time sequence; the first data is any data in the second GNSS position time series;
it should be noted that, the Local anomaly detection processing may adopt Local anomaly Factor (LOF) anomaly detection, which is an anomaly detection algorithm based on the density of data points, and the core idea is to mainly determine whether a certain point p is an anomaly point by comparing the densities of the point p and its neighboring points, and if the density of the point p is smaller relative to the density of its neighboring points, the point is more likely to be considered as an anomaly point. Based on this idea, the specific process of detecting the LOF abnormal value in this embodiment may be: calculating a local anomaly factor LOF for each data in the second GNSS position time series, and judging whether the data point is an anomaly value point according to the LOF of each data point, wherein the specific calculation process comprises the following steps: firstly, calculating the reachable distance of each data point in the second GNSS position time sequence, calculating the local reachable density of each data point according to the reachable distance, finally obtaining each data LOF in the second GNSS position time sequence according to the local reachable densities, and then judging whether each data is abnormal according to the LOF of each data, wherein the method specifically comprises the following steps: if the LOF value is close to 1, the local reachable density of the data point is similar to the local reachable density of the neighborhood point, and the more likely the data point is in the same cluster with the neighborhood point, the data point is not an abnormal point; when the LOF value is greater than 1, it indicates that the local reachable density of the data point is less than the local reachable density of the domain point, and the data point may be an outlier point. The LOF abnormal value detection method mainly comprises the following definition and calculation:
definition 1 (k-proximity distance) in the data set D, among the data points closest to the data point p, the distance between the k-th closest data point and the point p is called the k-proximity distance (k-distance) of the point p and is denoted as Dk(p); the distance between two data points p and o is denoted as d (p, o).
Definition 2 (reachable distance) when given the parameter k, the reachable distance of the data point p to the point o is the maximum of the k-neighborhood distance of the data point and the direct distance between the data point p and the point o, which can be written as:
reach-distk(p,o)=max{k-distance(o),d(p,o)} (2)
wherein, reach-distk(p, o) is the reachable distance of data point p to point o.
Defining a k-distance domain, denoted N, of a set of all data points, 3 (local reachable density) whose distance from point p is less than or equal to k-distance (p), called data point pk(p); the local reachable density of point p is the inverse of the average reachable distance of point p from the data points in the k-distance neighborhood, i.e.:
Figure BDA0003010137960000151
wherein, lrdk(p) is the local achievable density of data point p.
Defining 4 (local anomaly factor) a local anomaly factor for a data point p as the ratio of the average local reachable density of the k-distance neighborhood of the data point p to the local reachable density of the data point p, i.e.:
Figure BDA0003010137960000152
the local anomaly factor is used for representing the degree of outlier of the data point p, and when the data point p is distant from other data points, the local reachable density of the data point p is small.
Based on this definition, the step of determining the local anomaly factor for each data in the second GNSS position time series in this embodiment is as follows:
in some embodiments, the performing local anomaly detection processing on the second GNSS position time series based on preset parameters to determine a local anomaly factor of the first data in the second GNSS position time series includes:
determining a kth distance neighborhood of the first data based on a preset parameter; k is the preset parameter;
determining a plurality of second data within the k-th distance neighborhood; the second data is any data in the second GNSS position time series;
determining local reachable density corresponding to the first data and determining local reachable density corresponding to each second data;
and determining a local abnormal factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each second data.
It should be noted that the preset parameter is one value of k, and since different values of k have different influences on the final result, the value of k needs to be preset when the LOF anomaly detection is adopted. The kth distance neighborhood of the first data corresponds to the kth distance neighborhood of the data point p; the second data corresponds to the aforementioned data point o, and is also data in the k-th distance neighborhood of the first data.
In some embodiments, the determining the local reachable density to which the first data corresponds includes:
determining the reachable distance of the first data and each second data;
determining the quotient of the sum of each of the reachable distances and the number of second data in the k-th distance neighborhood; and the reciprocal of the quotient of the sum of each reachable distance and the number of second data in the k-th distance neighborhood is the local reachable density corresponding to the first data.
It should be noted that the calculation of the reachable distance and the local reachable density is as described above, and is not described herein again.
In some embodiments, the determining a local anomaly factor for the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each of the second data includes:
determining the sum of the ratio of the local reachable density corresponding to each second data to the local reachable density corresponding to the first data;
obtaining the ratio of the sum of the local reachable density ratios to the number of second data in the k-th distance neighborhood; the ratio is a local anomaly factor of the first data.
It should be noted that the local anomaly factor is calculated as described above, and is not described herein again. In order to illustrate the influence of different k values on the final result, fig. 4, fig. 5, fig. 6, fig. 7, fig. 8, and fig. 9 are schematic diagrams illustrating the results of performing an abnormal value detection experiment by using an LOF method after performing the aforesaid IQR and KNN preprocessing on the position time sequence observed by the GNSS observation station, and selecting k to be 3, 5, 7, 10, 20, and 30. Wherein the symbol "x" represents an outlier; the time series of coordinates of only the top 20 local anomaly factors are shown in each figure because it is difficult to observe the effect of different k values on the detection results when too many outliers are shown. After the preliminary test, the invention considers that the k value is within 10 to keep good detection effect.
S104: determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data.
In some embodiments, the determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data comprises:
when the original GNSS position time sequence meets a preset condition, determining first data corresponding to N local abnormal factors; the first data corresponding to the N local anomaly factors are local anomaly data corresponding to the second GNSS position time sequence; wherein each local anomaly factor of the N local anomaly factors is not less than the local anomaly factor corresponding to other data in the second GNSS location time series; n is a positive integer and is less than the number of data contained in the second GNSS location time series;
alternatively, in further embodiments, the determining local anomaly data in the second GNSS position time series based on the local anomaly factor for the first data comprises: when the original GNSS position time sequence does not meet the preset condition, determining the number of local abnormal factors to be acquired according to a set percentage;
determining first data corresponding to the local abnormal factor; the number of the first data corresponds to the number of the local anomaly factors; the number of first data is local anomaly data in the second GNSS position time series; each local anomaly factor is not smaller than the local anomaly factor corresponding to other data in the second GNSS position time sequence; the set percentage is not greater than 5%.
It should be noted that, in an actual observation process, a GNSS observation station may be affected by an external environment or geological motion, and may fluctuate the observed GNSS position time sequence to different degrees, so that a situation that a part of data points are locally outlier is not caused by an observation error, and this part of data is true in a physical sense, so when performing abnormal value detection using an LOF, it is necessary to determine local abnormal data according to whether an original GNSS position time sequence satisfies a set condition, where the set condition may be formulated to combine a distribution situation of data points of the position time sequence observed by the GNSS observation station and determine according to a specific use purpose, specifically, the set condition is to determine the number of local abnormal data according to the distribution situation of the original GNSS position time sequence data points and according to the specific use purpose: namely:
(1) when the original GNSS position time sequence meets the preset condition, that is, the characteristics of the original GNSS position time sequence are relatively stable, the local abnormal value is detected by outputting a plurality of data points with large local abnormal factors, so that the characteristics of the original GNSS position time sequence can be prevented from being damaged.
(2) When the original GNSS position time sequence does not satisfy the preset condition, that is, the characteristics of the original GNSS position time sequence are not robust and a cleaner GNSS position time sequence needs to be obtained, the raw GNSS position time sequence may be eliminated according to a set percentage of the number of data included in the GNSS position time sequence, it needs to be noted that a large number of data points whose local anomaly factors are greater than 1 exist in the actually measured GNSS elevation position time sequence, an alternative embodiment is shown in fig. 10, and fig. 10 is a graph of a relationship between local anomaly factors corresponding to each data in the actually measured GNSS elevation position time sequence and local achievable density provided by the embodiment of the present invention. In this case, it is not reasonable to judge all data points having a local abnormality factor greater than 1 as abnormal value points. Statistically, data is not available when data is missing up to 5%. Moreover, the method provided in the embodiment of the present invention eliminates part of the data by using the IQR method before the LOF detection is performed, and thus for a safe data processing, elimination of not more than 5% of the total data is required, for example, elimination is performed according to an abnormal value of 1% or 2% of the total data.
It should be noted that, when the number of abnormal values to be output is small (i.e. in the first case, the original GNSS position time sequence meets the preset condition), the LOF algorithm can detect more accurately; in order to verify the effect when a large number of abnormal values need to be verified and output, the embodiment of the present invention provides the effect shown in fig. 11 and 12 when outputting about the first 1% and 2% of data points of the local abnormal factor, and fig. 11 is a schematic diagram of the result that when the preset parameter provided by the embodiment of the present invention is k value 10, the local abnormal factor is output by the first 1% of the local abnormal factor by performing the abnormal value detection by using LOF for the position time series in the GNSS observation station receiving elevation direction; fig. 12 is a schematic diagram of a result obtained by performing outlier detection using LOF for a time sequence of positions in a GNSS observation station in a reception elevation direction when a preset parameter is a k value of 10 and outputting a local outlier by 2%, where a symbol "x" represents an outlier. According to the two graphs, the LOF algorithm can better detect abnormal values of local data, and the detected abnormal values are mostly positioned at the upper edge and the lower edge of the data, so that a cleaner coordinate time sequence is obtained.
In some embodiments, the method further comprises:
s105: removing the local abnormal data in the second GNSS position time sequence to obtain a third GNSS position time sequence;
s106: and carrying out interpolation processing on the third GNSS position time sequence to obtain a target GNSS position time sequence.
Wherein the target GNSS location time series is a location time series with outliers removed and accurate values inferred by the interpolation. The target GNSS position time sequence can accurately reflect the potential change rule of the GNSS position time sequence.
It should be noted that, as can be seen from the above analysis, the LOF algorithm can effectively detect local abnormal values in the GNSS elevation coordinate time series, and obtain a more real and cleaner coordinate time series, but it is inevitable that a misjudgment is generated in the detection process. Although some misjudged data can be found through visual observation, the data processing time is greatly increased, larger errors possibly caused by subjective analysis can be caused, the misjudged data are selected and removed, data interpolation is uniformly carried out after the removal, and the influence caused by the misjudgment can be weakened or even counteracted through a proper interpolation method. The data of 2% is selected as an abnormal value to be removed, the removed data is interpolated by using a KNN algorithm to obtain final processed data, a comparison graph of the original data and the data processed by the IQR-LOF is shown in FIG. 13, the processed data not only keeps the trend change of the original data, but also is more stable as a whole. In order to clearly show the comparison between the original data and the original data processed by the IQR-LOF, fig. 14 provides a partially enlarged view of the dotted frame in fig. 13, wherein the data encircled by the circle in fig. 14 is the data to be removed from the original data after the IQR-LOF processing. From this figure 14 it can be seen more clearly that the GNSS position time series obtained with this method is much smoother.
For understanding the embodiment of the present invention, fig. 15 is a flowchart illustrating a complete anomaly detection method according to the embodiment of the present invention. As shown in fig. 15, the method includes:
s1501: acquiring GNSS coordinate time sequence original data D;
s1502: carrying out initial abnormal value detection on the original data D by adopting the IQR to obtain gross error data;
s1503: rejecting the gross error data;
s1504: interpolating the vacant data by adopting a K-nearest neighbor interpolation method to obtain data D1;
s1505: performing LOF local abnormal value detection on the D1 according to preset parameters by adopting an LOF method, and outputting a local abnormal factor corresponding to each data in the D1;
s1506: selecting N points with the highest local abnormal factors as local abnormal value points according to the characteristics and requirements of the data D;
s1507: and removing the N points as local abnormal value points, and performing interpolation by adopting KNN again to obtain the processed D2.
S1502 and S1503 correspond to S101; s1504 corresponds to S102; s1505 corresponds to S103; s1506 corresponds to S104 described above; s1504 is equivalent to S105 and S106, and thus the embodiment herein is a detailed refinement of the foregoing method, and detailed explanation of the specific steps is not repeated herein.
The embodiment of the invention provides an anomaly detection method of a GNSS position time sequence, which obtains local anomaly data in an original GNSS position time sequence through primary gross error detection processing and secondary local anomaly detection processing, further eliminates the local anomaly data from the original position time sequence, and supplements an accurate value which is estimated according to a certain algorithm so as to obtain a more accurate GNSS position time sequence, is beneficial to obtaining higher-precision modeling, is convenient to better research the potential change rule of the GNSS position time sequence, and has remarkable economic benefit and social benefit. In other words, the embodiment of the invention provides an abnormal value detection method combining four-quadrant distance (IQR) and local abnormal factor (LOF) for solving the problem that local abnormal value detection is difficult due to the instability and nonlinearity of a GNSS elevation coordinate time series, and performs an abnormal value detection experiment on simulation data and GNSS elevation measured data respectively. Firstly, an abnormal value detection experiment is carried out on simulation data by using an LOF algorithm, and the result shows that the LOF algorithm is more accurate in detection of the abnormal value compared with the traditional method and has better sensitivity on a local abnormal value; in the actually measured data, aiming at the problem that the number of abnormal values is difficult to select, the IQR is introduced on the basis of the LOF, and the actually measured elevation time sequence is tested, and the result shows that the IQR-LOF combination method can reasonably select the number of the abnormal values, can obtain modeling data with higher precision, is beneficial to researching the potential change rule of the GNSS coordinate time sequence, and has remarkable economic benefit and social benefit.
Compared with the prior art, the embodiment of the invention has the following advantages:
(1) the GNSS observation station is influenced by various factors such as external environment, system interior and the like during measurement, so that the coordinate time sequence of the GNSS observation station has the characteristics of instability, nonlinearity and the like, and data comprises various noises and errors. Large outliers in the data can be removed as gross errors, but the seasonal change in the elevation direction is obvious, so that the outliers of local data are difficult to detect by a traditional method; the LOF algorithm measures the abnormal degree of a data point, and calculates the relative density of the data point and surrounding adjacent data points instead of looking at the absolute local density of the point.
(2) When detecting an abnormal value point, the LOF judges the abnormal value according to the size of a local abnormal factor, but due to the complexity and the variability of a GNSS coordinate time sequence, the specific abnormal value quantity in the actually measured data cannot be accurately judged, and the selection of a proper threshold value is very difficult to detect the local abnormal value, so that the classical IQR method is used for primary detection, a large outlier and a part of abnormal values in the data can be removed, and the selection of the abnormal value quantity is facilitated during the LOF detection.
(3) After the abnormal values are eliminated, the data are necessary to be interpolated, and the adverse effect of eliminating gross errors on data analysis can be weakened by a proper interpolation method. The interpolation method can comprise the following steps: the piecewise cubic spline method is used for interpolation, Lagrange interpolation and the like, but in order to be suitable for continuous missing data and avoid the interpolation data in S102 being judged as abnormal values in S1064), a KNN interpolation algorithm can be selected, and the method has the advantages that all missing values can be filled at one time, and the trend characteristics of original data can be well met. According to the method, the GNSS coordinate time sequence is processed by utilizing the constructed IQR-LOF abnormal value combination detection method according to the principle of LOF and the characteristics of GNSS coordinate time sequence data, and finally, the GNSS coordinate time sequence is interpolated by using a KNN algorithm, so that a novel method for detecting the local abnormal value of the GNSS coordinate time sequence is created.
Based on the same inventive concept as above, fig. 16 is a schematic structural diagram of an anomaly detection apparatus for GNSS location time series of global navigation satellite system according to an embodiment of the present invention, where the apparatus 15 includes: a first obtaining unit 1601, a second obtaining unit 1602, a first determining unit 1603, and a second determining unit 1604, wherein,
the first obtaining unit 1601 is configured to perform coarse-difference detection processing on an original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
the second obtaining unit 1602, configured to perform interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence;
the first determining unit 1603 is configured to perform local anomaly detection processing on the second GNSS position time series based on preset parameters, and determine a local anomaly factor of first data in the second GNSS position time series; the first data is any data in the second GNSS position time series;
the second determining unit 1604 for determining local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data
In some embodiments, the first obtaining unit comprises: the device comprises a first obtaining module and a second obtaining module, wherein the first obtaining module is used for determining gross error abnormal data; the gross error abnormal data is data which is not in a set range in the time sequence of the original GNSS position; the second obtaining module is configured to remove the gross error abnormal data from the original GNSS position time sequence to obtain the first GNSS position time sequence.
In some embodiments, the first obtaining module is specifically configured to determine a first quartile and a third quartile in the raw GNSS position time series; determining a quartile distance corresponding to the original GNSS position time sequence according to the first quartile and the third quartile; determining the gross error abnormal data based on the first quartile, the third quartile and the quartile distance; the first quartile is data in the 25 th percent after data in the original GNSS position time sequence are sequenced from small to large; the third quartile is the data in 75% after the data in the original GNSS position time sequence are sequenced from small to large; the quartile distance is a difference between the third quartile and the first quartile.
In some embodiments, the first obtaining module is further configured to determine a first threshold based on the first quartile and the quartile range, and a second threshold based on the third quartile; the first critical value is not greater than the second critical value;
determining the gross error anomaly data based on the first critical value and the second critical value; wherein the gross error anomaly data is data in the raw GNSS location time series that is not between the first threshold and the second threshold.
In some embodiments, the first determining unit includes a first determining module, a second determining module, a third determining module, and a fourth determining module, wherein the first determining module is configured to determine a kth distance neighborhood of the first data based on a preset parameter; k is the preset parameter;
the second determining module is configured to determine a plurality of second data within the kth distance neighborhood; the second data is any data in the second GNSS position time series;
the third determining module is configured to determine local reachable densities corresponding to the first data and determine local reachable densities corresponding to each of the second data;
the fourth determining module is configured to determine a local anomaly factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each of the second data.
In some embodiments, the third determining module is specifically configured to determine an reachable distance between the first data and each of the second data; determining the quotient of the sum of each of the reachable distances and the number of second data in the k-th distance neighborhood; and the reciprocal of the quotient of the sum of each reachable distance and the number of second data in the k-th distance neighborhood is the local reachable density corresponding to the first data.
In some embodiments, the fourth determining module is specifically configured to: determining the sum of the ratio of the local reachable density corresponding to each second data to the local reachable density corresponding to the first data; obtaining the ratio of the sum of the local reachable density ratios to the number of second data in the k-th distance neighborhood; the ratio is a local anomaly factor of the first data.
In some embodiments, the second determining unit is configured to determine first data corresponding to N local anomaly factors when the raw GNSS position time series meets a preset condition; the first data corresponding to the N local anomaly factors are local anomaly data corresponding to the second GNSS position time sequence; wherein each local anomaly factor of the N local anomaly factors is not less than the local anomaly factor corresponding to other data in the second GNSS location time series; n is a positive integer and is less than the number of data contained in the second GNSS location time series; or, the second determining unit is configured to determine, according to a set percentage, the number of local anomaly factors that need to be acquired when the original GNSS position time sequence does not meet a preset condition; determining first data corresponding to the local abnormal factor; the number of the first data corresponds to the number of the local anomaly factors; the number of first data is local anomaly data in the second GNSS position time series; each local anomaly factor is not smaller than the local anomaly factor corresponding to other data in the second GNSS position time sequence; the set percentage is not greater than 5%.
In the above solution, the apparatus further comprises: a culling unit and an interpolation unit, wherein,
the removing unit is configured to remove the local abnormal data in the second GNSS position time sequence to obtain a third GNSS position time sequence;
and the interpolation unit is used for carrying out interpolation processing on the third GNSS position time sequence to obtain a target GNSS position time sequence.
It should be noted that the anomaly detection apparatus for GNSS position time series of global navigation satellite system according to the present invention is the same as the aforementioned method, and therefore, details of how each unit in the apparatus works are not described herein again. The anomaly detection device obtains local anomaly data in the original GNSS position time sequence through primary gross error detection processing and secondary local anomaly detection processing, further eliminates the local anomaly data from the original position time sequence, supplements an accurate value which is estimated according to a certain algorithm so as to obtain a more accurate GNSS position time sequence, is beneficial to obtaining higher-precision modeling, is convenient to better research the potential change rule of the GNSS position time sequence, and has remarkable economic benefit and social benefit.
Embodiments of the present invention further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the foregoing method embodiments, and the foregoing storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
An embodiment of the present invention further provides an anomaly detection apparatus, where the apparatus includes: a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute the steps of the above-described method embodiments stored in the memory when running the computer program.
Fig. 17 is a schematic diagram of a hardware structure of an abnormality detection apparatus according to an embodiment of the present invention, where the abnormality detection apparatus 170 includes: the at least one processor 1701, the memory 1702, and optionally the abnormality detection device 170 may further include at least one communication interface 1703, with the various components of the abnormality detection device 170 coupled together by a bus system 1704, it being understood that the bus system 1704 is used to enable communications among the components. The bus system 1704 includes a power bus, a control bus, and a status signal bus in addition to the data bus. For clarity of illustration, however, the various buses are designated in FIG. 17 as the bus system 1704.
It will be appreciated that the memory 1702 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory. Among them, the nonvolatile Memory may be a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a magnetic Random access Memory (FRAM), a magnetic Random access Memory (Flash Memory), a magnetic surface Memory, an optical disk, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface storage may be disk storage or tape storage. Volatile Memory can be Random Access Memory (RAM), which acts as external cache Memory. By way of illustration and not limitation, many forms of RAM are available, such as Static Random Access Memory (SRAM), Synchronous Static Random Access Memory (SSRAM), Dynamic Random Access Memory (DRAM), Synchronous Dynamic Random Access Memory (SDRAM), Double Data Rate Synchronous Dynamic Random Access Memory (DDRSDRAM, Double Data Synchronous Random Access Memory), Enhanced Synchronous Dynamic Random Access Memory (ESDRAM, Enhanced Synchronous Dynamic Random Access Memory), Synchronous link Dynamic Random Access Memory (SLDRAM, Synchronous Dynamic Random Access Memory), Direct Memory (DRMbus Random Access Memory, Random Access Memory). The memory 1702 described in connection with the embodiments of the invention is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 1702 of embodiments of the present invention is used to store various types of data to support the operation of the anomaly detection apparatus 170. Examples of such data include: any computer program for operating on the anomaly detection device 170, such as an implementation for determining a decision pattern matching the first bit based on the forward decision result and the backward decision result, etc., may be embodied in the memory 1702 for implementing a method according to an embodiment of the present invention.
The methods disclosed in the embodiments of the present invention described above may be applied to the processor 1701 or implemented by the processor 1701. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The Processor described above may be a general purpose Processor, a Digital SigNal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. The processor may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed by the embodiment of the invention can be directly implemented by a hardware decoding processor, or can be implemented by combining hardware and software modules in the decoding processor. The software modules may be located in a storage medium having a memory and a processor reading the information in the memory and combining the hardware to perform the steps of the method.
In an exemplary embodiment, the anomaly detection Device 170 may be implemented by one or more ApplicatioN Specific INtegrated Circuits (ASICs), DSPs, PrograMMable Logic Devices (PLDs), CoMplex PrograMMable Logic Devices (CPLDs), Field PrograMMable Gate Arrays (FPGAs), general purpose processors, controllers, Micro Controllers (MCUs), microprocessors (microprocessors), or other electronic components for performing the above-described methods.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (12)

1. A method for anomaly detection of a global navigation satellite system GNSS position time series, the method comprising:
performing gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
performing interpolation processing on the first GNSS position time sequence to obtain a second GNSS position time sequence;
performing local anomaly detection processing on the second GNSS position time sequence based on preset parameters, and determining a local anomaly factor of first data in the second GNSS position time sequence; the first data is any data in the second GNSS position time series;
determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data.
2. The method of claim 1, wherein performing a gross error detection process on the raw GNSS position time series to obtain a first GNSS position time series comprises:
determining gross error abnormal data; the gross error abnormal data is data which is not in a set range in the time sequence of the original GNSS position;
and removing the gross error abnormal data from the original GNSS position time sequence to obtain the first GNSS position time sequence.
3. The method of claim 2, wherein determining gross anomaly data comprises:
determining a first quartile and a third quartile in the raw GNSS position time series; determining a quartile distance corresponding to the original GNSS position time sequence according to the first quartile and the third quartile;
determining the gross error abnormal data based on the first quartile, the third quartile and the quartile distance;
the first quartile is data in the 25 th percent after data in the original GNSS position time sequence are sequenced from small to large; the third quartile is the data in 75% after the data in the original GNSS position time sequence are sequenced from small to large; the quartile distance is a difference between the third quartile and the first quartile.
4. The method of claim 3, wherein determining the gross error anomaly data based on the first quartile, the third quartile, and the quartile range comprises:
determining a first critical value based on the first quartile and the quartile range, and determining a second critical value based on the third quartile; the first critical value is not greater than the second critical value;
determining the gross error anomaly data based on the first critical value and the second critical value; wherein the gross error anomaly data is data in the raw GNSS location time series that is not between the first threshold and the second threshold.
5. The method of claim 1, wherein the performing local anomaly detection processing on the second GNSS position time series based on preset parameters to determine a local anomaly factor of the first data in the second GNSS position time series comprises:
determining a kth distance neighborhood of the first data based on a preset parameter; k is the preset parameter;
determining a plurality of second data within the k-th distance neighborhood; the second data is any data in the second GNSS position time series;
determining local reachable density corresponding to the first data and determining local reachable density corresponding to each second data;
and determining a local abnormal factor of the first data based on the local reachable density corresponding to the first data and the local reachable density corresponding to each second data.
6. The method of claim 5, wherein determining the local reachable density to which the first data corresponds comprises:
determining the reachable distance of the first data and each second data;
determining the quotient of the sum of each of the reachable distances and the number of second data in the k-th distance neighborhood; and the reciprocal of the quotient of the sum of each reachable distance and the number of second data in the k-th distance neighborhood is the local reachable density corresponding to the first data.
7. The method of claim 5, wherein determining the local anomaly factor for the first data based on the local reachable density for the first data and the local reachable density for each of the second data comprises:
determining the sum of the ratio of the local reachable density corresponding to each second data to the local reachable density corresponding to the first data;
obtaining the ratio of the sum of the local reachable density ratios to the number of second data in the k-th distance neighborhood; the ratio is a local anomaly factor of the first data.
8. The method of claim 1, wherein determining local anomaly data in the second GNSS location time series based on the local anomaly factor for the first data comprises:
when the original GNSS position time sequence meets a preset condition, determining first data corresponding to N local abnormal factors; the first data corresponding to the N local anomaly factors are local anomaly data corresponding to the second GNSS position time sequence; wherein each local anomaly factor of the N local anomaly factors is not less than the local anomaly factor corresponding to other data in the second GNSS location time series; n is a positive integer and is less than the number of data contained in the second GNSS location time series;
alternatively, the determining local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data comprises:
when the original GNSS position time sequence does not meet the preset condition, determining the number of local abnormal factors to be acquired according to a set percentage;
determining first data corresponding to the local abnormal factor; the number of the first data corresponds to the number of the local anomaly factors; the number of first data is local anomaly data in the second GNSS position time series; each local anomaly factor is not smaller than the local anomaly factor corresponding to other data in the second GNSS position time sequence; the set percentage is not greater than 5%.
9. The method of claim 1, further comprising:
removing the local abnormal data in the second GNSS position time sequence to obtain a third GNSS position time sequence;
and carrying out interpolation processing on the third GNSS position time sequence to obtain a target GNSS position time sequence.
10. An apparatus for anomaly detection of a global navigation satellite system GNSS position time series, the apparatus comprising: a first obtaining unit, a second obtaining unit, a first determining unit, and a second determining unit, wherein,
the first obtaining unit is configured to perform gross error detection processing on the original GNSS position time sequence to obtain a first GNSS position time sequence; the original GNSS position time sequence is a group of data which is transmitted by a GNSS observation station receiving satellite and at least comprises position information and time information;
the second obtaining unit is configured to perform interpolation processing on the first GNSS position time series to obtain a second GNSS position time series;
the first determining unit is configured to perform local anomaly detection processing on the second GNSS position time series based on preset parameters, and determine a local anomaly factor of first data in the second GNSS position time series; the first data is any data in the second GNSS position time series;
the second determining unit is configured to determine local anomaly data in the second GNSS position time series based on the local anomaly factor of the first data.
11. A computer-readable storage medium, characterized in that the readable storage medium has stored thereon a computer program; the computer program when executed by a processor implements the steps of the method of any one of claims 1 to 9.
12. An abnormality detection device characterized by comprising: a processor and a memory for storing a computer program operable on the processor, wherein the processor is operable to perform the steps of the method of any of claims 1 to 9 when the computer program is executed.
CN202110373143.7A 2021-04-07 2021-04-07 Position time series abnormity detection method and device and computer storage medium Pending CN113219499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110373143.7A CN113219499A (en) 2021-04-07 2021-04-07 Position time series abnormity detection method and device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110373143.7A CN113219499A (en) 2021-04-07 2021-04-07 Position time series abnormity detection method and device and computer storage medium

Publications (1)

Publication Number Publication Date
CN113219499A true CN113219499A (en) 2021-08-06

Family

ID=77086592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110373143.7A Pending CN113219499A (en) 2021-04-07 2021-04-07 Position time series abnormity detection method and device and computer storage medium

Country Status (1)

Country Link
CN (1) CN113219499A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117471502A (en) * 2023-10-31 2024-01-30 北京华云星地通科技有限公司 Positioning source parameter anomaly detection and correction method, system and electronic equipment

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102014107945A1 (en) * 2013-06-12 2014-12-18 Samsung Electronics Co., Ltd. Receiver for simultaneous reception of signals from multiple GNSS satellite systems
CN104504901A (en) * 2014-12-29 2015-04-08 浙江银江研究院有限公司 Multidimensional data based detecting method of traffic abnormal spots
CN106096324A (en) * 2016-08-26 2016-11-09 清华大学 The power transmission and transformation main equipment load data disappearance returned based on k neighbour fills up algorithm
CN108802776A (en) * 2018-07-02 2018-11-13 武汉蓝泰源信息技术有限公司 Public transport GPS method for correcting error based on abnormity point elimination and trace compression algorithm
CN109086291A (en) * 2018-06-09 2018-12-25 西安电子科技大学 A kind of parallel method for detecting abnormality and system based on MapReduce
CN109117864A (en) * 2018-07-13 2019-01-01 华南理工大学 Coronary heart disease risk prediction technique, model and system based on heterogeneous characteristic fusion
CN109684601A (en) * 2018-11-23 2019-04-26 河海大学常州校区 A kind of air quality data restoration methods based on low-rank matrix completion
US20190187296A1 (en) * 2017-12-19 2019-06-20 Pusan National University Industry-University Cooperation Foundation Method and system for processing trajectory data
CN110207827A (en) * 2019-05-23 2019-09-06 浙江大学 A kind of electrical equipment temperature real time early warning method extracted based on Outlier factor
CN112612822A (en) * 2020-12-11 2021-04-06 中铁第四勘察设计院集团有限公司 Beidou coordinate time series prediction method, device, equipment and storage medium
US20210224599A1 (en) * 2018-07-04 2021-07-22 Hitachi, Ltd. Anomaly Detection System

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102014107945A1 (en) * 2013-06-12 2014-12-18 Samsung Electronics Co., Ltd. Receiver for simultaneous reception of signals from multiple GNSS satellite systems
CN104504901A (en) * 2014-12-29 2015-04-08 浙江银江研究院有限公司 Multidimensional data based detecting method of traffic abnormal spots
CN106096324A (en) * 2016-08-26 2016-11-09 清华大学 The power transmission and transformation main equipment load data disappearance returned based on k neighbour fills up algorithm
US20190187296A1 (en) * 2017-12-19 2019-06-20 Pusan National University Industry-University Cooperation Foundation Method and system for processing trajectory data
CN109086291A (en) * 2018-06-09 2018-12-25 西安电子科技大学 A kind of parallel method for detecting abnormality and system based on MapReduce
CN108802776A (en) * 2018-07-02 2018-11-13 武汉蓝泰源信息技术有限公司 Public transport GPS method for correcting error based on abnormity point elimination and trace compression algorithm
US20210224599A1 (en) * 2018-07-04 2021-07-22 Hitachi, Ltd. Anomaly Detection System
CN109117864A (en) * 2018-07-13 2019-01-01 华南理工大学 Coronary heart disease risk prediction technique, model and system based on heterogeneous characteristic fusion
CN109684601A (en) * 2018-11-23 2019-04-26 河海大学常州校区 A kind of air quality data restoration methods based on low-rank matrix completion
CN110207827A (en) * 2019-05-23 2019-09-06 浙江大学 A kind of electrical equipment temperature real time early warning method extracted based on Outlier factor
CN112612822A (en) * 2020-12-11 2021-04-06 中铁第四勘察设计院集团有限公司 Beidou coordinate time series prediction method, device, equipment and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
张恒璟: "基于GPS高程时间序列粗差的抗差探测与插补研究", 大地测量与地球动力学, no. 04, pages 71 - 75 *
明锋;曾安敏;景一帆;: "L1范数与IQR统计量组合的GNSS坐标序列粗差探测算法", 测绘科学技术学报, no. 02 *
王琰;张倩倩;车通宇;刘永;: "时间序列异常值探测的Bayes方法及其GNSS数据质量控制中的应用", 中国惯性技术学报, no. 01 *
邹建成: "基于非监督算法的卫星异常检测", 中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑, no. 03, pages 031 - 117 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117471502A (en) * 2023-10-31 2024-01-30 北京华云星地通科技有限公司 Positioning source parameter anomaly detection and correction method, system and electronic equipment
CN117471502B (en) * 2023-10-31 2024-04-02 北京华云星地通科技有限公司 Positioning source parameter anomaly detection and correction method, system and electronic equipment

Similar Documents

Publication Publication Date Title
Crombecq et al. Space-filling sequential design strategies for adaptive surrogate modelling
JP6501982B2 (en) Failure risk index estimation device and failure risk index estimation method
CN110969649A (en) Matching evaluation method, medium, terminal and device of laser point cloud and map
CN113219499A (en) Position time series abnormity detection method and device and computer storage medium
CN115457492A (en) Target detection method and device, computer equipment and storage medium
CN112825199A (en) Collision detection method, device, equipment and storage medium
CN114049530A (en) Hybrid precision neural network quantization method, device and equipment
US7369974B2 (en) Polynomial generation method for circuit modeling
CN115292971B (en) Bayes-based crack attribute analysis method and device and storage medium
Rieke et al. Improved statistical test for nonstationarity using recurrence time statistics
Hassani et al. A performance based model-set design strategy for multiple model adaptive estimation
US8352901B2 (en) Technique for generation of load-slew indices for circuit characterization
CN112581500A (en) Method and device for matching pedestrians and human faces in target tracking
CN114449439A (en) Method and device for positioning underground pipe gallery space
CN109425815B (en) Apparatus and method for predicting characteristics of semiconductor device
Ardeshiri et al. Gaussian mixture reduction using reverse Kullback-Leibler divergence
CN113051854A (en) Self-adaptive learning type power modeling method and system based on hardware structure perception
US10545262B2 (en) Method for determining a proportion cube
CN117609928B (en) Pollutant emission amount anomaly identification method and device based on power data
CN117424606B (en) Waveform data compression method, device, electronic device and storage medium
CN116152277B (en) Map segmentation method and device, electronic equipment and medium
CN115545191B (en) Current noise reduction network model training method and fault current limiter current noise reduction method
CN114861333B (en) Method, device, equipment and medium for acquiring parameters of bathtub curve of automobile reliability
CN114792115B (en) Telemetry signal outlier removing method, device and medium based on deconvolution reconstruction network
US20230401691A1 (en) Image defect detection method, electronic device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination