CN114492680B - Buoy data quality control method and device, computer equipment and storage medium - Google Patents

Publication number
CN114492680B
Authority
CN
China
Prior art keywords
observation
data
buoy
sequence
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210405927.8A
Other languages
Chinese (zh)
Other versions
CN114492680A (en
Inventor
王斌
李硕
党超群
李亚文
吴宝勤
朱先德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Ocean Technology Center
Original Assignee
National Ocean Technology Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Ocean Technology Center filed Critical National Ocean Technology Center
Priority to CN202210405927.8A priority Critical patent/CN114492680B/en
Publication of CN114492680A publication Critical patent/CN114492680A/en
Application granted granted Critical
Publication of CN114492680B publication Critical patent/CN114492680B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2433Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The application relates to a buoy data quality control method and device, computer equipment and a storage medium. The method comprises: obtaining an initial buoy observation sequence of a drifting buoy; performing position detection on the initial buoy observation sequence, acquiring first buoy observation data whose data observation position is a land position, and rejecting the first buoy observation data to obtain a target buoy observation sequence; and identifying first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence. By adding a land position check for the sea-air interface scenario, invalid observation data are eliminated from the acquired buoy observation sequence at the level of the data as a whole, ensuring that the observation data truly reflect the conditions of the different elements at the sea-air interface. Targeted quality control is then performed on the observation data of each variable type using its applicable abnormal data identification mode, improving the accuracy and reliability of the buoy observation data.

Description

Buoy data quality control method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for controlling quality of buoy data, a computer device, and a storage medium.
Background
In ocean science, buoy observation data covering key meteorological and hydrological factors of the sea-air interface are acquired by drifting buoys. These data support in-depth study of ocean dynamic processes and are important in research fields such as air-sea interaction, disaster forecasting and early warning, and marine environmental support. However, when buoy observation data are actually acquired, manual operation errors, environmental influences, unstable communication and similar factors can interfere with the observation results and cause data anomalies. As a result, most field observation data have quality problems and cannot be put into use directly; quality control must be performed to remove erroneous and abnormal values. Moreover, the observation data obtained by synchronous, comprehensive observation of the key meteorological and hydrological factors of the sea-air interface are complex, so correct observation data are often misjudged as abnormal and abnormal data are difficult to identify accurately.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus, a computer device and a storage medium for controlling the quality of buoy data, so as to accurately identify abnormal data and improve the accuracy and reliability of buoy observation data.
In a first aspect, the present application provides a method for controlling buoy data quality, including:
acquiring an initial buoy observation sequence of the drift type buoy, wherein the initial buoy observation sequence comprises observation data of different variable types;
performing position detection on the initial buoy observation sequence, acquiring first buoy observation data with the data observation position as a land position, and rejecting the first buoy observation data to obtain a target buoy observation sequence;
and identifying first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence.
In some embodiments of the present application, performing location detection on an initial buoy observation sequence, and acquiring first buoy observation data whose data observation locations are terrestrial locations, includes:
acquiring swing angle amplitude data corresponding to buoy observation data in an initial buoy observation sequence;
and if the amplitude data of the swing angle is zero, determining the buoy observation data as first buoy observation data.
In some embodiments of the present application, before identifying first abnormal data in a target buoy observation sequence according to an abnormal data identification manner corresponding to each variable type and the target buoy observation sequence, the method further includes:
and based on the observation time of the target buoy observation sequence, carrying out repeated inspection and time increment inspection on the target buoy observation sequence to obtain the target buoy observation sequence ordered according to the observation time.
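The duplicate check and time-increment check described above can be sketched as follows. This is only an illustration under stated assumptions: each record is a dictionary with a hypothetical `time` key, a duplicate is taken to mean a repeated observation time, and the function name `check_time_order` is not from the application.

```python
from datetime import datetime


def check_time_order(records):
    """Remove records with repeated observation times, then sort by time.

    Assumption: a "duplicate" is any record whose observation time has
    already been seen; the first occurrence is kept.
    """
    seen = set()
    unique = []
    for rec in records:
        # Duplicate check: keep only the first record for each time stamp.
        if rec["time"] not in seen:
            seen.add(rec["time"])
            unique.append(rec)
    # Time-increment check: once duplicates are gone, sorting yields
    # strictly increasing observation times.
    return sorted(unique, key=lambda rec: rec["time"])


records = [
    {"time": datetime(2022, 1, 1, 2), "sst": 18.2},
    {"time": datetime(2022, 1, 1, 0), "sst": 18.0},
    {"time": datetime(2022, 1, 1, 2), "sst": 18.2},  # repeated time stamp
    {"time": datetime(2022, 1, 1, 1), "sst": 18.1},
]
cleaned = check_time_order(records)
```

Keeping the first occurrence of a repeated time stamp is a design choice of this sketch; the application does not state which copy is retained.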
In some embodiments of the present application, before identifying first abnormal data in a target buoy observation sequence according to an abnormal data identification manner corresponding to each variable type and the target buoy observation sequence, the method further includes:
acquiring a data value range of each variable type;
and screening second abnormal data with values exceeding the data value range of the variable types from the observation data corresponding to the variable types in the target buoy observation sequence.
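A range check of this kind might look like the sketch below. The variable names and numeric limits in `VALID_RANGES` are hypothetical placeholders chosen for illustration; the application does not list concrete value ranges in this passage.

```python
# Hypothetical value ranges per variable; not taken from the application.
VALID_RANGES = {
    "sst": (-2.5, 40.0),             # sea surface temperature, deg C
    "air_pressure": (850.0, 1100.0), # hPa
    "wind_speed": (0.0, 75.0),       # m/s
}


def find_out_of_range(records, ranges=VALID_RANGES):
    """Return (record index, variable name) pairs whose value is out of range."""
    flagged = []
    for idx, rec in enumerate(records):
        for name, (low, high) in ranges.items():
            value = rec.get(name)
            # Missing values are skipped; only present values are tested.
            if value is not None and not (low <= value <= high):
                flagged.append((idx, name))
    return flagged


records = [
    {"sst": 18.0, "air_pressure": 1012.0, "wind_speed": 5.0},
    {"sst": 99.9, "air_pressure": 1011.5, "wind_speed": 6.0},
    {"sst": 18.1, "air_pressure": 1011.0, "wind_speed": -1.0},
]
second_abnormal = find_out_of_range(records)
```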
In some embodiments of the present application, after identifying first abnormal data in a target buoy observation sequence according to an abnormal data identification manner corresponding to each variable type and the target buoy observation sequence, the method further includes:
determining the first abnormal data and the second abnormal data as target abnormal data;
dividing the target abnormal data into continuous target abnormal data and single target abnormal data according to the observation time of the target abnormal data;
if the target abnormal data is single target abnormal data, performing interpolation processing on the target abnormal data;
and if the target abnormal data are continuous target abnormal data, rejecting the target abnormal data.
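The handling of single versus continuous target abnormal data can be sketched as below. The sketch assumes uniformly spaced observations, treats "continuous" as two or more flagged values adjacent in the sequence, uses linear interpolation between the two neighbours, and marks rejected values as `None`; all of these are assumptions of the illustration, and the function name is hypothetical.

```python
def handle_anomalies(values, abnormal):
    """Interpolate isolated anomalies and reject runs of anomalies.

    values:   observations ordered by observation time.
    abnormal: set of indices flagged as target abnormal data.
    Returns a new list; rejected values are replaced by None.
    """
    result = list(values)
    n = len(values)
    i = 0
    while i < n:
        if i not in abnormal:
            i += 1
            continue
        # Extend j to the end of this run of adjacent flagged indices.
        j = i
        while j + 1 < n and (j + 1) in abnormal:
            j += 1
        if j == i and 0 < i < n - 1:
            # Single anomaly with valid neighbours: linear interpolation,
            # which is the neighbour mean for uniform sampling.
            result[i] = (values[i - 1] + values[i + 1]) / 2.0
        else:
            # Continuous anomalies (or a single one at the sequence edge,
            # where no two-sided interpolation is possible): reject.
            for k in range(i, j + 1):
                result[k] = None
        i = j + 1
    return result


sst = [18.0, 18.2, 25.0, 18.6, 30.0, 31.0, 18.8]
repaired = handle_anomalies(sst, abnormal={2, 4, 5})
```

Here index 2 is an isolated anomaly and is interpolated from its neighbours, while indices 4 and 5 form a run and are rejected.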
In some embodiments of the present application, the variable types include position coordinate variables, hydrological observation variables, and meteorological observation variables;
identifying first abnormal data in a target buoy observation sequence according to an abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, and the method comprises the following steps of:
identifying first abnormal data of a position observation sequence corresponding to the position coordinate variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the position coordinate variable;
identifying first abnormal data of a hydrological observation sequence corresponding to the hydrological observation variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the hydrological observation variable;
and identifying first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable.
In some embodiments of the present application, identifying first abnormal data of a meteorological observation sequence corresponding to a meteorological observation variable in a target buoy observation sequence according to an abnormal data identification manner corresponding to the meteorological observation variable includes:
filtering the meteorological observation sequence to obtain primary-screened meteorological abnormal data;
determining first abnormal data corresponding to meteorological observation variables from the primary-screened meteorological abnormal data according to a difference value between the primary-screened meteorological abnormal data and reference meteorological observation data corresponding to the primary-screened meteorological abnormal data; wherein the reference meteorological observation data and the primary-screened meteorological abnormal data are meteorological observation data adjacent in observation time in the meteorological observation sequence.
In a second aspect, the present application provides a buoy data quality control device, comprising:
the buoy data acquisition module is used for acquiring an initial buoy observation sequence of the drifting buoy, and the initial buoy observation sequence comprises observation data of different variable types;
the observation data removing module is used for carrying out position detection on the initial buoy observation sequence, acquiring first buoy observation data with the data observation position as a land position, and removing the first buoy observation data to obtain a target buoy observation sequence;
and the abnormal data identification module is used for identifying first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence.
In a third aspect, the present application further provides a computer device, including: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the buoy data quality control method.
In a fourth aspect, the present application also provides a computer readable storage medium having a computer program stored thereon, the computer program being loaded by a processor to perform the steps of the method for buoy data quality control.
In a fifth aspect, embodiments of the present application provide a computer program product or a computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided by the first aspect.
The method, the device, the computer equipment and the storage medium for controlling the quality of buoy data acquire an initial buoy observation sequence of the drifting buoy, wherein the initial buoy observation sequence comprises observation data of different variable types; perform position detection on the initial buoy observation sequence, acquire first buoy observation data whose data observation position is a land position, and reject the first buoy observation data to obtain a target buoy observation sequence; and identify first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence. By adding a land position check for the sea-air interface scenario, invalid observation data are eliminated from the acquired buoy observation sequence at the level of the data as a whole, ensuring that the observation data truly reflect the conditions of the different elements at the sea-air interface. Targeted quality control is then performed on the observation data of each variable type using its applicable abnormal data identification mode to identify the abnormal data, improving the accuracy and reliability of the buoy observation data.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic view of a scenario of a quality control method for buoy data in an embodiment of the present application;
FIG. 2 is a schematic flow chart of a method for controlling buoy data quality in an embodiment of the present application;
FIG. 3 is a schematic flow chart of the step of acquiring the primary screening meteorological anomaly data in the embodiment of the present application;
FIG. 4 is a schematic flow chart diagram of another method for controlling buoy data quality in an embodiment of the present application;
FIG. 5 is a schematic diagram of the structure of a buoy data quality control device in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a computer device in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
In the description of the present application, the word "for example" is used to mean "serving as an example, instance, or illustration". Any embodiment described herein as "for example" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the invention. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and processes are not set forth in detail in order to avoid obscuring the description of the present invention with unnecessary detail. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
In the embodiment of the present application, it should be noted that the method for controlling the quality of buoy data provided in the embodiment of the present application may be applied to an abnormal data processing system as shown in fig. 1. The abnormal data processing system includes a terminal 100 and a server 200, wherein the terminal 100 may be a drifting buoy. The server 200 may be an independent server, or may be a server network or a server cluster composed of servers, which includes but is not limited to a computer, a network host, a single network server, a set of multiple network servers, or a cloud server composed of a plurality of servers. The cloud server is constituted by a large number of computers or network servers based on cloud computing.
Those skilled in the art will appreciate that the application environment shown in fig. 1 is only one application scenario related to the present application and does not limit the application scenarios of the present application. Other application environments may include more or fewer computer devices than those shown in fig. 1; for example, only one server 200 is shown in fig. 1, and it is understood that the abnormal data processing system may further include one or more other servers, which is not limited herein. In addition, as shown in fig. 1, the abnormal data processing system may further include a memory for storing data, such as buoy observation data.
It should be further noted that the scenario diagram of the abnormal data processing system shown in fig. 1 is merely an example, and the abnormal data processing system and the scenario described in the embodiment of the present invention are for more clearly explaining the technical solution of the embodiment of the present invention, and do not form a limitation on the technical solution provided in the embodiment of the present invention.
Referring to fig. 2, an embodiment of the present application provides a method for controlling buoy data quality, which is mainly illustrated by applying the method to the server 200 in fig. 1, and the method includes steps S210 to S230, which are as follows:
S210, obtaining an initial buoy observation sequence of the drifting buoy, wherein the initial buoy observation sequence comprises observation data of different variable types.
The initial Buoy observation sequence includes a plurality of Buoy observation data acquired by the drift Buoy at different observation times, for example, the initial Buoy observation sequence may be acquired by a drift Air-sea Interface Buoy (DrIB).
The observation data of different variable types refers to the data acquired by the drifting buoy for different attribute types. It is understood that, in the initial buoy observation sequence, the buoy observation data acquired at any one observation time includes observation data corresponding to a plurality of different variable types, for example observation data corresponding to a position coordinate variable, a hydrological observation variable and a meteorological observation variable. The observation data corresponding to the position coordinate variable comprise longitude data and latitude data; the observation data corresponding to the hydrological observation variable comprise sea surface temperature (SST) data; and the observation data corresponding to the meteorological observation variable comprise air temperature data, air pressure data, wind speed data, wind direction data, relative humidity data and the like.
S220, position detection is carried out on the initial buoy observation sequence, first buoy observation data with the data observation position as the land position are obtained, and the first buoy observation data are removed to obtain a target buoy observation sequence.
The first buoy observation data is the buoy observation data in the initial buoy observation sequence whose data observation position is a land position. Unlike an offshore fixed-point buoy, a drifting buoy usually starts operating and sending buoy observation data before it enters the water, for reasons such as shipboard handling, and this portion of the buoy observation data often interferes with subsequent analysis and research.
After the buoy observation sequence is obtained, position detection can be carried out on the initial buoy observation sequence, first buoy observation data with the data observation position being the land position are obtained, and the first buoy observation data are removed from the initial buoy observation sequence. Invalid observation data of the stage that the buoy does not enter the water are removed through land position inspection, and the buoy observation sequence is ensured to reflect the real marine environment.
Since the drifting buoy has a different motion attitude on land than in water, in one embodiment, performing position detection on the initial buoy observation sequence and acquiring the first buoy observation data whose data observation position is a land position comprises: acquiring swing angle amplitude data corresponding to buoy observation data in the initial buoy observation sequence; and if the swing angle amplitude data is zero, determining the buoy observation data as first buoy observation data.
The swing angle amplitude data is used for reflecting the motion attitude of the drifting buoy, and can include a roll angle and a pitch angle of the drifting buoy when acquiring buoy observation data.
After the swing angle amplitude data corresponding to each buoy observation data in the initial buoy observation sequence is obtained, it is judged, for each buoy observation data, whether the roll angle and the pitch angle in its swing angle amplitude data are zero. If the roll angle or the pitch angle is not zero, the swing angle amplitude data conforms to the motion attitude in water, and the data observation position of the buoy observation data is a non-land position. If the roll angle and the pitch angle are both zero, the swing angle amplitude data does not conform to the motion attitude in water, and the data observation position of the buoy observation data is a land position.
The working environment of the drifting buoy is judged by judging whether the swing angle amplitude data of the buoy observation data conform to the motion attitude of the drifting buoy in water or not, and the first buoy observation data with the data observation position being the land position is effectively removed.
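The land position check described above reduces to a simple partition of the records, sketched below. The record layout and the key names `roll` and `pitch` are assumptions of this illustration, as is the exact comparison with zero (real attitude sensors may call for a small tolerance instead).

```python
def split_land_records(records):
    """Partition records into land-position and in-water observations.

    A record is treated as a land observation only when both its roll
    angle and its pitch angle are zero; otherwise it is kept as an
    in-water observation.
    """
    land, sea = [], []
    for rec in records:
        if rec["roll"] == 0 and rec["pitch"] == 0:
            land.append(rec)   # first buoy observation data, to be rejected
        else:
            sea.append(rec)    # stays in the target buoy observation sequence
    return land, sea


initial_sequence = [
    {"id": 0, "roll": 0.0, "pitch": 0.0},   # on deck, before deployment
    {"id": 1, "roll": 0.0, "pitch": 0.0},
    {"id": 2, "roll": 3.1, "pitch": -1.4},  # afloat
    {"id": 3, "roll": 0.0, "pitch": 2.2},   # afloat: one angle is non-zero
]
first_buoy_data, target_sequence = split_land_records(initial_sequence)
```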
Furthermore, after the first buoy observation data whose data observation position is a land position is obtained, the observation data corresponding to the hydrological observation variable in the first buoy observation data can be examined to detect whether it shows an abrupt discontinuity relative to the observation data corresponding to the hydrological observation variable in the buoy observation data other than the first buoy observation data. If such a discontinuity exists, the first buoy observation data is confirmed as buoy observation data whose data observation position is a land position. This provides a secondary position check on the initial buoy observation sequence and improves the accuracy of identifying buoy observation data observed on land.
Furthermore, after the first buoy observation data with the data observation position being the land position is obtained, the land position observation data can be generated based on the first buoy observation data, so that the initial buoy observation data can be conveniently checked in the follow-up process.
And S230, identifying first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence.
After the buoy observation data with the data observation position as the land position is removed, identifying first abnormal data in the target buoy observation sequence, specifically, performing individual abnormal data identification on observation data of each variable type to obtain the first abnormal data.
Furthermore, the data dimensions, the carried data information and the data continuity changes of the observation data of different variable types are different, so that the abnormal data identification can be performed by different abnormal data identification modes aiming at the observation data of different variable types. Specifically, for any variable type, an observation sequence of the variable type may be obtained from a target buoy observation sequence, and then abnormal data identification processing may be performed on the observation sequence of the variable type based on an abnormal data identification manner corresponding to the variable type.
In one embodiment, as shown in FIG. 3, the variable types include a position coordinate variable, a hydrological observation variable, and a meteorological observation variable; according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, identifying first abnormal data in the target buoy observation sequence, and the method comprises the following steps:
S231, identifying first abnormal data of the position observation sequence corresponding to the position coordinate variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the position coordinate variable;
S232, identifying first abnormal data of the hydrological observation sequence corresponding to the hydrological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the hydrological observation variable;
S233, identifying first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable.
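Steps S231 to S233 amount to routing each variable type to its own identification routine. A minimal sketch of that routing is given below; the column names, the grouping in `columns`, and the stand-in jump checker are all hypothetical and serve only to show the dispatch, not the identification modes of the application.

```python
def make_jump_checker(limit):
    """Stand-in checker: flag indices whose step from the previous value
    exceeds `limit`. It only illustrates the dispatch mechanism."""
    def check(series):
        return [i for i in range(1, len(series))
                if abs(series[i] - series[i - 1]) > limit]
    return check


def identify_first_abnormal(target_sequence, checkers, columns):
    """Run the checker registered for each variable type on its columns."""
    abnormal = {}
    for var_type, names in columns.items():
        check = checkers[var_type]
        for name in names:
            series = [rec[name] for rec in target_sequence]
            abnormal[name] = check(series)
    return abnormal


# Hypothetical grouping of columns by variable type.
columns = {
    "position": ["lon", "lat"],
    "hydrological": ["sst"],
    "meteorological": ["air_temp"],
}
checkers = {
    "position": make_jump_checker(0.5),
    "hydrological": make_jump_checker(1.0),
    "meteorological": make_jump_checker(2.0),
}
target_sequence = [
    {"lon": 120.00, "lat": 35.00, "sst": 18.0, "air_temp": 16.0},
    {"lon": 120.01, "lat": 35.01, "sst": 18.1, "air_temp": 16.2},
    {"lon": 125.00, "lat": 35.02, "sst": 18.2, "air_temp": 16.1},
    {"lon": 120.03, "lat": 35.03, "sst": 18.3, "air_temp": 16.3},
]
first_abnormal = identify_first_abnormal(target_sequence, checkers, columns)
```

Registering one checker per variable type keeps each identification mode independent, so a mode can be replaced without touching the others.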
For the position coordinate variable, the position observation sequence comprises the position observation data, such as longitude and latitude data, that are related to the position coordinate variable at different observation times and are screened from the target buoy observation sequence. It will be appreciated that the position observation data in the position observation sequence may be ordered chronologically by observation time.
After the position observation sequence corresponding to the position coordinate variable is obtained, first abnormal data in the position observation sequence is identified according to the abnormal data identification mode corresponding to the position coordinate variable; for example, the first abnormal data can be screened out using a spike test. Specifically, each position observation data in the position observation sequence is taken in turn as target position observation data, and the reference position observation data corresponding to it is acquired, wherein the reference position observation data and the target position observation data are position observation data adjacent in observation time in the position observation sequence. Whether the target position observation data is a spike is judged from the target position observation data and its corresponding reference position observation data. If the target position observation data is a spike, it is determined as first abnormal data; if it is not a spike, it is normal data.
For example, the target position observation data is position observation data $x_i$, and the reference position observation data includes position observation data $x_{i-1}$ and position observation data $x_{i+1}$, where $x_{i-1}$, $x_i$ and $x_{i+1}$ represent three position observation data consecutive in observation time. Whether the position observation data $x_i$ is a spike is judged by the following formula (1):
[Formula (1): equation image not reproduced in this text]    (1)
where α is a critical value coefficient that can be set according to the specific situation; for example, when the data sampling interval of the position observation data is 1 hour, α takes 0.1.
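Because formula (1) is not reproduced in this text, the sketch below substitutes a commonly used spike-test form from operational ocean data quality control: the deviation of the middle value from the mean of its two neighbours, minus half the difference between the neighbours, is compared with the threshold. This criterion is an assumption of the illustration and may differ from formula (1); only the role of α as a threshold and its example value of 0.1 come from the text.

```python
def find_spikes(series, alpha=0.1):
    """Flag interior points that stand out from both temporal neighbours.

    Assumed criterion (not formula (1) of the application):
        |x[i] - (x[i-1] + x[i+1]) / 2| - |x[i+1] - x[i-1]| / 2 > alpha
    """
    spikes = []
    for i in range(1, len(series) - 1):
        prev_value, value, next_value = series[i - 1], series[i], series[i + 1]
        deviation = abs(value - (prev_value + next_value) / 2.0)
        # Subtracting the neighbour half-difference keeps the points next
        # to a spike from being flagged along with it.
        neighbour_half_diff = abs(next_value - prev_value) / 2.0
        if deviation - neighbour_half_diff > alpha:
            spikes.append(i)
    return spikes


longitude = [120.00, 120.02, 120.90, 120.06, 120.08]
spike_indices = find_spikes(longitude, alpha=0.1)
```

The same function would be applied separately to the longitude and latitude series under this sketch.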
For the hydrological observation variable, the hydrological observation sequence comprises the hydrological observation data, such as sea surface temperature data, that are related to the hydrological observation variable at different observation times and are screened from the target buoy observation sequence. It is to be understood that the hydrological observation data in the hydrological observation sequence may be ordered chronologically by observation time.
After the hydrological observation sequence corresponding to the hydrological observation variable is obtained, first abnormal data in the hydrological observation sequence is identified according to the abnormal data identification mode corresponding to the hydrological observation variable. Specifically, a weighted average of the hydrological observation sequence is obtained and used to perform a continuity spike check on the hydrological observation sequence.
For example, an iteratively updated weighted average is set as a "spike" detector, and a continuity check is made on the hydrological observation sequence. The weighted average may be taken as a weighted average of the hydrological observation sequence over an observation period of preset length, and it depends on the change of the hydrological observation data of the drifting buoy within that observation period. The initial value of the weighted average is taken as the first valid hydrological observation data, and the largest weight is given to the hydrological observation value adjacent in observation time to the hydrological observation data being judged. The judging process is as follows:
The hydrological observation data in the hydrological observation sequence are substituted in turn into formula (2) for testing:
[Formula (2): equation image not reproduced in this text]
where the tested quantity is the (i + 1)th hydrological observation data and Δ is a determination threshold; in one embodiment, Δ may be 0.5K, where K is an integer.
In addition, [equation images not reproduced in this text], where C represents a weight coefficient. If formula (2) is not satisfied, the hydrological observation data is abnormal and the weighted average is not updated.
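The continuity peak inspection described above can be sketched as follows. The formula images are not available in this text, so the sketch assumes that an observation passes when its absolute difference from the running weighted average is below Δ, and that the average is then updated as (1 - C) times the old average plus C times the new observation. Both relations, the function name `continuity_flags`, the default values, and the sample temperatures are assumptions for illustration only.

```python
def continuity_flags(observations, c=0.25, delta=0.5):
    """Flag observations that depart from an iteratively updated weighted average.

    Assumed test:   |x - avg| < delta  -> normal, otherwise abnormal.
    Assumed update: avg = (1 - c) * avg + c * x, applied only to normal data.
    """
    flags = [False] * len(observations)
    if not observations:
        return flags
    avg = observations[0]  # initial value: first valid observation
    for i in range(1, len(observations)):
        x = observations[i]
        if abs(x - avg) < delta:
            avg = (1 - c) * avg + c * x  # normal data updates the detector
        else:
            flags[i] = True  # abnormal data; the detector is not updated
    return flags


# Hypothetical sea surface temperatures with one spike at index 2.
sst = [20.0, 20.1, 23.0, 20.2, 20.3]
print(continuity_flags(sst))
```

Because the average is frozen whenever a value fails the test, a single spike does not contaminate the reference against which the following observations are compared.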
For the meteorological observation variable, the meteorological observation sequence comprises meteorological observation data that are screened from the target buoy observation sequence and relate to the meteorological observation variable at different observation times. It is to be understood that the meteorological observation data in the meteorological observation sequence may be arranged in order of observation time.
In one embodiment, identifying the first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable comprises: filtering the meteorological observation sequence to obtain primary-screened meteorological abnormal data; determining first abnormal data corresponding to meteorological observation variables from the primary screened meteorological abnormal data according to a difference value between the primary screened meteorological abnormal data and reference meteorological observation data corresponding to the primary screened meteorological abnormal data; and the reference meteorological observation data and the primary screened meteorological abnormal data are meteorological observation data adjacent to observation time in a meteorological observation sequence.
Specifically, the atmospheric system is active and changes rapidly, so meteorological observation data of meteorological elements such as air temperature, air pressure, wind speed and relative humidity vary markedly. Therefore, meteorological abnormal data are first preliminarily screened from the meteorological observation data by filtering in the mathematical-statistical sense, and the first abnormal data of the meteorological observation variable are then determined from the preliminarily screened data in the actual physical sense. The quality control effect is thus guaranteed from both the mathematical-statistical and the physical-process perspectives, and error values and abnormal values in the meteorological observation sequence are removed, improving the accuracy and reliability of the meteorological observation data.
To filter the meteorological observation sequence and obtain the primary-screened meteorological abnormal data, the meteorological observation sequence can be divided by a time window into short-sequence observation data of preset length; the median and the median absolute deviation of the short-sequence observation data in the time window are acquired; the value range of the observation data in the time window is determined from the median and the median absolute deviation; and the meteorological observation data in the short-sequence observation data that fall outside this value range are determined to be primary-screened meteorological abnormal data.
The length of the time window can be set according to actual conditions; for example, if the data sampling interval of the meteorological observation data in the meteorological observation sequence is 1 hour, the length of the time window may be set to 24 hours, so that the short-sequence observation data in one time window includes 24 meteorological observation data. Specifically, the time window is moved by a certain data step length; for example, it can be moved by one meteorological observation data at a time until the last meteorological observation data of the meteorological observation sequence is reached. The meteorological observation sequence is thereby divided into a plurality of short-sequence observation data, which effectively reduces the deviation caused by atmospheric environment changes in different sea areas when the atmospheric environment changes rapidly and the meteorological observation data fluctuate strongly. In addition, when the meteorological observation sequence is obtained by a drifting sea-air interface buoy, segmenting the sequence with a time window means that the continuity and regularity of the drift observation in time and space are taken into account when identifying abnormal data, which reduces the deviation caused by the dynamic change of spatial position and the differences between the atmospheric environments of different sea areas and ensures the accuracy of abnormal data identification.
The median absolute deviation is the median of the absolute differences between each meteorological observation data in the short-sequence observation data and the median of that short-sequence observation data. Specifically, for any short-sequence observation data, its median and median absolute deviation can be calculated and then used as the judgment standard to determine the value range of the observation data in the time window.
The method has the advantages that the meteorological observation data in the meteorological observation sequence are segmented by using the time window, the median and the median absolute deviation are used as the judgment standard of the abnormal observation data aiming at the segmented short sequence observation data, the influence of the extreme outlier data in the meteorological observation sequence on data filtering is reduced, the condition that the correct meteorological observation data are judged to be abnormal is avoided, meanwhile, the drifting observation time and the consideration of the continuity and regularity of the spatial change are added, the deviation caused by the dynamic change of the spatial position or the difference of the atmospheric environments of different sea areas is effectively reduced, and the accuracy of identifying the abnormal data is improved.
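The windowed median and median-absolute-deviation filtering described above can be sketched as follows. The window of 24 observations, the step of one observation, and the threshold of 3 scale units follow the text; the factor 1.4826, which converts the MAD to a standard-deviation-like scale, is the conventional Hampel filter choice and is an assumption here, as is the decision to flag a point if it is out of range in any window that contains it. The function name `hampel_prescreen` and the sample data are hypothetical.

```python
from statistics import median


def hampel_prescreen(values, window=24, n_scale=3.0, k=1.4826):
    """Return indices whose distance from a window median exceeds n_scale * k * MAD."""
    flagged = set()
    n = len(values)
    if n == 0:
        return flagged
    window = min(window, n)
    # Slide the window one observation at a time over the whole sequence.
    for start in range(n - window + 1):
        segment = values[start:start + window]
        med = median(segment)
        mad = median([abs(v - med) for v in segment])
        scale = k * mad  # assumed conversion of the MAD to a robust scale
        for offset, v in enumerate(segment):
            if abs(v - med) > n_scale * scale:
                flagged.add(start + offset)
    return flagged


# Hypothetical hourly air temperatures with one outlier at index 12.
temps = [15.0 + 0.1 * (i % 5) for i in range(30)]
temps[12] = 25.0
print(hampel_prescreen(temps))
```

When a window is nearly constant the MAD approaches zero and even small deviations are flagged, which is why a secondary check on the pre-screened data is useful.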
The abnormal observation data obtained by filtering the meteorological observation data is abnormal data determined from the mathematical statistics significance level, when the atmospheric environment is relatively stable, the fluctuation degree of the meteorological observation data is not large, and secondary abnormal inspection can be carried out on the primary screened meteorological abnormal data in order to reduce misjudgment caused by filtering. Specifically, local anomaly detection can be performed on the primary screened meteorological abnormal data, and first anomaly data of meteorological observation variables can be determined from the primary screened meteorological abnormal data according to a difference value between the primary screened meteorological abnormal data and adjacent reference meteorological observation data.
The reference meteorological observation data is meteorological observation data adjacent in observation time to the primary-screened meteorological abnormal data; for example, if the primary-screened meteorological abnormal data is the meteorological observation data at time t, the reference meteorological observation data includes, but is not limited to, the meteorological observation data at time (t + 1) and/or at time (t - 1). Specifically, to determine the first abnormal data of the meteorological observation variable from the primary-screened meteorological abnormal data according to this difference value, each primary-screened meteorological abnormal data can be taken in turn as target observation data, and a difference operation is performed between the target observation data and its reference meteorological observation data to obtain a difference value; if the difference value is greater than a preset difference threshold, the target observation data is determined to be first abnormal data of the meteorological observation variable. The preset difference threshold can be set according to actual conditions; further, to prevent the observation error of the meteorological detection equipment itself from interfering with the judgment, the preset difference threshold should be greater than the observation error of the meteorological detection equipment.
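A minimal sketch of this secondary check is given below. The text allows the reference data to be the observation at (t + 1) and/or (t - 1); the sketch assumes that a pre-screened point is confirmed as abnormal only when its difference from every available neighbour exceeds the threshold, which is one possible reading. The function name `confirm_anomalies`, the threshold value, and the sample pressures are hypothetical.

```python
def confirm_anomalies(values, prescreened, diff_threshold):
    """Split pre-screened indices into confirmed anomalies and released points."""
    confirmed, released = set(), set()
    for i in sorted(prescreened):
        # Reference data: observations adjacent in observation time.
        neighbours = [values[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]
        if neighbours and all(abs(values[i] - nb) > diff_threshold for nb in neighbours):
            confirmed.add(i)  # differs strongly from every neighbour
        else:
            released.add(i)  # close to a neighbour: treated as misjudged
    return confirmed, released


# Hypothetical air pressures (hPa); indices 2 and 4 were pre-screened.
pressure = [1012.0, 1012.2, 1018.0, 1012.3, 1012.5, 1012.6]
print(confirm_anomalies(pressure, {2, 4}, diff_threshold=1.0))
```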
The buoy data quality control method obtains an initial buoy observation sequence of the drifting buoy, the initial buoy observation sequence comprising observation data of different variable types; performs position detection on the initial buoy observation sequence, acquires first buoy observation data whose data observation position is a land position, and rejects the first buoy observation data to obtain a target buoy observation sequence; and identifies first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence. By adding a land position check for the sea-air interface scene, invalid observation data are eliminated from the acquired buoy observation sequence at the level of the data as a whole, so that the observation data truly reflect the conditions of the different elements in the sea-air interface scene; targeted quality control is then performed on the observation data of the different variable types with the abnormal data identification mode applicable to each, so as to identify the abnormal data and improve the accuracy and reliability of the buoy observation data.
Before the first abnormal data in the target buoy observation sequence is identified according to the abnormal data identification mode and the target buoy observation sequence corresponding to each variable type, the target buoy observation sequence can be preprocessed, so that each buoy observation data in the target buoy observation sequence is arranged according to observation time, the continuity judgment of the observation data is facilitated subsequently, and the abnormal data obviously deviating from the overall or local change trend is identified. In one embodiment, before identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, the method further includes: and based on the observation time of the target buoy observation sequence, carrying out repeated inspection and time increment inspection on the target buoy observation sequence to obtain the target buoy observation sequence ordered according to the observation time.
To prevent time sequence errors, a time increasing inspection can be performed on the buoy observation data in the target buoy observation sequence to check whether the observation time (year, month, day, hour, minute, second and the like) is always monotonically increasing, so as to ensure that the data time sequence is normal. If buoy observation data with a disordered observation time sequence is detected, that part of the buoy observation data can be deleted, or the time sequence can be adjusted, so that the buoy observation data are stored in order of observation time and the target buoy observation sequence ordered by observation time is obtained.
In the operation process of the drift type buoy, if the communication transmission process is unstable or the data storage module fails, the problem that two or more pieces of observation data are stored at the same observation time may occur, which may cause data repetition errors. Therefore, repeated inspection can be carried out on the buoy observation data in the target buoy observation sequence, repeated buoy observation data are eliminated through repeated inspection, and one-to-one correspondence between observation time and observation results is guaranteed.
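The repeated inspection and the time increasing inspection can be sketched together as below. The sketch assumes that each record is a (time, observation) pair with a comparable time value, keeps the first record for each observation time, and adjusts the order by sorting rather than deleting; the function name `dedupe_and_order` and the sample records are hypothetical.

```python
def dedupe_and_order(records):
    """Remove records with a repeated observation time, then order by time.

    Returns (ordered, duplicates, out_of_order) so that removed or moved
    records can be kept for later review.
    """
    seen = set()
    unique, duplicates = [], []
    for time, obs in records:
        if time in seen:
            duplicates.append((time, obs))  # same observation time stored twice
        else:
            seen.add(time)
            unique.append((time, obs))
    # Records whose time does not increase relative to the previous record.
    out_of_order = [
        unique[i] for i in range(1, len(unique)) if unique[i][0] <= unique[i - 1][0]
    ]
    ordered = sorted(unique, key=lambda record: record[0])
    return ordered, duplicates, out_of_order


# Hypothetical records: hour index and sea surface temperature.
records = [(1, 20.1), (2, 20.2), (2, 20.2), (4, 20.4), (3, 20.3)]
print(dedupe_and_order(records))
```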
In addition, in consideration of observation indexes of instrument design, buoy observation data collected by the drift buoy is often within a certain measurement range, and therefore, in an embodiment, before identifying first abnormal data in a target buoy observation sequence according to an abnormal data identification manner corresponding to each variable type and the target buoy observation sequence, the method further includes: acquiring a data value range of each variable type; and screening second abnormal data with values exceeding the data value range of the variable types from the observation data corresponding to the variable types in the target buoy observation sequence.
The data value range of any variable type can be set according to the measurement capability of the drifting buoy on the variable type, for example, for the sea surface temperature in the hydrologic observation variable, the measurement range of the drifting buoy on the sea surface temperature is [ -2, 35], and then the data value range of the sea surface temperature in the hydrologic observation variable is [ -2, 35 ]; the normal variation range of the variable type may also be set, for example, the normal variation range is [0, 100] for the relative humidity in the meteorological observation variable, and the data value range for the relative humidity in the meteorological observation variable is set to [0, 100 ].
Specifically, by performing range check on the buoy observation data in the target buoy observation sequence, if the observation data corresponding to each variable type in the target buoy observation sequence is not in the corresponding data value range, the part of observation data can be marked as abnormal data, and then the part of abnormal data can be deleted or adjusted. For example, the measurement range of the drift type buoy for the air temperature observation data in the weather observation variable is [ -40, 60], that is, the data value range of the air temperature observation data in the weather observation variable is [ -40, 60], and if the air temperature value is recorded as 70 degrees in the buoy observation data collected at an observation time, the air temperature value in the buoy observation data can be marked as abnormal data and deleted.
By identifying abnormal data which exceed the observation capability of the drifting type buoy or the normal value range of different variable types, the accuracy and the reliability of buoy observation data are improved.
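The range check can be sketched as a simple lookup of the data value range per variable type. The three ranges below are the example values given in the text; the dictionary keys, the function name `range_check`, and the sample values are hypothetical.

```python
# Example data value ranges taken from the text; the key names are illustrative.
VALID_RANGES = {
    "sea_surface_temperature": (-2.0, 35.0),
    "relative_humidity": (0.0, 100.0),
    "air_temperature": (-40.0, 60.0),
}


def range_check(variable, values, ranges=VALID_RANGES):
    """Return the indices of observations outside the variable's value range."""
    low, high = ranges[variable]
    return [i for i, v in enumerate(values) if not (low <= v <= high)]


# An air temperature of 70 degrees is outside [-40, 60] and is marked abnormal.
print(range_check("air_temperature", [25.0, 70.0, -41.0, 60.0]))
```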
In one embodiment, after identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode and the target buoy observation sequence corresponding to each variable type, the method further includes: determining the first abnormal data and the second abnormal data as target abnormal data; dividing the target abnormal data into continuous target abnormal data and single target abnormal data according to the observation time of the target abnormal data; if the target abnormal data is single target abnormal data, performing interpolation processing on the target abnormal data; and if the target abnormal data are continuous target abnormal data, rejecting the target abnormal data.
Continuous target abnormal data refers to a plurality of observation data whose observation times are consecutive in the time sequence information, and single target abnormal data refers to one or two observation data in the time sequence information.
Specifically, in order to keep the continuity of the meteorological observation data as much as possible without destroying the statistical characteristics and the variation trend of the data, for single type target abnormal data, linear interpolation can be performed based on the meteorological observation data adjacent to the meteorological observation data in the previous and subsequent observation times, and the target abnormal data can be replaced by the result obtained by the linear interpolation; and the continuous target abnormal data is deleted.
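The treatment of target abnormal data can be sketched as follows. The sketch assumes evenly spaced observations, treats a run of one or two consecutive abnormal indices as single target abnormal data, and marks deleted data as `None`; runs that touch the start or end of the sequence cannot be interpolated and are also marked `None`, which is an assumption. The function name `repair_anomalies` and the sample data are hypothetical.

```python
def repair_anomalies(values, anomalous, max_run=2):
    """Interpolate short runs of abnormal data linearly; delete longer runs."""
    repaired = list(values)
    indices = sorted(anomalous)
    k = 0
    while k < len(indices):
        start = end = indices[k]
        # Extend the run while the next abnormal index is consecutive.
        while k + 1 < len(indices) and indices[k + 1] == end + 1:
            k += 1
            end = indices[k]
        k += 1
        left, right = start - 1, end + 1
        short_run = (end - start + 1) <= max_run
        if short_run and left >= 0 and right < len(values):
            # Linear interpolation between the adjacent normal observations.
            step = (values[right] - values[left]) / (right - left)
            for j in range(start, end + 1):
                repaired[j] = values[left] + step * (j - left)
        else:
            for j in range(start, end + 1):
                repaired[j] = None  # continuous abnormal data are rejected
    return repaired


data = [10.0, 99.0, 12.0, 13.0, 50.0, 60.0, 70.0, 17.0]
print(repair_anomalies(data, {1, 4, 5, 6}))
```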
In the following, the buoy data quality control method is further described for the sea-air interface scene, taking DrIB as an example.
DrIB is mainly used for disposable drifting observation and records the buoy operating condition in real time. Through DrIB, high-frequency observation data of key meteorological and hydrological observation variables at the sea-air interface can be acquired over a large range, on a grid, and at high spatio-temporal resolution in the global sea area, providing observation data support for in-depth exploration of ocean dynamic processes. DrIB is of great significance in research fields such as air-sea interaction, disaster forecasting and early warning, and marine environment guarantee.
In the actual process of collecting buoy observation data, manual operation errors, influences of the offshore environment, unstable communication transmission and the like can interfere with observation results to cause data abnormity, and field observation data mostly have quality problems and cannot be directly put into application. Therefore, the quality of the acquired buoy observation data needs to be controlled first, and the error values and abnormal values hidden in the time series are removed, so that the accuracy and reliability of the observation data are improved. Specifically, as shown in fig. 4, after an initial buoy observation sequence is acquired through DrIB, the initial buoy observation sequence is preprocessed, where the preprocessing includes, but is not limited to, the following data-based quality control steps:
(1) Land position check. Unlike marine fixed-point observation, DrIB may, for reasons such as being carried aboard a ship, start to operate before entering the water and send land observation information, which directly causes abnormal observation data for hydrological observation variables such as sea surface temperature data. Therefore, position detection is first performed on the initial buoy observation sequence, first buoy observation data whose data observation position is a land position is acquired, and the first buoy observation data is removed to obtain the target buoy observation sequence.
Specifically, the working environment can be accurately judged by judging whether the swing angle amplitude data corresponding to each buoy observation data in the initial buoy observation sequence is in accordance with the underwater motion attitude, the sea surface temperature and the stable working period numerical value, the land observation information is removed, and meanwhile, an independent land observation information file is generated, so that the later inspection and checking are facilitated.
(2) Repeated inspection. During DrIB operation, if communication transmission is unstable or the data storage module fails, two or more pieces of observation data may be stored for the same time, resulting in data repetition errors. Therefore, based on the observation time of the target buoy observation sequence, repeated inspection is carried out on the target buoy observation sequence, repeated buoy observation data are eliminated, and a one-to-one relation between observation time and observation result is ensured.
Furthermore, a repeated-data file can be generated from the repeated buoy observation data to facilitate subsequent review.
(3) Time increasing inspection. Under normal conditions, the DrIB observation data are arranged in time order. To prevent "time reversal", time increasing inspection is carried out on the target buoy observation sequence based on its observation time, to obtain the target buoy observation sequence ordered by observation time.
Specifically, the time sequence information (year, month, day, hour, minute, second and the like) is checked for whether it is always monotonically increasing, so as to ensure that the time sequence information of the observation data is normal. If observation data with disordered time sequence information is detected, it is deleted or, if necessary, the time sequence is adjusted, and the deletion and adjustment actions related to the time sequence are recorded in a log in real time.
(4) Range check. The range check uses basic geographical knowledge, the general laws of marine meteorological elements, and the observation capability of the buoy to judge whether the observation data are reasonable; invalid observation data are marked as abnormal data. For example, the sea surface temperature generally varies from -4 ℃ to 44 ℃, and the relative humidity does not exceed 100%. The DrIB platform design observation indexes are shown in Table 1.
Specifically, the data value range of each variable type can be obtained, and second abnormal data whose values exceed the data value range of the variable type are screened from the observation data corresponding to each variable type in the target buoy observation sequence. The second abnormal data beyond the data value ranges are recorded as a set W, in which the second abnormal data corresponding to the position coordinate variable form the set Wg, those corresponding to the hydrological observation variable form the set Ws, and those corresponding to the meteorological observation variable form the set Wq.
Table 1. Design observation indexes of the drifting sea-air interface buoy
[Table 1: table image not reproduced in this text]
After the basic quality control of the data is completed, obvious errors in the buoy observation sequence are eliminated.
DrIB drifts with the waves and currents, and this working characteristic determines the continuity and gradual variation of the data sequence. Targeted quality control is therefore performed on the buoy observation data in the target buoy observation sequence: the observation data are divided into position coordinate variables, hydrological observation variables and meteorological observation variables, a corresponding data continuity criterion is set for each, and problem data that violate the continuous trend of the sequence are found. The specific steps are as follows:
(1) a position coordinate variable. The position observation data corresponding to the position coordinate variable includes a longitude and a latitude.
For abnormal jumps (peaks) on the DrIB drift track, a peak detection method is used to judge whether the position observation data at a given observation time is normal, by formula (3):
[Formula (3): equation image not reproduced in this text]
where the three position observation data in the formula are consecutive in observation time, the middle one being the data under test; α is a critical value coefficient that can be set according to the specific situation, for example 0.1 when the data sampling interval is 1 h. If formula (3) is satisfied, the position observation data under test is abnormal.
The set of first abnormal data of the position coordinate variable screened out by the peak detection method is recorded as the set G.
(2) Hydrological observation variable. The hydrological observation data corresponding to the hydrological observation variable includes sea surface temperature data.
For the hydrological observation sequence, the ideas of the peak test and the continuity test are combined: an iteratively updated weighted average is set as a "spike" detector for the hydrological observation data, and a continuity check is made on the hydrological observation sequence.
Specifically, the weighted average depends on the sea surface temperature change of the drifting buoy within a preset observation period, and the highest weight is given to the observation value at the moment adjacent to the data under test. The determination process is as follows:
The first valid sea surface temperature data is recorded as the initial value of the weighted average, and the sea surface temperature data in the hydrological observation sequence are substituted in turn into formula (4) for testing:
[Formula (4): equation image not reproduced in this text]
where the tested quantity is the (i + 1)th sea surface temperature data and Δ is a judgment threshold; specifically, Δ may be 0.5K, where K is an integer.
In addition, [equation images not reproduced in this text], where C represents a weight coefficient. If formula (4) is not satisfied, the sea surface temperature data is abnormal and the weighted average is not updated.
For sea surface temperature data of the tropical Pacific, C takes the value 1/4 and Δ takes the value 0.5K, where K is an integer.
Compared with the relatively balanced and stable environment of tropical sea areas, the dynamic processes of mid-latitude sea areas are more complex and the nonlinear, unstable air-sea interaction is strong, so the hydrological observation sequence depends more on the real change of the sea surface temperature data within the observation period. To improve applicability, when the weighted average used as the peak detector is calculated for DrIB sea surface temperature data in mid-latitude sea areas, the contribution weight of the sea surface temperature data at the current observation time can be increased; for example, the parameter C takes the value 1/3 and the threshold takes the value 0.5 ℃.
In order to reduce the misjudgment probability when the drifting buoy passes through a strong frontal surface or the interval of observation time is large as much as possible, forward and backward two-way inspection is carried out on the hydrological observation sequence, and a set of first abnormal data which are considered to be abnormal in two directions is taken and recorded as a set S.
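The forward and backward two-way inspection can be sketched as below: the one-way check is run on the sequence and on its reverse, and only data flagged in both directions are kept. As the formula images are not available in this text, the one-way test and update rule are the same assumed relations (absolute difference from the running average below Δ, and an exponentially weighted update with coefficient C); the function names and sample data are hypothetical.

```python
def one_way_flags(sst, c=0.25, delta=0.5):
    """Indices failing the assumed test |x - avg| < delta in one direction."""
    flagged = set()
    if not sst:
        return flagged
    avg = sst[0]
    for i in range(1, len(sst)):
        if abs(sst[i] - avg) < delta:
            avg = (1 - c) * avg + c * sst[i]  # assumed update rule
        else:
            flagged.add(i)
    return flagged


def two_way_flags(sst, c=0.25, delta=0.5):
    """Keep only the indices flagged by both the forward and backward passes."""
    n = len(sst)
    forward = one_way_flags(sst, c, delta)
    # Map indices of the reversed sequence back to the original positions.
    backward = {n - 1 - i for i in one_way_flags(sst[::-1], c, delta)}
    return forward & backward


spike = [20.0, 20.1, 23.0, 20.2, 20.3]  # isolated spike
front = [20.0, 20.1, 21.0, 21.1, 21.2]  # persistent step, e.g. a frontal crossing
print(two_way_flags(spike), two_way_flags(front))
```

In this sketch the isolated spike is flagged in both directions, whereas the persistent step is flagged on different sides by the two passes, so the intersection is empty and the data after the step are not rejected.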
(3) Meteorological observation variable. The meteorological observation data corresponding to the meteorological observation variable at least comprise air temperature data, air pressure data, wind speed data and relative humidity data.
The atmospheric system is more active and changes rapidly, and the development rule characteristics of key meteorological elements such as air temperature, air pressure, wind speed, relative humidity and the like are more obvious. When the peak inspection method is applied to detecting meteorological observation data, the situation that a large amount of correct data is judged to be abnormal due to the fact that the difference between the meteorological observation data and the judgment standard data is larger and larger is found, and the method cannot be directly applied.
Statistically, meteorological variables observed continuously at a fixed point tend to follow a certain probability distribution, so the data can be examined with statistical test methods such as the Pauta criterion (3σ criterion), under which data whose residual from the arithmetic mean exceeds three times the standard deviation are regarded as abnormal values. However, when the arithmetic mean and the standard deviation are used as the judgment standard, extreme outlier data easily bias the result; in addition, the spatial position of DrIB changes dynamically and the atmospheric environments of different sea areas differ greatly, so studying the statistical characteristics of a long observation sequence is of little significance. Therefore, based on the continuity and regularity of the drift observation in time and space, the meteorological observation data can be filtered with a Hampel filter applied per time window to judge how far the meteorological observation data deviate from the daily variability, giving the preliminarily screened meteorological abnormal data and avoiding interference from extreme values in the meteorological observation sequence when identifying abnormal data.
Specifically, assume that the meteorological observation sequence has a data sampling interval of 1 h and that the time window is set to one day. One window of consecutive meteorological observation data then forms one group of short-sequence observation data, the next window forms the next group, and so on, giving a plurality of groups of short-sequence observation data.
Taking one group of short-sequence observation data as an example, the median and the median absolute deviation (MAD) of the group are calculated, and a scale for the meteorological observation data is determined from the MAD [equation images for the median, the MAD, and the scale are not reproduced in this text]. The value range of the observation data in the time window is determined from the median and this scale: if the distance between a meteorological observation data and the median exceeds 3 times the scale, the meteorological observation data is abnormal observation data.
Filtering the meteorological observation data in this way yields the preliminarily screened meteorological abnormal data, which are recorded as the set Q1.
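The window-by-window screening described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `hampel_screen`, its parameters, and the conversion factor `k = 1.4826` (the usual choice for normally distributed data) are assumptions introduced here.

```python
from statistics import median


def hampel_screen(values, window=24, k=1.4826, n_sigma=3.0):
    """Return indices of observations flagged by a block-wise Hampel test.

    values  : observations in time order (hourly sampling assumed)
    window  : samples per time window (24 = one day at a 1 h interval)
    k       : assumed MAD-to-sigma conversion factor
    n_sigma : half-width of the accepted range, in units of the scaled MAD
    """
    flagged = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        m = median(block)
        mad = median([abs(x - m) for x in block])
        sigma = k * mad
        for offset, x in enumerate(block):
            # Outside [m - n_sigma * sigma, m + n_sigma * sigma] -> candidate anomaly
            if abs(x - m) > n_sigma * sigma:
                flagged.append(start + offset)
    return flagged


# One day of hourly data with a single spike at index 7
day = [10.0 + 0.1 * (i % 5) for i in range(24)]
day[7] = 25.0
candidates = hampel_screen(day)
```

When a window is almost constant, the MAD collapses to zero and even a tiny fluctuation is flagged, which is why the candidates are treated only as a preliminary screening result.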
It can be understood that the preliminarily screened meteorological abnormal data Q1 are abnormal only in the sense of statistical significance. If the atmospheric environment is relatively stable, the observed parameters fluctuate evenly and the converted MAD value within the window is small, so correct data are easily misjudged as abnormal.
To reduce this risk of misjudgment, local anomaly detection is further introduced. For any meteorological observation $q_s$ in the set Q1, if the differences between $q_s$ and the observations at its adjacent observation times do not exceed a preset difference threshold, $q_s$ is meteorological observation data misjudged as abnormal; if the difference between $q_s$ and the observation at either adjacent observation time exceeds the preset difference threshold, $q_s$ is abnormal meteorological observation data. The set of meteorological observation data misjudged as abnormal is denoted Q2. To prevent instrument observation error from interfering with the judgment, the preset difference threshold is set to a value larger than the design observation error of the buoy.
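The local difference check can be sketched as below. The function name and arguments are hypothetical, and the rule "confirmed if the difference to either neighbour exceeds the threshold" is one reading of the text above, adopted here as an assumption.

```python
def split_by_local_difference(values, candidates, threshold):
    """Split preliminary candidates (Q1) into confirmed anomalies and false alarms (Q2).

    values     : observations in time order
    candidates : indices flagged by the preliminary screening
    threshold  : preset difference threshold, chosen larger than the
                 design observation error of the instrument
    """
    confirmed, false_alarms = [], []
    for i in candidates:
        # Neighbours at the adjacent observation times, where they exist
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(values)]
        exceeds = any(abs(values[i] - values[j]) > threshold for j in neighbours)
        if exceeds:
            confirmed.append(i)
        else:
            false_alarms.append(i)
    return confirmed, false_alarms


series = [5.0, 5.0, 5.1, 5.0, 9.0, 5.0]
confirmed, q2 = split_by_local_difference(series, candidates=[2, 4], threshold=0.5)
```

In this example the small wiggle at index 2 is returned as a false alarm, while the jump at index 4 is confirmed as abnormal.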
With the above buoy data quality control method, the abnormal data detected for the position coordinate variables comprise the data of the set Wg and the data of the set G; the abnormal data for the hydrological observation variables comprise the data of the set Ws and the data of the set S; and the abnormal data for the meteorological observation variables comprise all data of the set Wq together with the data of the set Q1 that remain after the data of the set Q2 are removed. To preserve the continuity of the observation data as far as possible without damaging the statistical characteristics and variation trend of the data, an isolated abnormal value, or two consecutive abnormal values, is replaced by the result of linear interpolation between the observations at the adjacent earlier and later observation times. Abnormal wind direction data and runs of more than two consecutive abnormal values of a variable are deleted.
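The replacement rule can be illustrated with a short sketch. The function `repair_anomalies` and its `max_run` parameter are hypothetical names; observation times are assumed to be plain numbers (for example hours), and runs that touch the start or end of the sequence are simply dropped because no neighbour exists on one side.

```python
def repair_anomalies(times, values, anomalous, max_run=2):
    """Interpolate short runs of abnormal values and delete longer runs.

    times, values : parallel lists in time order
    anomalous     : set of indices flagged as abnormal
    max_run       : longest run replaced by linear interpolation
    Returns new (times, values) lists.
    """
    n = len(values)
    out_t, out_v = [], []
    i = 0
    while i < n:
        if i not in anomalous:
            out_t.append(times[i])
            out_v.append(values[i])
            i += 1
            continue
        j = i
        while j < n and j in anomalous:
            j += 1  # the run covers indices i .. j-1
        if (j - i) <= max_run and i > 0 and j < n:
            t0, v0 = times[i - 1], values[i - 1]
            t1, v1 = times[j], values[j]
            for k in range(i, j):
                frac = (times[k] - t0) / (t1 - t0)
                out_t.append(times[k])
                out_v.append(v0 + frac * (v1 - v0))
        # longer runs (and runs at the sequence edges) are deleted
        i = j
    return out_t, out_v


hours = list(range(8))
obs = [1.0, 2.0, 99.0, 4.0, 50.0, 50.0, 50.0, 8.0]
new_t, new_v = repair_anomalies(hours, obs, anomalous={2, 4, 5, 6})
```

Passing `max_run=0` deletes every flagged value, which is one way to handle wind direction, where linear interpolation across the 0°/360° wrap would be misleading.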
To better implement the buoy data quality control method provided in the embodiments of the present application, an embodiment of the present application further provides, on the basis of that method, a buoy data quality control device. As shown in fig. 5, the buoy data quality control device 500 includes:
a buoy data obtaining module 510, configured to obtain an initial buoy observation sequence of the drifting buoy, where the initial buoy observation sequence includes observation data of different variable types;
an observation data removing module 520, configured to perform position detection on the initial buoy observation sequence, obtain first buoy observation data with a data observation position being a land position, and remove the first buoy observation data to obtain a target buoy observation sequence;
the abnormal data identification module 530 is configured to identify first abnormal data in the target buoy observation sequence according to the abnormal data identification manner corresponding to each variable type and the target buoy observation sequence.
In some embodiments of the present application, the observation data removing module is specifically configured to obtain swing angle amplitude data corresponding to buoy observation data in an initial buoy observation sequence; and if the amplitude data of the swing angle is zero, determining the buoy observation data as first buoy observation data.
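A minimal sketch of this land-position check is given below. The record layout (a list of dictionaries) and the field name `swing_amplitude` are assumptions made for illustration; the text only states that a zero swing angle amplitude marks the record as first buoy observation data.

```python
def remove_land_records(records, amplitude_key="swing_amplitude"):
    """Separate records into sea observations and land (first buoy) observations.

    A buoy resting on land does not rock with the waves, so a swing angle
    amplitude of exactly zero is taken as the land indicator.
    """
    sea, land = [], []
    for rec in records:
        if rec[amplitude_key] == 0:
            land.append(rec)
        else:
            sea.append(rec)
    return sea, land


sample = [
    {"id": 1, "swing_amplitude": 0.0},
    {"id": 2, "swing_amplitude": 3.5},
]
target_sequence, first_buoy_data = remove_land_records(sample)
```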
In some embodiments of the present application, the abnormal data identification module is further specifically configured to perform a repeat check and a time increment check on the target buoy observation sequence based on the observation time of the target buoy observation sequence, so as to obtain the target buoy observation sequence ordered according to the observation time.
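The duplicate check and time increment check can be sketched as follows, assuming each record is a dictionary with a hypothetical `time` field and that the first record seen for a given observation time is the one to keep.

```python
def order_and_deduplicate(records, time_key="time"):
    """Drop records with a repeated observation time, then sort by time."""
    seen = set()
    unique = []
    for rec in records:
        t = rec[time_key]
        if t in seen:
            continue  # duplicate check: keep only the first record per time
        seen.add(t)
        unique.append(rec)
    unique.sort(key=lambda rec: rec[time_key])
    return unique


def is_strictly_increasing(records, time_key="time"):
    """Time increment check: every time must be later than the previous one."""
    return all(a[time_key] < b[time_key] for a, b in zip(records, records[1:]))


raw = [
    {"time": 3, "v": 1},
    {"time": 1, "v": 2},
    {"time": 3, "v": 9},
    {"time": 2, "v": 3},
]
ordered = order_and_deduplicate(raw)
```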
In some embodiments of the present application, the abnormal data identification module is specifically further configured to obtain a data value range of each variable type; and respectively screening second abnormal data with values exceeding the data value range of the variable types from the observation data corresponding to each variable type in the target buoy observation sequence.
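A range check of this kind might look like the sketch below. The variable name `sst` and the limits used in the example are placeholders chosen for illustration, not values taken from the patent.

```python
def range_check(records, limits):
    """Return {variable: [indices whose value lies outside the allowed range]}.

    records : list of dictionaries, one per observation time
    limits  : {variable: (low, high)} data value range per variable type
    Missing or None values are skipped rather than flagged.
    """
    out = {name: [] for name in limits}
    for i, rec in enumerate(records):
        for name, (low, high) in limits.items():
            value = rec.get(name)
            if value is not None and not (low <= value <= high):
                out[name].append(i)
    return out


# Placeholder limits for a hypothetical sea surface temperature variable
example_limits = {"sst": (-2.0, 35.0)}
example_records = [{"sst": 20.0}, {"sst": 99.9}, {"sst": None}, {}]
second_abnormal = range_check(example_records, example_limits)
```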
In some embodiments of the present application, the buoy data quality control device further includes an abnormal data processing module, configured to determine the first abnormal data and the second abnormal data as target abnormal data; dividing the target abnormal data into continuous target abnormal data and single target abnormal data according to the observation time of the target abnormal data; if the target abnormal data is single target abnormal data, performing interpolation processing on the target abnormal data; and if the target abnormal data are continuous target abnormal data, rejecting the target abnormal data.
In some embodiments of the present application, the variable types include position coordinate variables, hydrological observation variables, and meteorological observation variables; the abnormal data identification module is specifically used for identifying first abnormal data of a position observation sequence corresponding to the position coordinate variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the position coordinate variable; identifying first abnormal data of a hydrological observation sequence corresponding to the hydrological observation variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the hydrological observation variable; and identifying first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable.
In some embodiments of the present application, the abnormal data identification module is further specifically configured to filter the meteorological observation sequence to obtain preliminarily screened meteorological abnormal data, and to determine first abnormal data corresponding to the meteorological observation variables from the preliminarily screened meteorological abnormal data according to the difference between the preliminarily screened meteorological abnormal data and the reference meteorological observation data corresponding to them; the reference meteorological observation data and the preliminarily screened meteorological abnormal data are meteorological observation data with adjacent observation times in the meteorological observation sequence.
The buoy data quality control device acquires an initial buoy observation sequence of the drifting buoy, the initial buoy observation sequence including observation data of different variable types; performs position detection on the initial buoy observation sequence, acquires first buoy observation data whose data observation position is a land position, and removes the first buoy observation data to obtain a target buoy observation sequence; and identifies first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence. By adding a land position check tailored to the sea-air interface scene, invalid observation data are removed from the acquired buoy observation sequence at the level of the whole data set, ensuring that the observation data truly reflect the conditions of the different elements at the sea-air interface. Observation data of different variable types are then subjected to targeted quality control using the abnormal data identification mode suited to each, so that abnormal data are identified and the accuracy and reliability of the buoy observation data are improved.
In some embodiments of the present application, the buoy data quality control device 500 may be implemented in the form of a computer program that is executable on a computer device such as that shown in fig. 6. The memory of the computer device may store the program modules constituting the buoy data quality control device 500, such as the buoy data acquisition module 510, the observation data removing module 520, and the abnormal data identification module 530 shown in fig. 5. The computer program constituted by these program modules causes the processor to execute the steps of the buoy data quality control method of the embodiments of the present application described in this specification.
For example, the computer device shown in fig. 6 may execute step S210 through the buoy data acquisition module 510 in the buoy data quality control device 500 shown in fig. 5. The computer device may perform step S220 through the observation data removing module 520, and step S230 through the abnormal data identification module 530. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external computer device through a network connection. The computer program, when executed by the processor, implements the buoy data quality control method.
It will be appreciated by those skilled in the art that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In some embodiments of the present application, a computer device is provided that includes one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to perform the steps of the above-described buoy data quality control method. Here, the steps of the buoy data quality control method may be the steps of the buoy data quality control method in any of the above embodiments.
In some embodiments of the present application, a computer-readable storage medium is provided, in which a computer program is stored; the computer program is loaded by a processor so that the processor performs the steps of the above-described buoy data quality control method. Here, the steps of the buoy data quality control method may be the steps of the buoy data quality control method in any of the above embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The method, the apparatus, the computer device and the storage medium for controlling the quality of the buoy data provided in the embodiments of the present application are described in detail above, and specific examples are applied herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A method for controlling quality of buoy data, comprising:
acquiring an initial buoy observation sequence of a drift type buoy, wherein the initial buoy observation sequence comprises observation data of different variable types;
performing position detection on the initial buoy observation sequence, acquiring first buoy observation data with a data observation position as a land position, and rejecting the first buoy observation data to obtain a target buoy observation sequence;
identifying first abnormal data in the target buoy observation sequence according to an abnormal data identification mode corresponding to each variable type and the target buoy observation sequence;
the performing position detection on the initial buoy observation sequence to obtain first buoy observation data with a data observation position as a land position includes:
acquiring swing angle amplitude data corresponding to the buoy observation data in the initial buoy observation sequence;
if the amplitude data of the swing angle is zero, determining the buoy observation data as first buoy observation data;
after first buoy observation data with a data observation position as a land position are obtained, detecting observation data corresponding to hydrological observation variables in the first buoy observation data, detecting whether fault differences exist between the observation data and observation data corresponding to hydrological observation variables in the buoy observation data except the first buoy observation data, and if the fault differences exist, determining the first buoy observation data as the buoy observation data with the data observation position as the land position, so that secondary position detection is carried out on an initial buoy observation sequence.
2. The method of claim 1, wherein before identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, the method further comprises:
and based on the observation time of the target buoy observation sequence, carrying out repeated inspection and time increment inspection on the target buoy observation sequence to obtain the target buoy observation sequence ordered according to the observation time.
3. The method of claim 1, wherein before identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, the method further comprises:
acquiring the data value range of each variable type;
and screening second abnormal data with values exceeding the data value range of the variable types from the observation data corresponding to each variable type in the target buoy observation sequence.
4. The method of claim 3, wherein after identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence, the method further comprises:
determining the first abnormal data and the second abnormal data as target abnormal data;
dividing the target abnormal data into continuous target abnormal data and single target abnormal data according to the observation time of the target abnormal data;
if the target abnormal data is single target abnormal data, performing interpolation processing on the target abnormal data;
and if the target abnormal data are continuous target abnormal data, rejecting the target abnormal data.
5. The method of claim 1, wherein the variable types include a location coordinate variable, a hydrological observation variable, and a meteorological observation variable;
the identifying the first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence comprises the following steps:
identifying first abnormal data of a position observation sequence corresponding to the position coordinate variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the position coordinate variable;
identifying first abnormal data of the hydrological observation sequence corresponding to the hydrological observation variable in the target buoy observation sequence according to an abnormal data identification mode corresponding to the hydrological observation variable;
and identifying first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable.
6. The method of claim 5, wherein the identifying the first abnormal data of the meteorological observation sequence corresponding to the meteorological observation variable in the target buoy observation sequence according to the abnormal data identification mode corresponding to the meteorological observation variable comprises:
filtering the meteorological observation sequence to obtain primary-screened meteorological abnormal data;
determining first abnormal data corresponding to meteorological observation variables from the primary screened meteorological abnormal data according to a difference value between the primary screened meteorological abnormal data and reference meteorological observation data corresponding to the primary screened meteorological abnormal data; and the reference meteorological observation data and the preliminary screening meteorological abnormal data are meteorological observation data adjacent to observation time in the meteorological observation sequence.
7. The method of claim 6, wherein the filtering the meteorological observation sequence to obtain preliminary screening meteorological abnormality data comprises:
dividing the meteorological observation sequence into short sequence observation data with preset length by using a time window;
acquiring a median and a median absolute deviation corresponding to short sequence observation data in a time window;
determining the value range of the observation data in the time window according to the median and the median absolute deviation;
and determining the meteorological observation data beyond the value range of the observation data in the short-sequence observation data as primary screening meteorological abnormal data.
8. A buoy data quality control device, the device comprising:
the buoy data acquisition module is used for acquiring an initial buoy observation sequence of the drifting buoy, and the initial buoy observation sequence comprises observation data of different variable types;
the observation data removing module is used for carrying out position detection on the initial buoy observation sequence, acquiring first buoy observation data with the data observation position as a land position, and removing the first buoy observation data to obtain a target buoy observation sequence;
the abnormal data identification module is used for identifying first abnormal data in the target buoy observation sequence according to the abnormal data identification mode corresponding to each variable type and the target buoy observation sequence;
the observation data eliminating module is used for acquiring swing angle amplitude data corresponding to the buoy observation data in the initial buoy observation sequence; if the amplitude data of the swing angle is zero, determining the buoy observation data as first buoy observation data; after acquiring first buoy observation data with a data observation position being a land position, detecting observation data corresponding to a hydrological observation variable in the first buoy observation data, detecting whether fault difference exists between the observation data and the observation data corresponding to the hydrological observation variable in the buoy observation data except the first buoy observation data, and if yes, determining the first buoy observation data as the buoy observation data with the data observation position being the land position, so as to realize secondary position detection on an initial buoy observation sequence.
9. A computer device, characterized in that the computer device comprises:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the processor to implement the buoy data quality control method of any one of claims 1 to 7.
10. A computer-readable storage medium, having stored thereon a computer program which is loaded by a processor for performing the steps of the method of buoy data quality control as claimed in any one of claims 1 to 7.
CN202210405927.8A 2022-04-18 2022-04-18 Buoy data quality control method and device, computer equipment and storage medium Active CN114492680B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210405927.8A CN114492680B (en) 2022-04-18 2022-04-18 Buoy data quality control method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210405927.8A CN114492680B (en) 2022-04-18 2022-04-18 Buoy data quality control method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114492680A CN114492680A (en) 2022-05-13
CN114492680B true CN114492680B (en) 2022-07-22

Family

ID=81489292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210405927.8A Active CN114492680B (en) 2022-04-18 2022-04-18 Buoy data quality control method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114492680B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115630878B (en) * 2022-12-21 2023-08-22 国家卫星海洋应用中心 Quality control method and quality control device for buoy observation data
CN116304491B (en) * 2023-05-11 2023-08-08 长江三峡集团实业发展(北京)有限公司 Assimilation method and system for marine anomaly observation data
CN117408581B (en) * 2023-12-15 2024-03-26 青岛海洋科技中心 Method, system, computer and storage medium for controlling data quality of submerged buoy

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106945787A (en) * 2017-05-05 2017-07-14 国家海洋技术中心 One kind jettisons formula Air-sea heat fluxes buoy
CN109783846A (en) * 2018-12-06 2019-05-21 国家海洋局第一海洋研究所 Tidal level evaluation of uncertainty in measurement method based on GNSS oceanographic buoy
CN110081963A (en) * 2019-03-14 2019-08-02 哈尔滨工程大学 A kind of motor driven detects sonobuoy with vibration shape vector

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8195395B2 (en) * 2009-09-06 2012-06-05 The United States Of America As Represented By The Secretary Of Commerce System for monitoring, determining, and reporting directional spectra of ocean surface waves in near real-time from a moored buoy
CN107966242B (en) * 2017-11-22 2018-12-14 国家海洋局第一海洋研究所 One-touch baroceptor field calibration system and method suitable for deep ocean buoy
CN109033037A (en) * 2018-07-26 2018-12-18 厦门大学 Buoy automatic monitoring system data quality control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant