CN115994168A - Method and device for checking data fluctuation, electronic equipment and storage medium - Google Patents

Info

Publication number
CN115994168A
Authority
CN
China
Prior art keywords
data
boundary threshold
fluctuation
checking
interval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111210569.7A
Other languages
Chinese (zh)
Inventor
邓娟
龙克树
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guizhou Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guizhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Guizhou Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202111210569.7A priority Critical patent/CN115994168A/en
Publication of CN115994168A publication Critical patent/CN115994168A/en

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method, a device, equipment and a storage medium for checking data volatility, wherein the method comprises the following steps: generating a historical data acquisition script according to the data inspection parameters and the first script template; acquiring first historical data meeting data inspection parameters through a historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold; generating a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and the second script template; and checking whether the fluctuation of the real-time data meeting the data checking parameters is normal or not through the data checking script. According to the technical scheme provided by the embodiment of the invention, the data fluctuation interval for detecting the data fluctuation can be automatically determined, so that the boundary threshold value of the data fluctuation interval is not dependent on manual setting and is updated rapidly along with the change of data, and the accuracy of data fluctuation detection is further improved.

Description

Method and device for checking data fluctuation, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of data traffic, and in particular, to a method and apparatus for checking data volatility, an electronic device, and a storage medium.
Background
Data volatility refers to the fluctuation that data exhibits over a period of time, either for business reasons or because of data anomalies. With the rapid development of electronic technology and the sharp growth in business data volume, the need to check data volatility keeps increasing.
When the inspection threshold for data volatility is configured manually, the approach depends on the development experience of the configuration personnel and is difficult to update frequently, so the accuracy of the volatility check is low.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for checking data volatility, electronic equipment and a storage medium, so as to solve the problem of how to improve the accuracy of data volatility checking.
In order to solve the technical problems, the embodiment of the invention is realized as follows:
in a first aspect, an embodiment of the present invention provides a method for checking data volatility, including:
generating a historical data acquisition script according to data inspection parameters input by a user and a first script template;
acquiring first historical data meeting the data inspection parameters through the historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
generating a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and a second script template;
and checking whether the fluctuation of the real-time data meeting the data checking parameters is normal or not through the data checking script.
In a second aspect, an embodiment of the present invention provides an apparatus for checking data volatility, the apparatus comprising:
the first script generation module is used for generating a historical data acquisition script according to data inspection parameters input by a user and a first script template;
the boundary threshold generating module is used for obtaining first historical data meeting the data inspection parameters through the historical data acquisition script and generating a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
the second script generation module is used for generating a data inspection script according to the first interval boundary threshold value, the second interval boundary threshold value, the data inspection parameters and the second script template;
and the volatility checking module is used for checking whether the volatility of the real-time data meeting the data checking parameters is normal or not through the data checking script.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, the memory storing computer executable instructions that, when executed by the processor, implement the method for checking data volatility according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a storage medium having stored therein computer executable instructions which, when executed by a processor, implement the method for checking for data volatility according to the first aspect.
According to the technical scheme of the embodiment of the invention, firstly, a historical data acquisition script is generated according to data inspection parameters input by a user and a first script template; secondly, first historical data meeting the data inspection parameters is obtained through the historical data acquisition script, and a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval are generated according to the first historical data, the first interval boundary threshold being smaller than the second interval boundary threshold; then, a data inspection script is generated according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and a second script template; and finally, whether the fluctuation of the real-time data meeting the data inspection parameters is normal is checked through the data inspection script. The technical scheme provided by the embodiment of the invention can automatically determine the data fluctuation interval used for checking data fluctuation, so that the boundary thresholds of the interval no longer depend on manual setting and are updated quickly as the data changes, which further improves the accuracy of data fluctuation checking.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for inspecting data volatility according to one or more embodiments of the present invention;
FIG. 2 is a flow chart illustrating a method for obtaining a boundary threshold of a data fluctuation interval according to one or more embodiments of the present invention;
FIG. 3a is a schematic diagram of first historical data without stationarity prior to differential processing provided by one or more embodiments of the present invention;
FIG. 3b is a schematic diagram of first historical data with stationarity after differential processing according to one or more embodiments of the present invention;
FIG. 4 is a schematic diagram illustrating a data flow of an apparatus for inspecting data volatility according to one or more embodiments of the present invention;
FIG. 5 is a schematic block diagram of an apparatus for inspecting data volatility provided by one or more embodiments of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to one or more embodiments of the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, shall fall within the scope of the invention.
Data volatility refers to the fluctuation that data exhibits over a period of time, either for business reasons or because of data anomalies. The device for checking data volatility may be an inspection device designed to ensure that data fluctuation stays within a predetermined reasonable range and to perform the inspection automatically when the data fluctuates abnormally. The method for checking data volatility may be performed on such a device and may be used to check whether the fluctuation of data is normal.
For example, the electronic device needs to check the data volatility of the number of 5G (5th Generation Mobile Communication Technology) users, so as to ensure that the daily fluctuation of the number of 5G users stays within a reasonable range; when the fluctuation value exceeds the predetermined range, an early warning is needed. The configured information can be shown in the following table.
Table 1 shows a parameter configuration table of a data volatility check rule adopted in a volatility detection method of a 5G user number. The parameter configuration table contains service data to be checked, a checking period, an upper limit of a fluctuation range and a lower limit of the fluctuation range.
Service data | Inspection period | Upper limit of fluctuation range | Lower limit of fluctuation range
5G user number | Day | 10% | -10%
TABLE 1
On the one hand, when the upper limit and the lower limit of the fluctuation range are set manually, the range depends on the development experience of the configuration personnel. Because the periodic change of the data is neither observed nor measured dynamically, abnormal fluctuations may be reported falsely or missed.
On the other hand, when the data volatility checking rule is configured from the upper limit and the lower limit of the fluctuation range, the configuration personnel have to customize each individual rule one by one, which wastes labor.
To overcome the above problems, the present invention provides a plurality of embodiments as shown below.
Fig. 1 is a flow chart of a method for checking data volatility according to one or more embodiments of the present invention.
Referring to fig. 1, the method for checking the fluctuation of data includes steps S102, S104, S106, and S108. The method of checking for data volatility in the example embodiment of fig. 1 is described in detail below.
Step S102, generating a historical data acquisition script according to data inspection parameters input by a user and a first script template.
The data inspection parameters may be screening conditions used to select data from a designated storage space. For example, the storage space contains X databases, each database contains Y tables, each table contains Z fields, and the value of each field may change over time. Specified data can then be selected according to the database identifier, the table name, the field name and the data inspection period, for example the values taken within one day by the field "5G user number" of table 2 in the database identified as 001.
The first script template may be a pre-built SQL (Structured Query Language) template. By filling the data inspection parameters input by the user into the SQL template, an SQL statement that the electronic device can recognize and execute is generated, and this SQL statement can be used to collect historical data conforming to the data inspection parameters.
For example, when the user inputs the target table name "table 2", the target field name "5G user number" and the data checking period "day", the electronic device generates an SQL statement for collecting the field value of the field "5G user number" of table 2 within one day according to the target table name "table 2", the target field name "5G user number" and the data checking period "day" and the pre-constructed SQL template.
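As a minimal sketch of step S102, the following Python fragment fills a first script template with the data inspection parameters. The template text, the time column `stat_time`, the identifiers `table_2` and `user_5g_count`, the dates, and the helper `build_history_sql` are all hypothetical; the source does not disclose the contents of its SQL template.

```python
from string import Template

# Hypothetical first script template; placeholder names are assumptions.
HISTORY_TEMPLATE = Template(
    "SELECT $time_col, $field FROM $table "
    "WHERE $time_col >= '$start' AND $time_col < '$end'"
)


def _check_identifier(name: str) -> str:
    """Table and field names cannot be bound as SQL parameters, so validate them."""
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_history_sql(table, field, start, end, time_col="stat_time"):
    """Fill the template; start and end are trusted timestamp strings in this sketch."""
    return HISTORY_TEMPLATE.substitute(
        table=_check_identifier(table),
        field=_check_identifier(field),
        time_col=_check_identifier(time_col),
        start=start,
        end=end,
    )


sql = build_history_sql(
    "table_2", "user_5g_count", "2021-10-17 00:00:00", "2021-10-18 00:00:00"
)
print(sql)
```

Validating identifiers before substitution is a design choice of this sketch, since a template filled from user selections would otherwise be open to SQL injection.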
In particular implementations, the historical data may correspond to a first period and the real-time data may correspond to a second period. The first period is located before the second period. The historical data may be all data collected in the first period and the real-time data may be data collected at a current point in time that is within the second period, the real-time data being obviously not all data collected in the second period.
Historical data and real-time data are described here by way of a simple example. Suppose the data inspection parameters include a target database identifier "001", a target table name "table 2", a target field name "5G user number", and a data inspection period "day". According to the data inspection period "day", the span from 00:00 yesterday to 00:00 today may be taken as the first period, and the span from 00:00 today to 00:00 tomorrow as the second period. The real-time data collected at 10:00 today may be the value, at 10:00 today, of the field "5G user number" of table 2 in the database identified as 001, while the historical data may be all values of that field from 00:00 yesterday to 00:00 today.
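The first and second periods for a "day" inspection period can be derived from the current time, as in the sketch below. The function name `day_periods` and the sample timestamp are illustrative assumptions, not part of the source.

```python
from datetime import datetime, timedelta


def day_periods(now: datetime):
    """Return ((first_start, first_end), (second_start, second_end)) for a 'day' period.

    The first period runs from 00:00 yesterday to 00:00 today, and the
    second period runs from 00:00 today to 00:00 tomorrow.
    """
    today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    first = (today_start - timedelta(days=1), today_start)
    second = (today_start, today_start + timedelta(days=1))
    return first, second


# Real-time data collected at 10:00 falls inside the second period.
now = datetime(2021, 10, 18, 10, 0)
first, second = day_periods(now)
print(first, second, second[0] <= now < second[1])
```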
Optionally, the data checking parameter includes a target table name, a target field name, and a target data checking period; before step S102 is performed, the checking method of data volatility further includes: reading pre-stored table structure information; the table structure information comprises at least one table name and at least one field name corresponding to each table name; generating a configuration page of the data checking parameters according to the table structure information; receiving a first selection operation submitted for a target table name, a second selection operation submitted for a target field name, a third selection operation submitted for a data check period and a configuration confirmation operation at a configuration page; and determining a target table name, a target field name and a target data checking period which are input by a user according to the first selection operation, the second selection operation, the third selection operation and the configuration confirmation operation.
The pre-stored table structure information may include at least one table name and at least one field name corresponding to each table name. The table structure information may be retrieved from a designated storage space and stored centrally at a designated location.
In a specific implementation, the electronic device may construct, based on the table structure information, a visual interface for selecting a table name, in which the user may select a target table name from the at least one table name. After the user selects the target table name, the at least one field name corresponding to the target table name may be displayed in the visual interface, and the user may select the target field name from among them. After the user selects the target field name, a plurality of preset data inspection periods may be displayed in the visual interface, and the user selects the target data inspection period from among them. After selecting the target data inspection period, the user may perform a configuration confirmation operation, such as clicking a "confirm" control located at the bottom of the visual interface. The electronic device may determine the target table name, the target field name and the target data inspection period input by the user according to the foregoing user operations.
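The cascading selection described above can be sketched without any user interface as a lookup over the table structure information followed by a validation at confirmation time. The structure `TABLE_STRUCTURE`, the preset periods in `CHECK_PERIODS`, and both function names are hypothetical.

```python
# Hypothetical pre-stored table structure information: table name -> field names.
TABLE_STRUCTURE = {
    "table_1": ["order_count", "ticket_count"],
    "table_2": ["user_5g_count", "user_4g_count"],
}
# Assumed preset data inspection periods; the source does not enumerate them.
CHECK_PERIODS = ("hour", "day", "week")


def field_options(table, structure=TABLE_STRUCTURE):
    """Field names to display once a target table name has been selected."""
    return list(structure[table])


def confirm_selection(table, field, period, structure=TABLE_STRUCTURE):
    """Validate the three selections and return the data inspection parameters."""
    if table not in structure:
        raise ValueError(f"unknown table: {table!r}")
    if field not in structure[table]:
        raise ValueError(f"field {field!r} does not belong to {table!r}")
    if period not in CHECK_PERIODS:
        raise ValueError(f"unsupported period: {period!r}")
    return {"table": table, "field": field, "period": period}


params = confirm_selection("table_2", "user_5g_count", "day")
print(field_options("table_2"), params)
```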
Step S104, obtaining first historical data meeting data inspection parameters through a historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold.
The electronic device can run SQL sentences, screen data in the storage space and obtain first historical data meeting data inspection parameters. The data in the storage space may be historical business data, such as historical order data of an operator, historical ticket data, and the like.
The first historical data may be obtained as follows. For example, the storage space contains X databases storing historical order data, each database contains Y tables, each table contains Z fields, and the value of each field may change over time. Specified historical order data can then be selected from the stored historical order data according to the database identifier, the table name, the field name and the data inspection period, for example the values taken within one day by the field "5G user number" of table 2 in the database identified as 001.
The first interval boundary threshold of the data fluctuation interval may be the minimum value of the data fluctuation interval, and the second interval boundary threshold may be its maximum value. The minimum value and the maximum value together uniquely determine the data fluctuation interval, which is used to check whether the fluctuation of the data is normal. Normal fluctuation means that the data lies within the data fluctuation interval, and abnormal fluctuation means that the data lies outside it.
Optionally, generating a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval according to the first historical data includes: preprocessing and stabilizing the first historical data to obtain second historical data; according to the second historical data and the time sequence prediction model, obtaining a plurality of prediction data corresponding to the second historical data; and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data.
This can be illustrated in connection with fig. 2. Fig. 2 is a flowchart illustrating a process for obtaining a boundary threshold of a data fluctuation interval according to one or more embodiments of the present invention.
As shown in fig. 2, in step S202, the preprocessed first historical data is input.
Step S204, judging whether the preprocessed first historical data has stationarity.
If yes, go to step S208; if not, step S206 is performed.
Step S206, differential processing.
The electronic device performs differential processing on the first historical data that does not have stationarity, and returns to step S204 after the differential processing.
Step S208, a plurality of prediction data are acquired.
A plurality of prediction data corresponding to the second historical data is obtained according to the second historical data and the time series prediction model.
Step S210, calculating a first interval boundary threshold and a second interval boundary threshold.
The first interval boundary threshold and the second interval boundary threshold of the data fluctuation interval are calculated according to the plurality of prediction data corresponding to the second historical data.
Optionally, preprocessing and stabilizing the first historical data to obtain second historical data, including: preprocessing the first historical data; judging whether the preprocessed first historical data has stationarity or not by a unit root checking method; if yes, determining the preprocessed first historical data as second historical data; if not, carrying out differential processing on the preprocessed first historical data until the first historical data after differential processing is determined to have stationarity, and determining the first historical data with stationarity after differential processing as second historical data.
Preprocessing includes, but is not limited to, complementing missing values, identifying or deleting outliers, solving inconsistencies caused by data dimensions, units and the like, completing format standardization, abnormal data clearing, error correction and repeated data clearing.
Sequence stationarity is a precondition for time series analysis. Stationarity requires that a fitted curve obtained from a sample time series can continue along its existing shape over a future period of time, that is, the mean, the variance and the autocorrelation coefficients of the sequence do not change significantly over time.
The electronic device may perform a time series analysis on the preprocessed first historical data. The unit root test of a time series is used to judge whether the series is stationary and includes three test methods: DF (Dickey and Fuller, 1979), ADF (Augmented Dickey-Fuller, 1981) and PP (Phillips and Perron, 1988). Specifically, the electronic device may determine whether the preprocessed first historical data has stationarity by using the DF unit root test method, which may be implemented with a third-party open source tool.
And if the judgment result is that the preprocessed first historical data has stationarity, determining the preprocessed first historical data as second historical data.
If the judgment result is that the preprocessed first historical data does not have stationarity, differential processing is performed on the preprocessed first historical data. It is then judged whether the first historical data after differential processing has stationarity. If so, the first historical data after differential processing is determined as the second historical data; if not, a second differential processing is performed on it, and so on, until the first historical data after differential processing has stationarity and is determined as the second historical data. Usually, applying differential processing once or twice to non-stationary data is enough to obtain stationary data.
The first historical data needs stabilization for the following reason: in time series calculation, non-stationary data can cause large errors in the model's output, so that the model may be unusable even if its measured accuracy exceeds 98%. For example, given a batch of products, the model may tell you that the products are flawless even though defective products are actually found among them, and the likely cause is that the model's input data is non-stationary. Stationarity of the data is a fundamental requirement that a time series model places on its input data.
According to the embodiment of the invention, the non-stationary first historical data is stabilized by differential processing so that the data is as stationary as possible, which improves the accuracy of the model. The differencing principle is to subtract data linearly at equal period intervals, that is, to subtract the value at the preceding time point from the value at the following time point; the formula is $y_t - y_{t-1}$. For the preprocessed first historical data before and after differential processing, reference may be made to fig. 3a and 3b.
FIG. 3a is a schematic diagram of first historical data without stationarity prior to differential processing provided by one or more embodiments of the present invention; FIG. 3b is a schematic diagram of first historical data with stationarity after differential processing according to one or more embodiments of the present invention.
As shown in fig. 3a, the abscissa represents time t, and the ordinate represents the number of 5G users corresponding to time t in the first historical data, denoted by x(t). As shown in fig. 3b, the abscissa represents time t, and the ordinate represents the differenced value of the number of 5G users corresponding to time t in the first historical data, denoted by diff[x(t)]. When diff[x(t)] stays around 0, the differenced first historical data has stationarity.
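The loop of steps S204 and S206 can be sketched as follows. The stationarity check `looks_stationary` is a crude stand-in that compares the means of the two halves of the series; it is not a DF, ADF or PP unit root test, and a real implementation would plug an actual unit root test into the `is_stationary` argument. All function names, the tolerance and the cap of two differencing rounds are assumptions for illustration.

```python
from statistics import mean


def difference(series):
    """First-order difference: y_t - y_(t-1) for each pair of adjacent points."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def looks_stationary(series, tolerance=0.25):
    """Crude placeholder check (NOT a unit root test): the means of the two
    halves must differ by at most `tolerance` of the overall value range."""
    if len(series) < 2:
        return True
    half = len(series) // 2
    spread = (max(series) - min(series)) or 1.0
    return abs(mean(series[:half]) - mean(series[half:])) / spread <= tolerance


def stabilize(series, is_stationary=looks_stationary, max_diffs=2):
    """Difference the series until the check passes.

    Returns (second_historical_data, number_of_differencing_rounds).
    """
    current, rounds = list(series), 0
    while not is_stationary(current):
        if rounds == max_diffs:
            raise ValueError("series is still non-stationary after differencing")
        current, rounds = difference(current), rounds + 1
    return current, rounds


trend = [10, 12, 14, 16, 18, 20, 22, 24]  # steadily growing user count
second_history, rounds = stabilize(trend)
print(second_history, rounds)
```

In this example the linear trend fails the check, and one round of differencing turns it into a constant series that passes.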
A plurality of prediction data corresponding to the second historical data is obtained according to the second historical data and the time series prediction model. The time series prediction model may be a differential autoregressive moving average (ARIMA) model. The model formula is as follows:
[Formula (1): shown as image BDA0003308698390000091 in the original publication]
In formula (1), $y_t$ is the current value, $\mu$ is a constant term, $\gamma_i$ is an autocorrelation coefficient, $\varepsilon_t$ is the error, $p$ is the autoregressive order, and $q$ is the moving-average order. The summation symbol over $i$ denotes the sum of the terms over all values of $i$. In practical applications, the values of $p$ and $q$ may each be 1.
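For orientation, an autoregressive moving average model applied to the differenced series is conventionally written in the form below. This is the textbook ARMA(p, q) equation, offered as an assumption about the general shape of formula (1) and not as a transcription of it; in particular, the moving-average coefficients $\theta_i$ are a symbol introduced here that the text does not define.

```latex
y_t = \mu + \sum_{i=1}^{p} \gamma_i \, y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \, \varepsilon_{t-i}
% With p = q = 1 this reduces to:
% y_t = \mu + \gamma_1 \, y_{t-1} + \varepsilon_t + \theta_1 \, \varepsilon_{t-1}
```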
Optionally, according to a plurality of prediction data corresponding to the second historical data, a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval are calculated, including: calculating to obtain the average absolute error of the predicted data according to a plurality of predicted data corresponding to the second historical data; determining a highest value and a lowest value of the plurality of predicted data; and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to the average absolute error, the highest value and the lowest value.
In particular, the plurality of prediction data obtained may be fitted using the mean square error (MSE). Meanwhile, the mean absolute error (MAE) of the prediction data is taken as the fluctuation range limit of the prediction data. Let the variable to be predicted be $y = (y_1, y_2, \ldots, y_n)$ and the predicted value be $\hat{y} = (\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n)$.
MSE and MAE are defined as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{2}$$
The application of formula (2) is illustrated as follows: if the true value of the variable to be predicted is $y = (1, 1, 2, 1)$ and the current predicted value is $\hat{y} = (0.6, 0.6, 1.6, 0.6)$, then:
$$\mathrm{MSE} = \tfrac{1}{4}\left[(1-0.6)^2 + (1-0.6)^2 + (2-1.6)^2 + (1-0.6)^2\right] = 0.16$$
$$\mathrm{MAE} = \tfrac{1}{4}\left[|1-0.6| + |1-0.6| + |2-1.6| + |1-0.6|\right] = 0.4$$
The fluctuation range of each sample field for the next period is calculated by summing the predicted values estimated by the model and the mean absolute error obtained when the model was trained, which yields the upper and lower thresholds within which the data volume of the field should lie in the next period. That is, upper limit of the fluctuation range = 1.6 + 0.4 = 2, and lower limit of the fluctuation range = 0.6 + 0.4 = 1.
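The worked example can be reproduced with a few lines of Python. The sketch mirrors the example as stated in the text, where the mean absolute error is added to both the lowest and the highest predicted value; the function names are illustrative assumptions.

```python
def mse(actual, predicted):
    """Mean square error between true values and predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error between true values and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def interval_thresholds(actual, predicted):
    """Return (first_threshold, second_threshold).

    Mirrors the worked example in the text: the MAE is added to the lowest
    and to the highest predicted value.
    """
    error = mae(actual, predicted)
    return min(predicted) + error, max(predicted) + error


y = [1, 1, 2, 1]
y_hat = [0.6, 0.6, 1.6, 0.6]
first_threshold, second_threshold = interval_thresholds(y, y_hat)
print(round(mse(y, y_hat), 6), round(mae(y, y_hat), 6))
print(round(first_threshold, 6), round(second_threshold, 6))
```

Rounding is applied only when printing, because binary floating point does not represent values such as 0.4 exactly.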
The embodiment of the invention can automatically generate the fluctuation range threshold value, automatically adjust the range threshold value according to the historical fluctuation condition of the data, cover the historical periodic variation trend of the data volume, finish the correct measurement of the fluctuation of the data volume by a machine learning method and improve the completeness and recognition capability of the fluctuation rule inspection.
Step S106, generating a data checking script according to the first interval boundary threshold, the second interval boundary threshold, the data checking parameters and the second script template.
The second script template may be a pre-built SQL (Structured Query Language) template. By filling the first interval boundary threshold, the second interval boundary threshold and the data inspection parameters into the SQL template, an SQL statement that the electronic device can recognize and execute is generated. This SQL statement can be used to collect real-time data conforming to the data inspection parameters and to compute the deviation of the real-time data from the first interval boundary threshold and from the second interval boundary threshold respectively.
Step S108, checking whether the fluctuation of the real-time data meeting the data inspection parameters is normal through the data inspection script.
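A data inspection script of this kind might be generated as in the sketch below, which fills the two thresholds and the inspection parameters into a second template. The template text, the column aliases, the time column `stat_time`, the sample identifiers and the helper `build_check_sql` are hypothetical, since the source does not disclose the template.

```python
from string import Template

# Hypothetical second script template; placeholder and alias names are assumptions.
CHECK_TEMPLATE = Template(
    "SELECT $field AS current_value, "
    "$field - $low AS first_difference, "
    "$field - $high AS second_difference "
    "FROM $table WHERE $time_col = '$now'"
)


def build_check_sql(table, field, low, high, now, time_col="stat_time"):
    """Fill the template after validating identifiers and threshold order."""
    for name in (table, field, time_col):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    low, high = float(low), float(high)
    if not low < high:
        raise ValueError("first threshold must be less than second threshold")
    return CHECK_TEMPLATE.substitute(
        table=table, field=field, time_col=time_col, low=low, high=high, now=now
    )


check_sql = build_check_sql(
    "table_2", "user_5g_count", 1, 2, "2021-10-18 10:00:00"
)
print(check_sql)
```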
When the real-time data is positioned in the data fluctuation interval, the fluctuation of the real-time data is normal; when the real-time data is located outside the data fluctuation interval, the fluctuation of the real-time data is abnormal.
Optionally, checking whether the volatility of the real-time data satisfying the data checking parameters is normal by the data checking script includes: running a data checking script to obtain a difference value between the real-time data and a first interval boundary threshold value as a first difference value, and obtaining a difference value between the real-time data and a second interval boundary threshold value as a second difference value; if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, determining that the fluctuation of the real-time data is normal; if the first difference value is smaller than zero and the second difference value is smaller than zero, determining that the fluctuation of the real-time data is abnormal; if the first difference value is larger than zero and the second difference value is larger than zero, determining that the fluctuation of the real-time data is abnormal.
By running the data checking script, real-time data can be acquired, and the difference value between the real-time data and the first interval boundary threshold value is obtained and used as a first difference value, and the difference value between the real-time data and the second interval boundary threshold value is obtained and used as a second difference value. The electronic device can judge whether the fluctuation of the real-time data is normal or not according to the comparison result of the first difference value and zero and the comparison result of the second difference value and zero.
Specifically:
if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, the real-time data is greater than or equal to the minimum value of the data fluctuation interval and less than or equal to the maximum value of the data fluctuation interval, namely the real-time data is positioned in the data fluctuation interval, so that the fluctuation of the real-time data is determined to be normal;
if the first difference value is smaller than zero and the second difference value is smaller than zero, the real-time data is smaller than the minimum value of the data fluctuation interval and is not in the data fluctuation interval, so that abnormal fluctuation of the real-time data is determined;
if the first difference value is larger than zero and the second difference value is larger than zero, the real-time data is larger than the maximum value of the data fluctuation interval and is not in the data fluctuation interval, and therefore abnormal fluctuation of the real-time data is determined.
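The three cases above can be sketched as a small function. This is a minimal illustration only; the function name classify_fluctuation and the returned labels are hypothetical, and the embodiment itself only distinguishes normal from abnormal.

```python
def classify_fluctuation(value, lower, upper):
    """Classify a real-time value against the data fluctuation interval [lower, upper]."""
    first_difference = value - lower    # difference from the first boundary threshold
    second_difference = value - upper   # difference from the second boundary threshold
    if first_difference >= 0 and second_difference <= 0:
        return "normal"
    if first_difference < 0 and second_difference < 0:
        return "abnormal_below_interval"
    if first_difference > 0 and second_difference > 0:
        return "abnormal_above_interval"
    # Only reachable when lower > upper, which the method rules out.
    raise ValueError("first boundary threshold must be less than the second")


print(classify_fluctuation(1100.0, 950.0, 1250.0))
```

Because the first interval boundary threshold is smaller than the second, the three branches cover every possible value, so the final error is reached only for invalid thresholds.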
Optionally, after the step S108 is performed, the method for checking data volatility further includes: if the fluctuation of the real-time data is determined to be normal, determining that the auditing result of the real-time data is normal; if the fluctuation of the real-time data is abnormal, determining that the auditing result of the real-time data is abnormal; and generating an alarm notification aiming at the real-time data with abnormal auditing results.
If the fluctuation of the real-time data is determined to be normal, the auditing result of the fluctuation of the real-time data is determined to be normal, that is, the fluctuation of the real-time data passes the audit; if the fluctuation of the real-time data is abnormal, the auditing result of the fluctuation of the real-time data is determined to be abnormal, that is, the fluctuation of the real-time data does not pass the audit. When the fluctuation of the real-time data does not pass the audit, an alarm notification is generated for the real-time data with the abnormal auditing result, and the alarm notification can carry information such as the table name, the field name and the data volume corresponding to the real-time data. The alarm notification can take the form of a short message, an email, a message (Kafka, Redis, ActiveMQ, etc.), etc.
According to the checking method of data volatility in the example embodiment of fig. 1, first, a historical data acquisition script is generated according to the data checking parameters input by a user and a first script template; secondly, first historical data meeting the data checking parameters is obtained through the historical data acquisition script, and a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval are generated according to the first historical data, wherein the first interval boundary threshold is smaller than the second interval boundary threshold; then, a data checking script is generated according to the first interval boundary threshold, the second interval boundary threshold, the data checking parameters and the second script template; and finally, whether the fluctuation of the real-time data meeting the data checking parameters is normal is checked through the data checking script. According to the technical scheme provided by the embodiment of the invention, the data fluctuation interval for detecting the data fluctuation can be automatically determined, so that the boundary thresholds of the data fluctuation interval do not depend on manual setting and are updated rapidly as the data changes, and the accuracy of data fluctuation detection is further improved.
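A possible shape for the alarm notification is sketched below as a JSON message. The field layout and the function name build_alarm_notification are assumptions for illustration; the embodiment only states that the notification can carry the table name, field name and data volume, and does not prescribe a format or a delivery channel.

```python
import json


def build_alarm_notification(table_name, field_name, data_volume, lower, upper):
    """Return a JSON alarm message for real-time data whose audit result is abnormal."""
    payload = {
        "audit_result": "abnormal",
        "table_name": table_name,
        "field_name": field_name,
        "data_volume": data_volume,
        # The interval is included for context; this is an illustrative choice.
        "fluctuation_interval": [lower, upper],
    }
    return json.dumps(payload, ensure_ascii=False)


message = build_alarm_notification("t_order_daily", "order_cnt", 1300, 950.0, 1250.0)
print(message)
```

The resulting string could then be handed to whichever channel is configured, such as a short message gateway, an email sender or a message producer.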
Fig. 4 is a schematic data flow diagram of an apparatus for checking data volatility according to one or more embodiments of the present invention.
Referring to fig. 4, the apparatus for checking data volatility includes a table structure information acquisition unit 401, a volatility rule configuration unit 402, a data sample template unit 403, a volatility range threshold value automatic generation unit 404, a rule auditing unit 405, a check SQL template unit 406, and an alarm unit 407.
The table structure information obtaining unit 401 is used for collecting table structure information and storing it, which is the basic preparation for checking data volatility. Specifically, collecting table structure information includes acquiring the structure information of all tables in the warehouse, where the table structure information includes table names and table field information. Storing the table structure information includes saving the acquired table structure information so that the table structures are managed centrally and serve as prepared data for the next unit. The warehouse here refers to a database for storing massive amounts of business data.
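Collecting table names and field names can be sketched with an in-memory SQLite database standing in for the warehouse. SQLite, the sample table t_order_daily and the function name collect_table_structure are assumptions for the example; an actual warehouse would expose its own catalog or metadata interface.

```python
import sqlite3


def collect_table_structure(conn):
    """Return {table_name: [field_name, ...]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    structure = {}
    for table in tables:
        # PRAGMA table_info yields one row per column: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
        structure[table] = [column[1] for column in columns]
    return structure


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order_daily (stat_period TEXT, order_cnt INTEGER)")
structure = collect_table_structure(conn)
print(structure)
```

The returned dictionary is the kind of centrally managed table structure information that a configuration page could later present for selection.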
A volatility rule configuring unit 402 for configuring a volatility checking rule, the volatility rule configuring unit 402 performing the steps of:
(1) Reading data table information: the table structure information acquired by the table structure information acquisition unit 401 is read.
(2) Selecting a table to be checked: a visual interface in which table names can be ticked is constructed, and the table to be checked is selected from the table structure information read in the previous step.
(3) Selecting the field to be checked: after the table is selected, an interface is provided for choosing among the fields contained in the table to be checked, and the target field used for calculating the volatility index is selected through the interface.
(4) Selecting the period to be checked: an interface is provided for choosing the data period of the table to be checked, and the period used for calculating the volatility index is selected through the interface.
(5) Invoking an SQL template: the SQL template is called from the data sample template unit 403.
(6) Automatically generating historical data SQL according to the template: filling parameters such as the table name, the target field and the calculation period into the called SQL template to generate the corresponding SQL script; running the SQL script to acquire the historical data matching the table, field and period selected by the user; and storing the historical data as prepared data for the next unit.
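Steps (1) to (6) can be sketched as follows: the selected table and field are validated against the stored table structure information before they are filled into the historical data template. The template text, the column stat_period, the date range parameters and the function name build_history_sql are hypothetical.

```python
from string import Template

# Hypothetical historical data sample template; placeholders are illustrative only.
HISTORY_SQL_TEMPLATE = Template(
    "SELECT stat_period, SUM($field) AS stat_value FROM $table "
    "WHERE stat_period BETWEEN '$start' AND '$end' "
    "GROUP BY stat_period ORDER BY stat_period"
)


def build_history_sql(table, field, start, end, table_structure):
    """Generate the historical data SQL for a table and field chosen by the user."""
    # Accept only names that exist in the collected table structure information.
    if table not in table_structure:
        raise ValueError(f"unknown table: {table}")
    if field not in table_structure[table]:
        raise ValueError(f"unknown field for {table}: {field}")
    return HISTORY_SQL_TEMPLATE.substitute(table=table, field=field, start=start, end=end)


table_structure = {"t_order_daily": ["stat_period", "order_cnt"]}
history_sql = build_history_sql(
    "t_order_daily", "order_cnt", "2021-09-01", "2021-10-17", table_structure
)
print(history_sql)
```

Restricting the parameters to ticked table and field names is what lets the rule be configured without hand-written SQL.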
A data sample template unit 403, configured to generate and store a historical data sample SQL template.
The fluctuation range threshold automatic generation unit 404 is configured to automatically generate the boundary thresholds of the data fluctuation interval according to the acquired historical data. The fluctuation range threshold automatic generation unit 404 performs the following steps: acquiring the check SQL; executing the check SQL to obtain all historical data; executing the time sequence model; and outputting the fluctuation range threshold.
In a specific implementation, the fluctuation range threshold automatic generation unit 404 may preprocess and stabilize the first historical data to obtain second historical data; obtain a plurality of prediction data corresponding to the second historical data according to the second historical data and the time sequence prediction model; and calculate the first interval boundary threshold and the second interval boundary threshold of the data fluctuation interval according to the plurality of prediction data corresponding to the second historical data. For the steps performed by the fluctuation range threshold automatic generation unit 404, reference may be made to the embodiment shown in fig. 1, which is not repeated here.
The rule auditing unit 405 is used for executing and auditing the fluctuation checking rule. The rule auditing unit 405 performs the following steps:
(1) Acquiring a fluctuation range threshold value: the first interval boundary threshold and the second interval boundary threshold output by the fluctuation range threshold automatic generation unit 404 are obtained and used as the data basis for rule auditing.
(2) Generating a check sample SQL script: the check sample SQL template in the check SQL template unit 406 is called, and the first interval boundary threshold and the second interval boundary threshold are filled into the check sample SQL template to generate a check sample SQL script.
(3) Executing the volatility check SQL: running the volatility check SQL script, obtaining the statistical value of the period and the deviation between the statistical value and the fluctuation range thresholds, and storing the statistical value and the deviation.
(4) Recording the check result: judging, according to the deviation between the statistical value and the fluctuation range thresholds, whether the statistical value lies between the first interval boundary threshold and the second interval boundary threshold; if so, outputting that the audit is passed, and if not, outputting that the audit is not passed.
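Steps (3) and (4) can be sketched end to end with an in-memory SQLite database. The sample table, the audit_result table and the function name run_audit are assumptions for illustration, and the check SQL is written out literally here instead of being produced from a template.

```python
import sqlite3


def run_audit(conn, check_sql):
    """Execute the check SQL, store the statistic and deviations, and return the result."""
    stat_value, first_difference, second_difference = conn.execute(check_sql).fetchone()
    passed = first_difference >= 0 and second_difference <= 0
    conn.execute(
        "INSERT INTO audit_result VALUES (?, ?, ?, ?)",
        (stat_value, first_difference, second_difference, int(passed)),
    )
    return {"stat_value": stat_value, "passed": passed}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order_daily (stat_period TEXT, order_cnt INTEGER)")
conn.execute(
    "CREATE TABLE audit_result "
    "(stat_value REAL, first_difference REAL, second_difference REAL, passed INTEGER)"
)
conn.executemany(
    "INSERT INTO t_order_daily VALUES (?, ?)",
    [("2021-10-18", 600), ("2021-10-18", 500), ("2021-10-17", 900)],
)
check_sql = (
    "SELECT SUM(order_cnt), SUM(order_cnt) - 950.0, SUM(order_cnt) - 1250.0 "
    "FROM t_order_daily WHERE stat_period = '2021-10-18'"
)
result = run_audit(conn, check_sql)
print(result)
```

With the sample rows, the statistic for the period is 1100, which lies between the two thresholds, so the audit is reported as passed.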
The check SQL template unit 406 is used for generating and storing a check sample SQL template.
The alarm unit 407 sends an alarm notification carrying information such as the table name, the table field and the data volume that did not pass the volatility audit, and may use a short message, an email, a message (Kafka, Redis, ActiveMQ, etc.) or the like.
In addition, the apparatus for checking data volatility provided in this embodiment may be constructed as follows:
(1) The table structure information obtaining unit 401 is established to create a configuration table in the metadata management system according to the data information corresponding to the source table, and to extract the data information corresponding to the source table according to the configuration table at a preset period.
(2) The volatility rule configuration unit 402 is established to configure information such as the target table, field and period needed for different requirements; volatility check rules that previously had to be entered one by one as SQL are configured simply by ticking options.
(3) The data sample template unit 403 is established so that manually configured volatility audit requirements can be fed into the automated volatility calculation, providing sample data for the subsequent model building.
(4) The fluctuation range threshold automatic generation unit 404 is established to build a model from the rules configured by the preceding units and the collected sample data and to output the fluctuation thresholds. It replaces the calculation of fluctuation rule thresholds that was conventionally configured from manual experience, and adjusts the threshold range reasonably as the data period changes.
(5) The check SQL template unit 406 is established to link the volatility threshold calculation result with the auditing unit; automated and intelligent audit judgment is accomplished through the developed template.
(6) The rule auditing unit 405 is established to audit the requirements to be audited and output the audit result.
(7) The alarm unit 407 is established to raise alarms according to the audit result.
The apparatus for checking data volatility provided by the embodiment shown in fig. 4 can implement the respective processes in the foregoing embodiment of the method for checking data volatility, and achieve the same functions and effects, which are not repeated here.
Fig. 5 is a schematic block diagram of an apparatus for inspecting data volatility according to one or more embodiments of the present invention.
Referring to fig. 5, the data fluctuation inspection apparatus 500 includes:
the first script generating module 501 is configured to generate a historical data acquisition script according to the data inspection parameters and the first script template input by the user;
the boundary threshold generating module 502 is configured to obtain, through a history data acquisition script, first history data that meets data inspection parameters, and generate, according to the first history data, a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval; the first interval boundary threshold is less than the second interval boundary threshold;
a second script generating module 503, configured to generate a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameter, and the second script template;
and the volatility checking module 504 is used for checking whether the volatility of the real-time data meeting the data checking parameters is normal or not through the data checking script.
In some embodiments of the present invention, based on the above scheme, the boundary threshold generating module 502 includes:
the historical data processing unit is used for preprocessing and stabilizing the first historical data to obtain second historical data;
a predicted data obtaining unit, configured to obtain a plurality of predicted data corresponding to the second historical data according to the second historical data and the time sequence prediction model;
the boundary threshold calculating unit is used for calculating a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data.
In some embodiments of the present invention, based on the above-described scheme, the data inspection parameters include a target table name, a target field name, and a target data inspection period; the apparatus 500 for checking data fluctuation further includes:
the table structure information reading module is used for reading the pre-stored table structure information; the table structure information comprises at least one table name and at least one field name corresponding to each table name;
the configuration page generation module is used for generating a configuration page of the data checking parameters according to the table structure information;
the user operation receiving module is used for receiving a first selection operation submitted for a target table name, a second selection operation submitted for a target field name, a third selection operation submitted for a data checking period and a configuration confirmation operation on the configuration page;
And the target parameter determining module is used for determining a target table name, a target field name and a target data checking period which are input by a user according to the first selection operation, the second selection operation, the third selection operation and the configuration confirmation operation.
In some embodiments of the present invention, based on the above scheme, the volatility checking module 504 is specifically configured to:
running a data checking script to obtain a difference value between the real-time data and a first interval boundary threshold value as a first difference value, and obtaining a difference value between the real-time data and a second interval boundary threshold value as a second difference value;
if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, determining that the fluctuation of the real-time data is normal;
if the first difference value is smaller than zero and the second difference value is smaller than zero, determining that the fluctuation of the real-time data is abnormal;
if the first difference value is larger than zero and the second difference value is larger than zero, determining that the fluctuation of the real-time data is abnormal.
In some embodiments of the present invention, based on the above scheme, the inspection apparatus 500 for data volatility further includes:
the first result determining module is used for determining that the auditing result of the real-time data is normal if the fluctuation of the real-time data is determined to be normal;
The second result determining module is used for determining that the auditing result of the real-time data is abnormal if the fluctuation of the real-time data is abnormal;
and the alarm notification generation module is used for generating alarm notification aiming at real-time data with abnormal auditing results.
In some embodiments of the present invention, based on the above scheme, the boundary threshold calculating unit is specifically configured to:
calculating to obtain the average absolute error of the predicted data according to a plurality of predicted data corresponding to the second historical data;
determining a highest value and a lowest value of the plurality of predicted data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to the average absolute error, the highest value and the lowest value.
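One way the boundary threshold calculation could look is sketched below. The concrete formula is an assumption made for the example: the mean absolute error is taken between observed and predicted values, the first threshold is the lowest prediction minus that error, and the second threshold is the highest prediction plus that error. The embodiment may combine the three quantities differently.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def boundary_thresholds(actual, predicted):
    """Assumed formula: widen the range of the predictions by the mean absolute error."""
    mae = mean_absolute_error(actual, predicted)
    lower = min(predicted) - mae   # first interval boundary threshold
    upper = max(predicted) + mae   # second interval boundary threshold
    return lower, upper


actual = [100.0, 110.0, 105.0, 120.0]
predicted = [102.0, 108.0, 109.0, 116.0]
lower, upper = boundary_thresholds(actual, predicted)
print(lower, upper)
```

Under this assumed formula the first threshold is always smaller than the second whenever the predictions are not all identical or the error is non-zero, which matches the ordering required by the method.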
In some embodiments of the present invention, based on the above scheme, the history data processing unit is specifically configured to:
preprocessing the first historical data;
judging whether the preprocessed first historical data has stationarity or not by a unit root checking method;
if yes, determining the preprocessed first historical data as second historical data;
if not, carrying out differential processing on the preprocessed first historical data until the first historical data after differential processing is determined to have stationarity, and determining the first historical data with stationarity after differential processing as second historical data.
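The loop of testing for stationarity and differencing can be sketched as follows. The stationarity test is passed in as a function so that a proper unit root test, for example an augmented Dickey-Fuller test from a statistics library, can be plugged in. The function crude_is_stationary below is a hypothetical stand-in used only to keep the example self-contained; it is not a unit root test.

```python
def difference(series):
    """First-order differencing: x[t] - x[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def make_stationary(series, is_stationary, max_order=3):
    """Difference the series until is_stationary accepts it; return (series, order)."""
    current = list(series)
    order = 0
    while not is_stationary(current):
        if order == max_order or len(current) < 3:
            raise ValueError("series did not become stationary")
        current = difference(current)
        order += 1
    return current, order


def crude_is_stationary(series):
    """Stand-in check (NOT a unit root test): the two halves have similar means."""
    half = len(series) // 2
    first, second = series[:half], series[half:]
    spread = (max(series) - min(series)) or 1.0
    gap = abs(sum(first) / len(first) - sum(second) / len(second))
    return gap / spread < 0.2


# Preprocessed first historical data with a linear trend plus a small oscillation.
trend = [float(3 * t + (1 if t % 2 else -1)) for t in range(20)]
second_history, order = make_stationary(trend, crude_is_stationary)
print(order, len(second_history))
```

For the trending sample series, one round of differencing removes the trend, so the second historical data is the first-order difference of the input.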
According to the technical scheme of the embodiment of the invention, first, a historical data acquisition script is generated according to the data inspection parameters input by a user and a first script template; secondly, first historical data meeting the data inspection parameters is obtained through the historical data acquisition script, and a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval are generated according to the first historical data, wherein the first interval boundary threshold is smaller than the second interval boundary threshold; then, a data inspection script is generated according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and the second script template; and finally, whether the fluctuation of the real-time data meeting the data inspection parameters is normal is checked through the data inspection script. According to the technical scheme provided by the embodiment of the invention, the data fluctuation interval for detecting the data fluctuation can be automatically determined, so that the boundary thresholds of the data fluctuation interval do not depend on manual setting and are updated rapidly as the data changes, and the accuracy of data fluctuation detection is further improved.
The data fluctuation checking device provided by one or more embodiments of the present invention can implement each process in the foregoing data fluctuation checking method embodiment, and achieve the same function and effect, which are not repeated here.
Further, an electronic device is provided in the embodiments of the present invention, and fig. 6 is a schematic structural diagram of an electronic device provided in one or more embodiments of the present invention, as shown in fig. 6, where the electronic device includes a memory 601, a processor 602, a bus 603, and a communication interface 604. The memory 601, processor 602, and communication interface 604 communicate over a bus 603, and the communication interface 604 may include input and output interfaces including, but not limited to, a keyboard, mouse, display, microphone, loudspeaker, and the like.
In fig. 6, memory 601 has stored thereon computer executable instructions that, when executed by processor 602, enable the following flow to be achieved:
generating a historical data acquisition script according to the data inspection parameters and the first script template input by the user;
acquiring first historical data meeting data inspection parameters through a historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
generating a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and the second script template;
And checking whether the fluctuation of the real-time data meeting the data checking parameters is normal or not through the data checking script.
Optionally, when the computer executable instructions are executed by the processor 602, generating a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval according to the first historical data comprises:
preprocessing and stabilizing the first historical data to obtain second historical data;
according to the second historical data and the time sequence prediction model, obtaining a plurality of prediction data corresponding to the second historical data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data.
Optionally, the data inspection parameters include a target table name, a target field name, and a target data inspection period; when the computer-executable instructions are executed by the processor 602, before generating the historical data acquisition script according to the data inspection parameters input by the user and the first script template, the following flow is further performed:
reading pre-stored table structure information; the table structure information comprises at least one table name and at least one field name corresponding to each table name;
Generating a configuration page of the data checking parameters according to the table structure information;
receiving a first selection operation submitted for a target table name, a second selection operation submitted for a target field name, a third selection operation submitted for a data check period and a configuration confirmation operation at a configuration page;
and determining a target table name, a target field name and a target data checking period which are input by a user according to the first selection operation, the second selection operation, the third selection operation and the configuration confirmation operation.
Optionally, when the computer executable instructions are executed by the processor 602, checking, by the data checking script, whether the volatility of the real-time data satisfying the data checking parameters is normal comprises:
running a data checking script to obtain a difference value between the real-time data and a first interval boundary threshold value as a first difference value, and obtaining a difference value between the real-time data and a second interval boundary threshold value as a second difference value;
if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, determining that the fluctuation of the real-time data is normal;
if the first difference value is smaller than zero and the second difference value is smaller than zero, determining that the fluctuation of the real-time data is abnormal;
If the first difference value is larger than zero and the second difference value is larger than zero, determining that the fluctuation of the real-time data is abnormal.
Optionally, after checking whether the volatility of the real-time data satisfying the data checking parameters is normal by the data checking script, the computer executable instructions, when executed by the processor 602, may further perform the following flow:
if the fluctuation of the real-time data is determined to be normal, determining that the auditing result of the real-time data is normal;
if the fluctuation of the real-time data is abnormal, determining that the auditing result of the real-time data is abnormal;
and generating an alarm notification aiming at the real-time data with abnormal auditing results.
Optionally, when the computer executable instructions are executed by the processor 602, calculating a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data comprises:
calculating to obtain the average absolute error of the predicted data according to a plurality of predicted data corresponding to the second historical data;
determining a highest value and a lowest value of the plurality of predicted data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to the average absolute error, the highest value and the lowest value.
Optionally, when the computer executable instructions are executed by the processor 602, preprocessing and stabilizing the first historical data to obtain second historical data comprises:
preprocessing the first historical data;
judging whether the preprocessed first historical data has stationarity or not by a unit root checking method;
if yes, determining the preprocessed first historical data as second historical data;
if not, carrying out differential processing on the preprocessed first historical data until the first historical data after differential processing is determined to have stationarity, and determining the first historical data with stationarity after differential processing as second historical data.
According to the technical scheme of the embodiment of the invention, first, a historical data acquisition script is generated according to the data inspection parameters input by a user and a first script template; secondly, first historical data meeting the data inspection parameters is obtained through the historical data acquisition script, and a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval are generated according to the first historical data, wherein the first interval boundary threshold is smaller than the second interval boundary threshold; then, a data inspection script is generated according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and the second script template; and finally, whether the fluctuation of the real-time data meeting the data inspection parameters is normal is checked through the data inspection script. According to the technical scheme provided by the embodiment of the invention, the data fluctuation interval for detecting the data fluctuation can be automatically determined, so that the boundary thresholds of the data fluctuation interval do not depend on manual setting and are updated rapidly as the data changes, and the accuracy of data fluctuation detection is further improved.
The electronic device provided by the embodiment of the invention can realize each process in the embodiment of the method for checking the fluctuation of the data and achieve the same functions and effects, and the process is not repeated here.
Further, embodiments of the present invention also provide a storage medium having stored therein computer executable instructions that, when executed by a processor, enable the following flow to be achieved:
generating a historical data acquisition script according to the data inspection parameters and the first script template input by the user;
acquiring first historical data meeting data inspection parameters through a historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
generating a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and the second script template;
and checking whether the fluctuation of the real-time data meeting the data checking parameters is normal or not through the data checking script.
Optionally, when the computer executable instructions are executed by the processor, generating a first interval boundary threshold and a second interval boundary threshold of the data fluctuation interval according to the first historical data comprises:
Preprocessing and stabilizing the first historical data to obtain second historical data;
according to the second historical data and the time sequence prediction model, obtaining a plurality of prediction data corresponding to the second historical data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data.
Optionally, the data inspection parameters include a target table name, a target field name, and a target data inspection period; when the computer-executable instructions are executed by the processor, before generating the historical data acquisition script according to the data inspection parameters input by the user and the first script template, the following flow is further performed:
reading pre-stored table structure information; the table structure information comprises at least one table name and at least one field name corresponding to each table name;
generating a configuration page of the data checking parameters according to the table structure information;
receiving a first selection operation submitted for a target table name, a second selection operation submitted for a target field name, a third selection operation submitted for a data check period and a configuration confirmation operation at a configuration page;
And determining a target table name, a target field name and a target data checking period which are input by a user according to the first selection operation, the second selection operation, the third selection operation and the configuration confirmation operation.
Optionally, when the computer executable instructions are executed by the processor, checking, by the data checking script, whether the volatility of the real-time data satisfying the data checking parameters is normal comprises:
running a data checking script to obtain a difference value between the real-time data and a first interval boundary threshold value as a first difference value, and obtaining a difference value between the real-time data and a second interval boundary threshold value as a second difference value;
if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, determining that the fluctuation of the real-time data is normal;
if the first difference value is smaller than zero and the second difference value is smaller than zero, determining that the fluctuation of the real-time data is abnormal;
if the first difference value is larger than zero and the second difference value is larger than zero, determining that the fluctuation of the real-time data is abnormal.
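The comparison above can be sketched in Python as follows. The function name `check_fluctuation` and the string return values are illustrative assumptions. Because the first threshold is smaller than the second, the first difference is always larger than the second, so the three listed cases cover every possible input.

```python
def check_fluctuation(value, lower, upper):
    """Classify one real-time value against the two interval boundary thresholds."""
    if lower >= upper:
        raise ValueError("the first threshold must be smaller than the second")
    first_diff = value - lower    # first difference value
    second_diff = value - upper   # second difference value
    if first_diff >= 0 and second_diff <= 0:
        return "normal"           # value lies inside [lower, upper]
    # Remaining cases: both differences negative (value below the interval)
    # or both positive (value above the interval).
    return "abnormal"
```

For example, with thresholds 10 and 13, the value 12 is classified as normal, while 9 and 14 are classified as abnormal.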
Optionally, after checking whether the volatility of the real-time data satisfying the data checking parameters is normal by the data checking script, the computer executable instructions, when executed by the processor 702, may further perform the following flow:
if the fluctuation of the real-time data is determined to be normal, determining that the auditing result of the real-time data is normal;
if the fluctuation of the real-time data is abnormal, determining that the auditing result of the real-time data is abnormal;
and generating an alarm notification aiming at the real-time data with abnormal auditing results.
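As a rough illustration of the auditing and alarm flow, the following Python sketch records an auditing result for each reading and collects one alarm message per abnormal reading. The function name `audit_readings`, the `(label, value)` input shape, and the message format are assumptions; the embodiment does not specify how the alarm notification is delivered.

```python
def audit_readings(readings, lower, upper):
    """readings: iterable of (label, value) pairs. Returns (results, alarms)."""
    results, alarms = {}, []
    for label, value in readings:
        normal = lower <= value <= upper
        results[label] = "normal" if normal else "abnormal"
        if not normal:
            # Hypothetical alarm text; a real system might send mail or SMS.
            alarms.append(
                f"ALERT {label}: value {value} outside [{lower}, {upper}]"
            )
    return results, alarms
```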
Optionally, when the computer-executable instructions are executed by the processor 702, calculating the first interval boundary threshold and the second interval boundary threshold of the data fluctuation interval according to the plurality of prediction data corresponding to the second historical data comprises:
calculating to obtain the average absolute error of the predicted data according to a plurality of predicted data corresponding to the second historical data;
determining a highest value and a lowest value of the plurality of predicted data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to the average absolute error, the highest value and the lowest value.
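A minimal Python sketch of this calculation is given below. The passage does not state how the average absolute error, the highest value and the lowest value are combined, so the sketch assumes one plausible choice: widen the range of the predictions by the error on both sides. The function names are likewise illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def interval_thresholds(actual, predicted):
    """Return (first_threshold, second_threshold) for the fluctuation interval.

    Assumption (not stated in the text): the interval is the range of the
    predictions, extended by the mean absolute error at each end.
    """
    mae = mean_absolute_error(actual, predicted)
    return min(predicted) - mae, max(predicted) + mae
```

With actual values `[10, 12, 11, 13]` and predictions `[11, 11, 12, 12]`, every error is 1, so the sketch yields the interval `(10.0, 13.0)`.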
Optionally, when the computer-executable instructions are executed by the processor 702, preprocessing and stationarizing the first historical data to obtain the second historical data comprises:
preprocessing the first historical data;
judging whether the preprocessed first historical data has stationarity or not by a unit root checking method;
if yes, determining the preprocessed first historical data as second historical data;
if not, carrying out differential processing on the preprocessed first historical data until the first historical data after differential processing is determined to have stationarity, and determining the first historical data with stationarity after differential processing as second historical data.
According to the technical scheme of the embodiment of the invention, firstly, a historical data acquisition script is generated according to the data inspection parameters input by the user and a first script template; secondly, first historical data meeting the data inspection parameters is obtained through the historical data acquisition script, and a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval are generated according to the first historical data, the first interval boundary threshold being smaller than the second interval boundary threshold; then, a data inspection script is generated according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and a second script template; and finally, whether the fluctuation of the real-time data meeting the data inspection parameters is normal is checked through the data inspection script. In this way, the data fluctuation interval used for detecting data fluctuation is determined automatically, so that its boundary thresholds do not depend on manual setting and are updated rapidly as the data changes, which improves the accuracy of data fluctuation detection.
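The two template-driven generation steps can be illustrated with a short Python sketch. It assumes that the scripts are SQL statements and that the templates use `string.Template` placeholders; neither is stated in the text, and the template wording, the `period_column` parameter, and the function names are hypothetical.

```python
import re
from string import Template

# Hypothetical first script template: fetch historical values.
HISTORY_TEMPLATE = Template(
    "SELECT $period_column, $field FROM $table ORDER BY $period_column"
)
# Hypothetical second script template: audit values against the thresholds.
CHECK_TEMPLATE = Template(
    "SELECT $field, CASE WHEN $field BETWEEN $lower AND $upper "
    "THEN 'normal' ELSE 'abnormal' END AS audit_result FROM $table"
)

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name):
    """Accept only plain identifiers so user input cannot alter the script."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def render_history_script(table, field, period_column):
    return HISTORY_TEMPLATE.substitute(
        table=_checked(table),
        field=_checked(field),
        period_column=_checked(period_column),
    )


def render_check_script(table, field, lower, upper):
    if lower >= upper:
        raise ValueError("the first threshold must be smaller than the second")
    return CHECK_TEMPLATE.substitute(
        table=_checked(table), field=_checked(field),
        lower=float(lower), upper=float(upper),
    )
```

Since the thresholds are ordinary template arguments, regenerating the check script after the thresholds are recomputed requires no manual editing.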
The storage medium provided in one or more embodiments of the present invention can implement each of the processes in the foregoing embodiments of the method for checking data volatility, and achieve the same functions and effects, which are not repeated here.
The storage medium is, for example, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or the like.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method of the above-mentioned embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present invention and the scope of the claims, which are to be protected by the present invention.

Claims (10)

1. A method of checking data volatility, comprising:
generating a historical data acquisition script according to the data inspection parameters and the first script template input by the user;
acquiring first historical data meeting the data inspection parameters through the historical data acquisition script, and generating a first interval boundary threshold value and a second interval boundary threshold value of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
generating a data inspection script according to the first interval boundary threshold, the second interval boundary threshold, the data inspection parameters and a second script template;
and checking whether the fluctuation of the real-time data meeting the data checking parameters is normal or not through the data checking script.
2. The method of claim 1, wherein generating a first interval boundary threshold and a second interval boundary threshold for a data fluctuation interval from the first historical data comprises:
preprocessing and stationarizing the first historical data to obtain second historical data;
obtaining a plurality of prediction data corresponding to the second historical data according to the second historical data and a time sequence prediction model;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to a plurality of prediction data corresponding to the second historical data.
3. The method of claim 1, wherein the data inspection parameters include a target table name, a target field name, and a target data inspection period; before generating the historical data acquisition script according to the data inspection parameters and the first script template input by the user, the method further comprises the following steps:
reading pre-stored table structure information; the table structure information comprises at least one table name and at least one field name corresponding to each table name;
generating a configuration page of the data checking parameters according to the table structure information;
receiving, at the configuration page, a first selection operation submitted for the target table name, a second selection operation submitted for the target field name, a third selection operation submitted for the target data inspection period, and a configuration confirmation operation;
and determining a target table name, a target field name and a target data checking period which are input by a user according to the first selection operation, the second selection operation, the third selection operation and the configuration confirmation operation.
4. The method according to claim 1, wherein checking whether the volatility of the real-time data satisfying the data checking parameters is normal by the data checking script comprises:
running the data checking script to obtain a difference value between the real-time data and the first interval boundary threshold value as a first difference value, and obtaining a difference value between the real-time data and the second interval boundary threshold value as a second difference value;
if the first difference value is greater than or equal to zero and the second difference value is less than or equal to zero, determining that the fluctuation of the real-time data is normal;
if the first difference value is smaller than zero and the second difference value is smaller than zero, determining that the fluctuation of the real-time data is abnormal;
and if the first difference value is larger than zero and the second difference value is larger than zero, determining that the fluctuation of the real-time data is abnormal.
5. The method according to claim 1, further comprising, after checking whether or not the volatility of the real-time data satisfying the data checking parameters is normal by the data checking script:
if the fluctuation of the real-time data is determined to be normal, determining that the auditing result of the real-time data is normal;
if the fluctuation of the real-time data is abnormal, determining that the auditing result of the real-time data is abnormal;
and generating an alarm notification for the real-time data with abnormal auditing results.
6. The method of claim 2, wherein calculating a first interval boundary threshold and a second interval boundary threshold for the data fluctuation interval from the plurality of prediction data corresponding to the second history data comprises:
calculating to obtain the average absolute error of the predicted data according to a plurality of predicted data corresponding to the second historical data;
determining a highest value and a lowest value of the plurality of prediction data;
and calculating a first interval boundary threshold value and a second interval boundary threshold value of the data fluctuation interval according to the average absolute error, the highest value and the lowest value.
7. The method of claim 2, wherein preprocessing and stationarizing the first historical data to obtain second historical data comprises:
preprocessing the first historical data;
judging whether the preprocessed first historical data has stationarity or not by a unit root checking method;
if yes, determining the preprocessed first historical data as second historical data;
if not, carrying out differential processing on the preprocessed first historical data until the first historical data after differential processing is determined to have stationarity, and determining the first historical data with stationarity after differential processing as second historical data.
8. An inspection apparatus for data fluctuation, comprising:
the first script generation module is used for generating a historical data acquisition script according to the data inspection parameters and the first script template input by the user;
the boundary threshold generating module is used for obtaining first historical data meeting the data inspection parameters through the historical data acquisition script and generating a first interval boundary threshold and a second interval boundary threshold of a data fluctuation interval according to the first historical data; the first interval boundary threshold is less than the second interval boundary threshold;
the second script generation module is used for generating a data inspection script according to the first interval boundary threshold value, the second interval boundary threshold value, the data inspection parameters and the second script template;
and the volatility checking module is used for checking whether the volatility of the real-time data meeting the data checking parameters is normal or not through the data checking script.
9. An electronic device comprising a memory and a processor, the memory having stored thereon computer-executable instructions which, when executed by the processor, implement the method of checking data volatility according to any one of claims 1-7.
10. A storage medium having stored therein computer-executable instructions which, when executed by a processor, implement the method of checking data volatility according to any one of claims 1-7.
CN202111210569.7A 2021-10-18 2021-10-18 Method and device for checking data fluctuation, electronic equipment and storage medium Pending CN115994168A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111210569.7A CN115994168A (en) 2021-10-18 2021-10-18 Method and device for checking data fluctuation, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111210569.7A CN115994168A (en) 2021-10-18 2021-10-18 Method and device for checking data fluctuation, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115994168A true CN115994168A (en) 2023-04-21

Family

ID=85989021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111210569.7A Pending CN115994168A (en) 2021-10-18 2021-10-18 Method and device for checking data fluctuation, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115994168A (en)

Similar Documents

Publication Publication Date Title
CN110880984A (en) Model-based flow anomaly monitoring method, device, equipment and storage medium
CN111309539A (en) Abnormity monitoring method and device and electronic equipment
CN112418921A (en) Power demand prediction method, device, system and computer storage medium
CN107944005B (en) Data display method and device
CN112862593B (en) Credit scoring card model training method, device and system and computer storage medium
CN115204536A (en) Building equipment fault prediction method, device, equipment and storage medium
CN113868953A (en) Multi-unit operation optimization method, device and system in industrial system and storage medium
CN114202256B (en) Architecture upgrading early warning method and device, intelligent terminal and readable storage medium
CN117540826A (en) Optimization method and device of machine learning model, electronic equipment and storage medium
CN111612149A (en) Main network line state detection method, system and medium based on decision tree
CN111340287A (en) Power distribution cabinet operation state prediction method and device
CN113723747A (en) Analysis report generation method, electronic device and readable storage medium
CN117291576A (en) Industrial scene data trend prediction-based method, system, computer equipment and storage medium
CN111652712A (en) Pre-credit analysis method, device, equipment and storage medium based on geographic information
CN115994168A (en) Method and device for checking data fluctuation, electronic equipment and storage medium
CN116383645A (en) Intelligent system health degree monitoring and evaluating method based on anomaly detection
CN116126807A (en) Log analysis method and related device
CN115935284A (en) Power grid abnormal voltage detection method, device, equipment and storage medium
CN115689331A (en) Power transmission and transformation project quantity rationality analysis method based on MLP
CN114283344A (en) Automatic real-time monitoring method and system for forest ecological hydrological process
CN110458383B (en) Method and device for realizing demand processing servitization, computer equipment and storage medium
CN111565121A (en) Method and device for evaluating IT (information technology) support degree by personnel and technology
CN111680572A (en) Power grid operation scene dynamic judgment method and system
CN118035527B (en) Interactive data processing method, medium and equipment for business and resource
WO2023181230A1 (en) Model analysis device, model analysis method, and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination