CN114926082A - Artificial intelligence-based data fluctuation early warning method and related equipment - Google Patents


Info

Publication number
CN114926082A
CN114926082A (application CN202210641918.9A)
Authority
CN
China
Prior art keywords
data
curve
volatility
fluctuation
early warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210641918.9A
Other languages
Chinese (zh)
Inventor
原鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202210641918.9A priority Critical patent/CN114926082A/en
Publication of CN114926082A publication Critical patent/CN114926082A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393Score-carding, benchmarking or key performance indicator [KPI] analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Complex Calculations (AREA)

Abstract

The application provides an artificial intelligence-based data fluctuation early warning method and apparatus, an electronic device, and a storage medium. The method comprises the following steps: collecting data at preset data sampling time points as test data; dividing the test data according to a user-defined index analysis method to obtain training data; interpolating the training data to obtain a data volatility curve; decomposing the data volatility curve to obtain a comprehensive curve set; calculating, according to a user-defined distance algorithm, the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set, and taking the comprehensive curve corresponding to the minimum distance as the early warning curve; and issuing early warnings on data fluctuation based on the early warning curve and a preset fluctuation threshold. By analyzing the test data collected after the data application goes online, the method can predict fluctuations in the data transmitted by the data application, improving the timeliness of data application early warning.

Description

Artificial intelligence-based data fluctuation early warning method and related equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to an artificial intelligence-based data fluctuation early warning method and apparatus, an electronic device, and a storage medium.
Background
Most data applications on the market today include various kinds of data displays, such as annual interest-rate trend charts of financial products and other functional pages that present financial revenue business indexes in detail. These functional pages usually contain large amounts of data, and the data changes continuously over time.
At present, alarm methods for data applications generally decide whether to raise an alarm based on the application's error messages. Because the data application has already failed by that point, the existing alarm approach has poor timeliness and may fail to give warning in time.
Disclosure of Invention
In view of the foregoing, it is necessary to provide an artificial intelligence-based data fluctuation early warning method and related devices to solve the technical problem of how to improve the timeliness of data fluctuation early warning. The related devices include an artificial intelligence-based data fluctuation early warning apparatus, an electronic device, and a storage medium.
The application provides a data fluctuation early warning method based on artificial intelligence, which comprises the following steps:
collecting data according to a preset data sampling time point to serve as test data;
dividing the test data according to a user-defined index analysis method to obtain training data;
interpolating the training data to obtain a data volatility curve;
decomposing the data volatility curve to obtain a comprehensive curve set;
calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve;
and early warning the data fluctuation based on the early warning curve and a preset fluctuation threshold value.
According to this scheme, before the data application goes online, each functional module is called at the data sampling time points to obtain the test data set, the early warning curve of the data application is derived using preset interpolation and decomposition methods, and the data application is warned according to a preset threshold. There is no need to wait for the data application to report errors, which improves the timeliness of data application early warning.
In some optional embodiments, the dividing the test data according to a custom metric analysis method to obtain training data includes:
constructing a covariance matrix of the test data to obtain a volatility index of the test data;
calculating the weight of the volatility index to obtain an updated volatility index;
and dividing the test data according to the updated volatility index and a preset condition to obtain the training data.
In this way, the functional interfaces are called according to a preset timing task within the data application's life cycle to obtain the test data, which includes the time point of data acquisition, the name of the called interface, and quantitative information, providing data support for the subsequent data fluctuation analysis.
In some embodiments, the calculating the weight of the volatility index to obtain the updated volatility index comprises:
calculating the chaos degree of the volatility index, wherein the chaos degree is used for representing the smoothness degree of the volatility index;
calculating the weight of the volatility index based on the chaos and a custom piecewise function;
and updating the volatility index according to the weight to obtain an updated volatility index.
In this way, the weight of the volatility index is obtained by calculating the stability of the volatility index, and the index is updated based on that weight. Taking the stability of the volatility index into account improves the confidence of the index and thus the accuracy of subsequent data analysis results.
In some embodiments, the interpolating the training data to obtain a data volatility curve comprises:
splitting the training data to obtain a set of training data intervals;
fitting data in the training data interval set according to a preset interpolation function to construct a function curve set;
and combining the function curve sets to obtain a data volatility curve.
In this way, the discrete training data is split into a set of intervals and a cubic function curve is fitted within each interval, converting discrete time-series data points into continuous curve data. This expands the data volume and provides data support for the subsequent screening of the early warning curve.
In some embodiments, said decomposing said data volatility curve to obtain a set of synthesis curves comprises:
marking a maximum value point and a minimum value point in the data volatility curve, and fitting the maximum value point and the minimum value point by using a preset interpolation function to obtain an upper envelope line and a lower envelope line;
b, calculating the mean value of the upper envelope line and the lower envelope line to be used as a mean value curve;
c, calculating the difference between the data volatility curve and the mean value curve to serve as an updated volatility curve;
d, if the updated volatility curve meets a preset first judgment condition, taking it as the data volatility curve and repeating steps a-c; otherwise, taking it as a comprehensive curve and performing step e;
e, calculating the difference value between the data volatility curve and the comprehensive curve to obtain a data sequence;
and f, if the data sequence meets a preset second judgment condition, taking the data sequence as the data volatility curve and repeating steps a-f; otherwise, storing all obtained comprehensive curves as the comprehensive curve set.
In this way, the data volatility curve is decomposed by the iterative algorithm to obtain a comprehensive curve set, where each comprehensive curve characterizes part of the features of the data volatility curve. This provides data support for the subsequent screening of the early warning curve and thus improves early warning accuracy.
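Steps a-f resemble the sifting procedure of empirical mode decomposition (EMD). Below is a heavily simplified, runnable sketch of one sifting pass: it uses piecewise-linear envelopes instead of the interpolation-function envelopes described in the text, and a fixed iteration cap in place of the first judgment condition. All function names are illustrative, not from the patent:

```python
def local_extrema(y):
    """Indices of interior local maxima and minima (step a: marking extrema)."""
    maxima, minima = [], []
    for i in range(1, len(y) - 1):
        if y[i - 1] < y[i] > y[i + 1]:
            maxima.append(i)
        if y[i - 1] > y[i] < y[i + 1]:
            minima.append(i)
    return maxima, minima

def interp_envelope(idx, y, n):
    """Piecewise-linear envelope through the extrema (the patent fits a cubic
    interpolation function here); the endpoints anchor the envelope."""
    pts = [0] + idx + [n - 1]
    env = []
    for i in range(n):
        j = 0
        while j < len(pts) - 2 and pts[j + 1] < i:
            j += 1
        x0, x1 = pts[j], pts[j + 1]
        t = 0.0 if x1 == x0 else (i - x0) / (x1 - x0)
        env.append(y[x0] * (1 - t) + y[x1] * t)
    return env

def sift(y, max_iter=10):
    """One EMD-style sifting pass: returns one candidate comprehensive curve."""
    h = list(y)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) < 1 or len(minima) < 1:
            break  # too few extrema to build both envelopes
        upper = interp_envelope(maxima, h, len(h))   # step a
        lower = interp_envelope(minima, h, len(h))   # step a
        mean = [(u + l) / 2 for u, l in zip(upper, lower)]  # step b
        h = [a - m for a, m in zip(h, mean)]         # step c
    return h

imf = sift([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0])
```

Repeating this pass on the residue (steps e-f) would yield the full comprehensive curve set; that outer loop is omitted here for brevity.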
In some embodiments, the calculating of the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a custom distance algorithm comprises:
collecting a first data point in the data volatility curve according to a preset sampling frequency;
acquiring a second data point in the corresponding comprehensive curve according to the preset sampling frequency;
and calculating the mean value of the Euclidean distance between the first data point and the second data point to be used as distance data.
In this way, the early warning curve is screened out by calculating the distance between each comprehensive curve and the data volatility curve. The early warning curve is the one closest to the data volatility curve, i.e. the one with the most similar features, so it characterizes the volatility of the data most accurately.
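A minimal sketch of this custom distance, assuming both curves can be evaluated as functions of time and the preset sampling frequency yields aligned sample points (function names and the sampling grid are illustrative, not from the patent):

```python
import math

def curve_distance(curve_a, curve_b, t_start, t_end, freq):
    """Mean Euclidean distance between aligned samples of two curves.

    curve_a, curve_b: callables mapping a time point to a value.
    freq: samples per unit time (the preset sampling frequency).
    """
    n = max(int((t_end - t_start) * freq), 1)
    ts = [t_start + i / freq for i in range(n)]
    total = 0.0
    for t in ts:
        # On a shared time grid, the Euclidean distance between corresponding
        # sample points reduces to the absolute difference of the values.
        total += math.sqrt((curve_a(t) - curve_b(t)) ** 2)
    return total / n

# Identical curves have distance 0; a constant offset of 2 gives distance 2.
d0 = curve_distance(lambda t: t, lambda t: t, 0.0, 1.0, 10)
d2 = curve_distance(lambda t: t, lambda t: t + 2, 0.0, 1.0, 10)
```

The comprehensive curve minimizing this distance is then taken as the early warning curve.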
In some embodiments, the pre-warning of data fluctuation based on the pre-warning curve and a preset fluctuation threshold includes:
calculating the distance between the real-time monitored data fluctuation curve and the corresponding early warning curve to obtain the curve fluctuation distance;
and comparing the curve fluctuation distance with a preset fluctuation threshold value, and if the curve fluctuation distance is greater than the fluctuation threshold value, sending out early warning information.
In this way, before the data application goes online, a test data set is collected at the data sampling time points, the early warning curve of the data application is derived using preset interpolation and decomposition methods, and warnings are issued according to a preset threshold. There is no need to wait for the data application to report errors, which improves the timeliness of early warning.
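The threshold comparison above can be sketched in a few lines (names are illustrative; the distance would come from the curve-distance calculation described earlier):

```python
def should_warn(fluctuation_distance, threshold):
    """Emit early-warning information when the curve fluctuation distance
    exceeds the preset fluctuation threshold."""
    return fluctuation_distance > threshold

alert = should_warn(5.0, 3.0)  # distance above threshold triggers a warning
```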
The embodiment of the present application further provides a data fluctuation early warning device based on artificial intelligence, the device includes:
the acquisition unit is used for acquiring data as test data according to a preset data sampling time point;
the dividing unit is used for dividing the test data according to a user-defined index analysis method to obtain training data;
the interpolation unit is used for interpolating the training data to obtain a data volatility curve;
the decomposition unit is used for decomposing the data volatility curve to obtain a comprehensive curve set;
the screening unit is used for calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve;
and the early warning unit is used for early warning data fluctuation based on the early warning curve and a preset fluctuation threshold value.
An embodiment of the present application further provides an electronic device, where the device includes:
a memory storing at least one instruction;
and the processor executes the instructions stored in the memory to realize the artificial intelligence-based data fluctuation early warning method.
The embodiment of the application also provides a computer-readable storage medium, in which at least one instruction is stored, and the at least one instruction is executed by a processor in an electronic device to implement the artificial intelligence-based data fluctuation early warning method.
Drawings
Fig. 1 is a flowchart of a data fluctuation warning method based on artificial intelligence according to a preferred embodiment of the present application.
Fig. 2 is a functional block diagram of a preferred embodiment of an artificial intelligence-based data fluctuation warning apparatus according to the present application.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the artificial intelligence-based data fluctuation warning method in the present application.
Detailed Description
For a clearer understanding of the objects, features and advantages of the present application, reference is made to the following detailed description along with the accompanying drawings and specific examples. It should be noted that the embodiments of the present application, and features within them, may be combined with each other where no conflict arises. In the following description, numerous specific details are set forth to provide a thorough understanding of the present application; the described embodiments are merely a subset of the embodiments of the present application, not all of them.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The embodiment of the application provides an artificial intelligence-based data fluctuation early warning method, which can be applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a client, for example, a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive network television (IPTV), an intelligent wearable device, and the like.
The electronic device may also include a network device and/or a client device. The network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud computing (cloud computing) based cloud consisting of a large number of hosts or network servers.
The network where the electronic device is located includes, but is not limited to, the internet, a wide area network, a metropolitan area network, a local area network, a Virtual Private Network (VPN), and the like.
Fig. 1 is a flowchart of a preferred embodiment of the data fluctuation warning method based on artificial intelligence according to the present application. The order of the steps in the flowchart may be changed, and some steps may be omitted, according to different needs.
S10, collecting data as test data according to the preset data sampling time point.
In this optional embodiment, a developer may construct a data application according to a preset deployment document and follow its prompts to formulate the life cycle and data update time points of the data application. The deployment document is a text document whose content includes a description of the data application structure and the code required to construct each module of that structure; its function is to guide the developer in deploying the data application as required.
In this alternative embodiment, the life cycle is the time span from the online time to the offline time of the data application, and its unit may be the natural day. The data update time points include offline data update time points and real-time data update time points; the offline time points usually change once a day, and the real-time time points usually change once a second.
In this optional embodiment, the life cycle T of the data application, the set of offline data update time points T1, and the set of real-time data update time points T2 may be acquired from the deployment document by a preset program. The preset program may be a Python script whose function is to extract data from the deployment document.
In this alternative embodiment, T1 and T2 may be used as the data sampling time points.
In this alternative embodiment, a timing task may be run within the life cycle T based on the data sampling time points T1 and T2. The timing task calls a preset interface, also called an API (application programming interface), to obtain test data. An interface is a predefined boundary between different components of a software system; its function is to provide developers with a set of routines that can be accessed through certain software or hardware.
In this optional embodiment, the preset interfaces include the add, delete, update, and query functional interfaces of the data application. A preset interface may take the form of a program written in SQL, and its function is to run that SQL program to obtain the test data.
In this alternative embodiment, the test data may be stored in CSV format, whose content includes the name, status, format, and size of the test data; the file may be named Log and constructed based on the size of the test data, the functional interface runtime, and the name of the functional interface. The first column of the test data may be a variable t1, characterizing the time point of data acquisition; the second column may be a variable c, characterizing the category of the data sampling time point; the third column may be a variable n, characterizing the category of the functional interface called at each data sampling time point; the fourth column may be a variable s, characterizing the size of the test data; and the fifth column may be a variable t2, characterizing the runtime of the functional interface. The variable c has two categories, real-time and offline, and the variable n has four categories: add, delete, update, and query.
In this alternative embodiment, the file named Log may be used as the test data.
In this way, the functional interfaces are called according to a preset timing task within the data application's life cycle to obtain the test data, which includes the time point of data acquisition, the name of the called interface, and quantitative information, providing data support for the subsequent data fluctuation analysis.
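A sketch of assembling Log rows in the CSV layout just described. The column names follow the variables t1, c, n, s, t2 above; the row values are illustrative, not from the patent:

```python
import csv
import io

# Columns: t1 = acquisition time, c = sampling category (real-time/offline),
# n = called interface category (add/delete/update/query),
# s = size of the test data, t2 = functional interface runtime.
rows = [
    {"t1": "2022-06-07 00:00:01", "c": "offline",   "n": "query", "s": 1024, "t2": 0.35},
    {"t1": "2022-06-07 00:00:02", "c": "real-time", "n": "add",   "s": 256,  "t2": 0.12},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["t1", "c", "n", "s", "t2"])
writer.writeheader()
writer.writerows(rows)
log_csv = buf.getvalue()  # the Log file content
```

In practice each row would be appended by the timing task whenever a functional interface is called.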
And S11, dividing the test data according to a user-defined index analysis method to obtain training data.
In an optional embodiment, the dividing the test data according to the custom metric analysis method to obtain the training data includes:
s111, constructing a covariance matrix of the test data to obtain a volatility index of the test data.
In this alternative embodiment, a covariance matrix M may be constructed from the variables s and t2, and its eigenvalues w1 and w2 calculated. Illustratively, if the eigenvalue w1 lies in the same row as the variance of the variable s in the covariance matrix, w1 corresponds to the variable s; if the eigenvalue w2 lies in the same row as the variance of the variable t2, w2 corresponds to t2. The volatility index C of the test data may then be calculated from the eigenvalues and the variables s and t2 as:
C = w1 · s + w2 · t2
where w1, s, w2 and t2 take values in (0, +∞), and the volatility index C characterizes the volatility of the test data size and of the runtime of the corresponding interface.
Illustratively, when w1 = 0.5, w2 = 0.6, s = 10 and t2 = 20, the volatility index C of the test data is calculated as:
C = 0.5 × 10 + 0.6 × 20 = 17
so the volatility index of the test data takes the value 17.
In this optional embodiment, the volatility indexes of the test data correspond one-to-one to the sampling time points.
In this alternative embodiment, a second test data set Log1 may be reconstructed from the variables t1, c and n and the index C.
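A sketch of this step, assuming s and t2 are observed as per-sample series: the 2×2 covariance matrix is built, its eigenvalues are obtained from the characteristic polynomial, and C = w1·s + w2·t2 is evaluated. The series values are illustrative:

```python
import math

def covariance(u, v):
    """Sample covariance of two equal-length series."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix via its characteristic polynomial."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (tr + root) / 2, (tr - root) / 2

def volatility_index(w1, w2, s, t2):
    # C = w1*s + w2*t2 (the worked example in the text: 0.5*10 + 0.6*20 = 17)
    return w1 * s + w2 * t2

s_series = [10.0, 12.0, 9.0, 11.0]    # illustrative data sizes
t2_series = [20.0, 22.0, 19.0, 21.0]  # illustrative interface runtimes
M = [[covariance(s_series, s_series),  covariance(s_series, t2_series)],
     [covariance(t2_series, s_series), covariance(t2_series, t2_series)]]
w_hi, w_lo = eigenvalues_2x2(M)
```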
And S112, calculating the weight of the volatility index to obtain an updated volatility index.
In this optional embodiment, the calculating the weight of the volatility index to obtain the updated volatility index includes:
calculating the chaos degree of the volatility index, wherein the chaos degree is used for representing the smoothness degree of the volatility index;
calculating the weight of the volatility index based on the chaos and a custom piecewise function;
and updating the volatility index according to the weight to obtain an updated volatility index.
In this optional embodiment, the chaos degree of the volatility index may be calculated according to an information entropy algorithm, implemented as follows: when the variable c of a piece of data in the test data set Log1 takes the value "offline", that piece of data is recorded as belonging to the set data1; when it takes the value "real-time", the piece of data is recorded as belonging to the set data2. Each piece of data in data1 and data2 contains four variables: t1, c, n and the index C. The information entropies E1 and E2 of the index C in data1 and data2 are then calculated. Taking the information entropy E1 corresponding to data1 as an example, the calculation is:
E1 = −Σ_{i=1}^{z} p(C_i) · log p(C_i)
where E1 is the information entropy of the index C in data1, C_i is the i-th C index in data1, z is the number of data entries in data1, and p(C_i) is the probability of occurrence of the i-th C index.
Illustratively, when z is 3 and the occurrence probabilities of the three C indexes are 0.3, 0.2 and 0.5, the information entropy corresponding to data1 is calculated as:
E1 = −[0.3 × log(0.3) + 0.2 × log(0.2) + 0.5 × log(0.5)] = 0.45
so the information entropy corresponding to data1 is 0.45, i.e. the chaos degree of the data in data1 is 0.45.
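The worked entropy example can be reproduced in a few lines. A base-10 logarithm is assumed, since that is the base that makes the example above come out to 0.45:

```python
import math

def entropy(probs):
    """Information entropy, base-10 logarithm (matching the worked example)."""
    return -sum(p * math.log10(p) for p in probs if p > 0)

e1 = entropy([0.3, 0.2, 0.5])  # chaos degree of data1 in the example
```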
In this optional embodiment, the custom piecewise function satisfies the following relation:
C_new(i) = (1 − E1) · C(i), if the i-th piece of data belongs to data1 (c is "offline")
C_new(i) = (1 − E2) · C(i), if the i-th piece of data belongs to data2 (c is "real-time")
where (1 − E1) and (1 − E2) are the weights of the volatility index, used to update the index according to the category of the data sampling time point; C_new(i) denotes the i-th updated volatility index; C(i) denotes the i-th volatility index in data1 or data2; and c denotes the category of the data sampling time point.
In this alternative embodiment, all the test data volatility indexes C in data1 and data2 may be replaced by the corresponding updated volatility indexes, thereby updating data1 and data2.
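A sketch of the piecewise update, assuming E1 and E2 are the entropies of data1 (offline) and data2 (real-time) computed above; the entropy values are illustrative:

```python
def update_index(c_value, category, e_offline, e_online):
    """Update one volatility index by the (1 - entropy) weight of its category."""
    weight = (1 - e_offline) if category == "offline" else (1 - e_online)
    return weight * c_value

# E1 = 0.45 (offline set), E2 = 0.30 (real-time set) -- illustrative values.
c_new_offline = update_index(17.0, "offline", 0.45, 0.30)   # 17 * 0.55
c_new_online = update_index(10.0, "real-time", 0.45, 0.30)  # 10 * 0.70
```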
And S113, dividing the test data according to the updated volatility index and a preset condition to obtain the training data.
In this optional embodiment, the updated data1 and data2 may be parsed according to preset conditions to obtain a test data volatility set for each functional module in the data application, comprising data_add, data_delete, data_update and data_query. The preset conditions are n = add, n = delete, n = update and n = query, and the parsing proceeds as follows:
A1: if n is add, the piece of data belongs to data_add;
A2: if n is delete, the piece of data belongs to data_delete;
A3: if n is update, the piece of data belongs to data_update;
A4: if n is query, the piece of data belongs to data_query.
In this optional embodiment, the first column of the test data volatility set corresponding to each functional module is the variable t1, and the second column is the updated index C_new.
In this scheme, the obtained data_add, data_delete, data_update and data_query are used as the training data.
In this way, the weight of the volatility index is obtained by calculating the stability of the volatility index, and the index is updated based on that weight. Taking the stability of the volatility index into account improves the confidence of the index and thus the accuracy of subsequent data analysis results.
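The A1-A4 parsing amounts to grouping records by the interface category n. A minimal sketch, with field names following the Log1 layout described earlier and illustrative values:

```python
def partition_by_interface(records):
    """Split test data into data_add / data_delete / data_update / data_query
    according to the interface category n (conditions A1-A4)."""
    groups = {"add": [], "delete": [], "update": [], "query": []}
    for rec in records:
        groups[rec["n"]].append(rec)
    return groups

records = [
    {"t1": 1, "n": "add",   "C_new": 9.35},
    {"t1": 2, "n": "query", "C_new": 7.00},
    {"t1": 3, "n": "add",   "C_new": 8.10},
]
groups = partition_by_interface(records)  # groups["add"] is data_add, etc.
```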
And S12, interpolating the training data to obtain a data volatility curve.
In an optional embodiment, the interpolating the training data to obtain the data volatility curve includes:
and S121, splitting the training data to obtain a training data interval set.
In this alternative embodiment, taking data_add as an example, its data points may be divided into n intervals, with one interval between every two adjacent data points and the intervals connected end to end.
And S122, fitting the data in the training data interval set according to a preset interpolation function to construct a function curve set.
In this optional embodiment, the preset interpolation function may be a cubic function. Fitting the data in the training data interval set means that, within each interval, the data must satisfy a cubic equation f(x_i) = y_i = a_i + b_i·x_i + c_i·x_i^2 + d_i·x_i^3, where y_i is the value of the i-th updated volatility index C_new and x_i is the value of the i-th variable t1. The cubic spline function must satisfy three conditions: all points satisfy the interpolation condition; the first and second derivatives are continuous at the n−1 interior points; and the second derivative at the two endpoints is specified as 0.
In this alternative embodiment, the cubic function may be solved under these three conditions to obtain the coefficients a_i, b_i, c_i and d_i of each cubic equation; the curves corresponding to the cubic functions are then the function curves of data_add on each interval.
In this alternative embodiment, the function curves of the intervals of data_add, data_delete, data_update and data_query may respectively be taken as the function curve set corresponding to each data set.
And S123, combining the function curve sets to obtain a data volatility curve.
In this alternative embodiment, the function curves in the curve set corresponding to data_add may be connected end to end at the data sampling time points, and the combination taken as the data volatility curve of the index C_new.
In this embodiment, the data volatility curve of C_new in data_add may be recorded as Curve_add, that of C_new in data_delete as Curve_delete, that of C_new in data_update as Curve_update, and that of C_new in data_query as Curve_query.
Therefore, the discrete training data is split into a plurality of interval sets, a cubic function curve in each interval is fitted using a cubic function, and the discrete data points in the time sequence are converted into continuous curve data; this expands the data volume and provides data support for the subsequent screening of early warning curves.
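As a sketch of this fitting step, SciPy's `CubicSpline` with natural boundary conditions realizes exactly the three conditions above (interpolation at every knot, continuous first and second derivatives at interior points, and second derivative zero at both endpoints); the sample t_1 and C_new values below are made up for illustration:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical sampled points: t_1 (sampling time) and C_new (updated volatility index).
t1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
c_new = np.array([17.0, 18.5, 16.2, 19.1, 17.8])

# bc_type='natural' sets the second derivative to 0 at both endpoints,
# matching the third spline condition; interpolation and C2-continuity
# at the interior points hold by construction.
spline = CubicSpline(t1, c_new, bc_type='natural')

# Per-interval cubic coefficients (SciPy stores them highest degree first,
# i.e. d_i, c_i, b_i, a_i per interval).
coeffs = spline.c  # shape (4, number of intervals)
```

Evaluating `spline` between knots gives the continuous data volatility curve assembled from the per-interval cubics.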
And S13, decomposing the data volatility curve to obtain a comprehensive curve set.
In this optional embodiment, decomposing the data volatility curve to obtain a comprehensive curve set includes:
marking a maximum value point and a minimum value point in the data volatility curve, and fitting the maximum value point and the minimum value point by using a preset interpolation function to obtain an upper envelope line and a lower envelope line;
b, calculating the average value of the upper envelope line and the lower envelope line to be used as an average value curve;
c, calculating the difference value of the data volatility curve and the mean value curve to serve as an updating volatility curve;
d, if the updated volatility curve meets a preset first judgment condition, taking the updated volatility curve as a data volatility curve, and repeating the steps a-c, otherwise, taking the updated volatility curve as a comprehensive curve and performing the step e;
e, calculating the difference value between the data volatility curve and the comprehensive curve to obtain a data sequence;
and f, if the data sequence meets a preset second judgment condition, taking the data sequence as a data volatility curve, repeating the steps a-f, otherwise, storing all obtained comprehensive curves as a comprehensive curve set, and taking the data sequence as a residual error curve.
In this alternative embodiment, taking Curve_add as an example, the specific implementation steps for decomposing the data volatility curve are as follows:
a1: taking Curve_add as the original data sequence, marking all maximum value points, and fitting all the maximum value points with the preset interpolation function to form the upper envelope of the original sequence;
a2: marking all minimum value points of the original sequence, and fitting all the minimum value points by the cubic spline interpolation method to form a lower envelope curve of the original sequence;
a3: recording the average value sequence of the upper envelope line and the lower envelope line as ml, and subtracting the average value sequence ml from the original data sequence to obtain a new data sequence h;
a4: if a negative local maximum and a positive local minimum exist in h, taking h as an original data sequence and repeating the steps A1, A2 and A3 until no negative local maximum and positive local minimum exist in h, wherein h is used as a first comprehensive curve;
a5: subtracting the first comprehensive curve h from Curve_add to obtain a new data sequence h1;
a6: if the number of the extreme points of h1 is more than 2, h1 is used as the original data sequence, and the steps A1 to A4 are repeated to obtain a second comprehensive curve, otherwise, the iteration is terminated, and h1 is the residual curve.
In this optional embodiment, suppose each functional module in the data application yields e comprehensive curves, which respectively form a comprehensive curve set, recorded as {S_i^1, S_i^2, …, S_i^e}, where i ∈ {add, delete, modify, query}.
Therefore, the data volatility curve is decomposed through the above iterative algorithm to obtain the comprehensive curve sets, wherein each comprehensive curve can represent partial characteristics of the data volatility curve; this provides data support for the subsequent screening of the early warning curve and thereby improves early warning accuracy.
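The a1–a6 procedure is, in effect, one sifting round of empirical mode decomposition. A minimal sketch under that reading (SciPy/NumPy assumed; the stopping rule follows step a4, and the toy signal is invented):

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelmax, argrelmin

def sift(x, t, max_iter=50):
    """Extract one comprehensive curve from the data volatility curve x(t):
    repeatedly subtract the mean of the upper/lower envelopes (steps a1-a3)
    until no negative local maxima or positive local minima remain (step a4)."""
    h = np.asarray(x, dtype=float).copy()
    for _ in range(max_iter):
        maxima = argrelmax(h)[0]
        minima = argrelmin(h)[0]
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema to build envelopes
        if np.all(h[maxima] > 0) and np.all(h[minima] < 0):
            break  # step a4 condition met: h is a comprehensive curve
        upper = CubicSpline(t[maxima], h[maxima], bc_type='natural')(t)
        lower = CubicSpline(t[minima], h[minima], bc_type='natural')(t)
        h = h - (upper + lower) / 2.0  # subtract the mean curve ml (step a3)
    return h

# Toy example: an oscillation riding on a trend; sifting recovers the oscillation.
t = np.linspace(0.0, 6.0 * np.pi, 600)
x = np.sin(t) + 0.5 * t
c1 = sift(x, t)  # first comprehensive curve
h1 = x - c1      # step a5: new data sequence (residual candidate)
```

Step a6 would then re-apply `sift` to `h1` while it still has more than two extreme points, collecting each returned curve into the comprehensive curve set.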
And S14, calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve.
The calculating the distance data between the data volatility curve and each comprehensive curve in the comprehensive curve set according to the user-defined distance algorithm comprises the following steps:
and S141, collecting a first data point in the data volatility curve according to a preset sampling frequency.
In this alternative embodiment, the preset sampling frequency may be determined according to the life cycle of the data application; for example, the preset sampling frequency in this embodiment may be 1 Hz, that is, sampling once per second.
In this alternative embodiment, the start point of the data volatility curve may be a sampling start point, and a value on the data volatility curve is collected based on the preset sampling frequency to serve as the first data point d 1.
And S142, acquiring a second data point in the corresponding comprehensive curve according to the preset sampling frequency.
In this alternative embodiment, the start point of each synthetic curve may be used as the sampling start point, and the value on the synthetic curve is collected based on the preset sampling frequency to serve as the second data point d 2.
And S143, calculating an average value of Euclidean distances between the first data point and the second data point to serve as distance data.
In this alternative embodiment, the Euclidean distance between the values collected at the same time points in d1 and d2 may be calculated and recorded as d to construct the data set d3.
In this alternative embodiment, the mean of all the data in d3 may be calculated and taken as the distance D.
In this alternative embodiment, the distance between each comprehensive curve and the corresponding data volatility curve may be calculated based on the above method and recorded as D_i^j, where i ∈ {add, delete, modify, query} and j ∈ {1, 2, …, e}, and D_i^j denotes the distance between the jth comprehensive curve of the functional module with index i and the original test data volatility curve.
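A sketch of the custom distance of S141–S143 (NumPy assumed; with the 1 Hz example frequency a curve is simply an array of one value per second, and the sample arrays are hypothetical):

```python
import numpy as np

def curve_distance(d1, d2):
    """Mean Euclidean distance between two curves sampled at the same
    time points: |d1[k] - d2[k]| averaged over k (the data set d3)."""
    d1, d2 = np.asarray(d1, dtype=float), np.asarray(d2, dtype=float)
    n = min(len(d1), len(d2))  # align on the shared time span
    return float(np.abs(d1[:n] - d2[:n]).mean())

def pick_warning_curve(volatility_curve, comprehensive_curves):
    """Return the index of (and all distances to) the comprehensive curve
    closest to the data volatility curve - the early warning curve."""
    dists = [curve_distance(volatility_curve, s) for s in comprehensive_curves]
    return int(np.argmin(dists)), dists
```

For one-dimensional samples the per-point Euclidean distance reduces to an absolute difference, which is what the sketch computes.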
In this alternative embodiment, for each functional module of the data application, the comprehensive curve with the minimum distance to the data volatility curve may be selected as the early warning curve of the corresponding module. For example, when i = add, if D_add^j (the distance between the jth comprehensive curve of the add module and its data volatility curve) is the minimum value among the e distances of the add module, the jth comprehensive curve is the early warning curve of the add module, recorded as W_add.
In this alternative embodiment, the early warning curves selected in this way for the add, delete, modify and query modules may be taken together as the early warning curves.
Therefore, the early warning curve is screened out by calculating the distance between the comprehensive curve and the corresponding data volatility curve, and the early warning curve has the closest distance to the data volatility curve, namely the most similar characteristic, so that the volatility characteristic of the data can be represented more accurately.
And S15, pre-warning data fluctuation based on the pre-warning curve and a preset fluctuation threshold value.
In an optional embodiment, the pre-warning the data fluctuation based on the pre-warning curve and a preset fluctuation threshold includes:
and S151, calculating the distance between the real-time monitored data fluctuation curve and the corresponding early warning curve to obtain the curve fluctuation distance.
In this optional embodiment, a curve fluctuation distance between the test data fluctuation curve of each functional module after the data application goes online and the early warning curve corresponding to each module may be calculated according to the customized distance measurement method, and recorded as D_test.
S152, comparing the curve fluctuation distance with a preset fluctuation threshold value, and if the curve fluctuation distance is larger than the fluctuation threshold value, sending out early warning information.
A preset threshold is compared with D_test to determine whether to issue an early warning for each module. For example, the preset threshold may be 0.8, and the comparison rule is: issue an early warning if D_test > 0.8; otherwise, do not issue an early warning.
therefore, whether early warning is carried out or not is judged through the preset threshold value, and early warning decision can be more flexible.
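Putting S151 and S152 together, a minimal monitoring check might look like this (NumPy assumed; the 0.8 threshold is the example value above, and the curve arrays are hypothetical):

```python
import numpy as np

def check_module(realtime_curve, warning_curve, threshold=0.8):
    """Early warning decision for one functional module: compute the curve
    fluctuation distance D_test (mean pointwise Euclidean distance) and
    compare it against the preset fluctuation threshold."""
    a = np.asarray(realtime_curve, dtype=float)
    b = np.asarray(warning_curve, dtype=float)
    n = min(len(a), len(b))
    d_test = float(np.abs(a[:n] - b[:n]).mean())
    return d_test > threshold
```

For instance, a module whose real-time curve drifts a full unit away from its early warning curve exceeds the 0.8 example threshold and fires a warning, while small fluctuations do not.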
According to this scheme, before the data application goes online, each functional module is called based on the data sampling time points to obtain the test data set, the early warning curve of the data application is obtained according to the preset interpolation and decomposition methods, and early warning is performed on the data application according to the preset threshold, without waiting for the data application to report errors, which improves the timeliness of early warning for the data application.
Referring to fig. 2, fig. 2 is a functional block diagram of a preferred embodiment of the data fluctuation warning apparatus based on artificial intelligence according to the present invention. The artificial intelligence-based data fluctuation early warning device 11 comprises an acquisition unit 110, a dividing unit 111, an interpolation unit 112, a decomposition unit 113, a screening unit 114 and an early warning unit 115. A module/unit as referred to herein is a series of computer readable instruction segments capable of being executed by the processor 13 and performing a fixed function, and is stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
In an alternative embodiment, the obtaining unit 110 is configured to collect data as the test data according to a preset data sampling time point.
In this alternative embodiment, a developer may construct a data application according to a preset deployment document, and follow a prompt in the deployment document to formulate a lifecycle and a data update time point of the data application, where the deployment document is a text document, its content includes a description of a data application structure and codes required for constructing each module in the data application structure, and its function is to guide the developer to deploy the data application as required.
In this alternative embodiment, the lifecycle is a time span from an online time to an offline time of the data application, and the unit of the time span may be a natural day, the data update time point includes an offline data update time point and a real-time data update time point, the change frequency of the offline data update time point is usually once per day, and the change frequency of the real-time data update time point is usually once per second.
In this optional embodiment, the lifecycle T of the data application, the offline data update time point set T_1 and the real-time data update time point set T_2 may be acquired from the deployment document according to a preset program; the preset program may be a Python script whose function is to extract data from the deployment document.
In this alternative embodiment, T_1 and T_2 may be taken as the data sampling time points.
In this alternative embodiment, a timing task may be run within the lifecycle T based on the data sampling time points T_1 and T_2; the timing task calls a preset interface to obtain the test data. The interface, also called an API (application programming interface), is a predefined component of a software system whose function is to provide developers with a set of routines that can be accessed based on certain software or hardware.
In this optional embodiment, the preset interface includes the add, delete, modify and query function interfaces of the data application; the preset interface may take the form of a program written in SQL, whose function is to obtain the test data when run.
In this alternative embodiment, the test data may be stored in CSV format, its content including the name, status, format and size of the test data, and may be named Log. The first column of the test data may be the variable t_1, characterizing the data acquisition time points; the second column may be the variable c, characterizing the category of the data sampling time point; the third column may be the variable n, characterizing the category of the function interface called at each data sampling time point; the fourth column may be the variable s, characterizing the size of the test data; and the fifth column may be the variable t_2, characterizing the runtime of the function interface. The variable c includes the two categories real-time and offline, and the variable n includes the four categories add, delete, modify and query.
In this alternative embodiment, the file named Log may be taken as the test data.
In an alternative embodiment, the dividing unit 111 is configured to divide the test data according to a custom index analysis method to obtain the training data.
In an optional embodiment, the dividing the test data according to the custom metric analysis method to obtain the training data includes:
constructing a covariance matrix of the test data to obtain a volatility index of the test data;
calculating the weight of the volatility index to obtain an updated volatility index;
and dividing the test data according to the updated volatility index and a preset condition to obtain the training data.
In this alternative embodiment, a covariance matrix M may be constructed based on the variables s and t_2, and the eigenvalues w_1 and w_2 of the covariance matrix M may be calculated. Illustratively, if the eigenvalue w_1 lies in the same row as the variance of the variable s in the covariance matrix, the eigenvalue w_1 corresponds to the variable s; if the eigenvalue w_2 lies in the same row as the variance of the variable t_2, the eigenvalue w_2 corresponds to the variable t_2. The volatility index C of the test data may then be calculated from the eigenvalues and the variables s and t_2 as:

C = w_1·s + w_2·t_2

where w_1, s, w_2 and t_2 all take values in (0, +∞), and the volatility index C characterizes the volatility of the test data size and the volatility of the runtime of the corresponding interface.
Illustratively, when w_1 = 0.5, w_2 = 0.6, s = 10 and t_2 = 20, the volatility index of the test data is calculated as:

C = 0.5 × 10 + 0.6 × 20 = 17

so the volatility index of the test data takes the value 17.
In this optional embodiment, the volatility indicator of the test data corresponds to the sampling time point one to one.
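A sketch of this index construction (NumPy assumed; the sample s and t_2 columns are made-up values, and eigenvalues are simply taken in the order the solver returns them rather than matched row-by-row as described above):

```python
import numpy as np

# Hypothetical test-data columns: size s and interface runtime t_2 per sampling point.
s = np.array([10.0, 12.0, 9.0, 11.0])
t2 = np.array([20.0, 22.0, 19.0, 21.0])

M = np.cov(np.vstack([s, t2]))   # 2x2 covariance matrix of (s, t_2)
w1, w2 = np.linalg.eigvalsh(M)   # eigenvalues of M (ascending order)

# Volatility index per sampling point: C = w_1*s + w_2*t_2
C = w1 * s + w2 * t2
```

With the worked example above (w_1 = 0.5, w_2 = 0.6, s = 10, t_2 = 20) the same formula gives C = 17.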
In this alternative embodiment, a second test data set Log1 may be reconstructed based on the variables t_1, c and n and the index C.
In this optional embodiment, the calculating the weight of the volatility index to obtain the updated volatility index includes:
calculating the chaos of the volatility index, wherein the chaos is used for representing the smoothness degree of the volatility index;
calculating the weight of the volatility index based on the chaos and a custom piecewise function;
and updating the volatility index according to the weight to obtain an updated volatility index.
In this optional embodiment, the degree of confusion of the volatility index may be calculated according to an information entropy algorithm, which is implemented as follows: when the value of the variable c in a piece of data in the test data set Log1 is "offline", that piece of data is recorded as belonging to the set data_1; when the value of the variable c is "real-time", the piece of data is recorded as belonging to the set data_2. Each piece of data in data_1 and data_2 contains four variables, namely t_1, c, n and the index C. The information entropies E_1 and E_2 of the index C in data_1 and data_2 are then calculated. Taking the information entropy E_1 corresponding to data_1 as an example, the calculation is:

E_1 = -Σ_{i=1}^{z} p_i·log(p_i)

where E_1 denotes the information entropy of the index C in data_1, z denotes the number of data entries in data_1, and p_i denotes the probability of occurrence of the ith C index in data_1.
Illustratively, when z = 3 and p_1 = 0.3, p_2 = 0.2, p_3 = 0.5, the information entropy corresponding to data_1 is calculated as:

E_1 = -[0.3 × log(0.3) + 0.2 × log(0.2) + 0.5 × log(0.5)] = 0.45

so the information entropy of data_1 is 0.45, that is, the degree of confusion of the data in data_1 is 0.45.
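The entropy computation above can be sketched as follows (base-10 logarithm, which is what reproduces the 0.45 result of the worked example):

```python
import math

def information_entropy(probabilities):
    """Degree of confusion: Shannon entropy with base-10 logarithm
    (base 10 matches the worked 0.45 example)."""
    return -sum(p * math.log10(p) for p in probabilities if p > 0)

e1 = information_entropy([0.3, 0.2, 0.5])  # entropy of the index C in data_1
```

A perfectly deterministic distribution (a single value with probability 1) gives entropy 0, i.e. no confusion.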
In this alternative embodiment, the custom piecewise function satisfies the following relation:

C_new^i = (1 − E_1)·C_1^i, when c = offline
C_new^i = (1 − E_2)·C_2^i, when c = real-time

where (1 − E_1) and (1 − E_2) are the weights characterizing the volatility index, used to update the volatility index according to the category of the data sampling time point; C_new^i denotes the ith updated volatility index; C_1^i denotes the ith volatility index in data_1; C_2^i denotes the ith volatility index in data_2; and c denotes the category of the data sampling time point.
In this alternative embodiment, all test data volatility indexes C in data_1 and data_2 may be replaced with the corresponding updated volatility indexes to update data_1 and data_2.
In this optional embodiment, the updated data_1 and data_2 may be parsed according to preset conditions to obtain the test data volatility set corresponding to each functional module in the data application, including data_add, data_delete, data_modify and data_query. The preset conditions are n = add, n = delete, n = modify and n = query, and the specific parsing steps are as follows:
a1: if n is increased, the data belongs to the data Increase the
A2: if n is deleted, the piece of data belongs to the data Deleting device
A3: if n is changed, the piece of data belongs to the data In a modified way
A4: if n is found, the data belongs to the data Search of
In this optional embodiment, the first column variable in the test data volatility set corresponding to each functional module in the data application is t_1, and the second column variable is C_new.
The data_add, data_delete, data_modify and data_query obtained in this scheme are taken as the training data.
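A sketch of this parsing step, assuming each record is a dict with keys "n", "t1" and "C_new" (hypothetical field names; the values are invented):

```python
# Split updated records into per-module volatility sets by the value of n.
records = [
    {"n": "add",    "t1": 0, "C_new": 17.0},
    {"n": "delete", "t1": 1, "C_new": 15.2},
    {"n": "add",    "t1": 2, "C_new": 18.3},
    {"n": "query",  "t1": 3, "C_new": 16.1},
]

modules = {"add": [], "delete": [], "modify": [], "query": []}
for rec in records:
    modules[rec["n"]].append((rec["t1"], rec["C_new"]))  # keep (t_1, C_new) pairs

data_add = modules["add"]  # training data for the add module, and so on
```

Each resulting list keeps exactly the two columns the text describes: the sampling time t_1 and the updated volatility index C_new.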
In an alternative embodiment, the interpolation unit 112 is configured to interpolate the training data to obtain a data volatility curve.
In an optional embodiment, the interpolating the training data to obtain the data volatility curve includes:
splitting the training data to obtain a training data interval set;
fitting data in the training data interval set according to a preset interpolation function to construct a function curve set;
and combining the function curve sets to obtain a data volatility curve.
In this alternative embodiment, taking data_add as an example, the data points in data_add may be divided into n intervals, with one interval between every two adjacent data points and the intervals connected end to end.
In this alternative embodiment, the preset interpolation function may be a cubic function, and the fitting of the data in the training data interval set is implemented such that, in each interval, the data must satisfy the cubic equation f(x_i) = y_i = a_i + b_i·x_i + c_i·x_i^2 + d_i·x_i^3, where y_i denotes the value of the ith updated volatility index C_new and x_i denotes the ith value of the variable t_1. The cubic spline function should satisfy the following three conditions: all points satisfy the interpolation condition; the first and second derivatives at the n-1 interior points are continuous; and the second derivative at each of the two endpoints is set to 0.
In this alternative embodiment, the cubic function may be solved according to the three conditions to obtain the coefficients a_i, b_i, c_i and d_i of the cubic equation; the curve corresponding to the cubic function is then the function curve of data_add in each interval.
In this alternative embodiment, the function curves of each interval in data_add, data_delete, data_modify and data_query may respectively be taken as the function curve set corresponding to each data set.
In this alternative embodiment, the function curves of each interval in the function curve set corresponding to data_add may be connected end to end according to the data sampling time points to form the data volatility curve of the index C_new.
In this embodiment, the data volatility curve of data_add on C_new may be recorded as Curve_add, that of data_delete as Curve_delete, that of data_modify as Curve_modify, and that of data_query as Curve_query.
In an alternative embodiment, the decomposition unit 113 is configured to decompose the data volatility curve to obtain a set of synthesis curves.
In this optional embodiment, decomposing the data volatility curve to obtain a comprehensive curve set includes:
marking a maximum value point and a minimum value point in the data volatility curve, and fitting the maximum value point and the minimum value point by using a preset interpolation function to obtain an upper envelope line and a lower envelope line;
b, calculating the average value of the upper envelope line and the lower envelope line to be used as an average value curve;
c, calculating the difference value of the data volatility curve and the mean value curve to serve as an updating volatility curve;
d, if the updating volatility curve meets a preset first judgment condition, taking the updating volatility curve as the data volatility curve, and repeating the steps a-c, otherwise, taking the updating volatility curve as a comprehensive curve and performing the step e;
e, calculating the difference value between the data volatility curve and the comprehensive curve to obtain a data sequence;
and f, if the data sequence meets a preset second judgment condition, taking the data sequence as a data volatility curve, and repeating the steps a-f, otherwise, storing all obtained comprehensive curves as a comprehensive curve set.
In this alternative embodiment, taking Curve_add as an example, the specific implementation steps for decomposing the data volatility curve are as follows:
a1: taking Curve_add as the original data sequence, marking all maximum value points, and fitting all the maximum value points with the preset interpolation function to form the upper envelope of the original sequence;
a2: marking all minimum value points of the original sequence, and fitting all the minimum value points through the cubic spline interpolation method to form a lower envelope line of the original sequence;
a3: recording the mean value sequence of the upper envelope line and the lower envelope line as ml, and subtracting the mean value sequence ml from the original data sequence to obtain a new data sequence h;
a4: if negative local maxima and positive local minima exist in h, taking h as an original data sequence, and repeating the steps A1, A2 and A3 until negative local maxima and positive local minima do not exist in h, wherein h is taken as a first comprehensive curve;
a5: subtracting the first comprehensive curve h from Curve_add to obtain a new data sequence h1;
a6: if the number of the extreme points of h1 is more than 2, h1 is used as an original data sequence, and the steps A1 to A4 are repeated to obtain a second comprehensive curve, otherwise, the iteration is terminated, and h1 is a residual curve.
In this optional embodiment, suppose each functional module in the data application yields e comprehensive curves, which respectively form a comprehensive curve set, recorded as {S_i^1, S_i^2, …, S_i^e}, where i ∈ {add, delete, modify, query}.
In an optional embodiment, the screening unit 114 is configured to calculate a distance between the data volatility curve and each synthetic curve in the synthetic curve set according to a custom distance algorithm, and use the synthetic curve corresponding to the minimum distance as the early warning curve.
In this optional embodiment, the calculating, according to a user-defined distance algorithm, the distance data between the data volatility curve and each synthetic curve in the synthetic curve set includes:
collecting a first data point in the data volatility curve according to a preset sampling frequency;
acquiring a second data point in the corresponding comprehensive curve according to the preset sampling frequency;
and calculating the mean value of the Euclidean distance between the first data point and the second data point to be used as distance data.
In this alternative embodiment, the preset sampling frequency may be determined according to the life cycle of the data application; for example, the preset sampling frequency in this embodiment may be 1 Hz, that is, sampling once per second.
In this alternative embodiment, the start point of the data volatility curve may be a sampling start point, and a value on the data volatility curve is collected based on the preset sampling frequency to serve as the first data point d 1.
In this alternative embodiment, the start point of each synthetic curve may be a sampling start point, and the value on the synthetic curve is collected based on the preset sampling frequency to serve as the second data point d 2.
In this alternative embodiment, the Euclidean distance between the values collected at the same time points in d1 and d2 may be calculated and recorded as d to construct the data set d3.
In this alternative embodiment, the mean of all the data in d3 may be calculated and taken as the distance D.
In this alternative embodiment, the distance between each comprehensive curve and the corresponding data volatility curve may be calculated based on the above method and recorded as D_i^j, where i ∈ {add, delete, modify, query} and j ∈ {1, 2, …, e}, and D_i^j denotes the distance between the jth comprehensive curve of the functional module with index i and the original test data volatility curve.
In this alternative embodiment, for each functional module of the data application, the comprehensive curve with the minimum distance to the data volatility curve may be selected as the early warning curve of the corresponding module. For example, when i = add, if D_add^j (the distance between the jth comprehensive curve of the add module and its data volatility curve) is the minimum value among the e distances of the add module, the jth comprehensive curve is the early warning curve of the add module, recorded as W_add.
In this alternative embodiment, the early warning curves selected in this way for the add, delete, modify and query modules may be taken together as the early warning curves.
In an optional embodiment, the early warning unit 115 is configured to perform early warning on data fluctuation based on the early warning curve and a preset fluctuation threshold.
In an optional embodiment, the pre-warning of data fluctuation based on the pre-warning curve and a preset fluctuation threshold includes:
calculating the distance between the real-time monitored data fluctuation curve and the corresponding early warning curve to obtain the curve fluctuation distance;
and comparing the curve fluctuation distance with a preset fluctuation threshold value, and if the curve fluctuation distance is greater than the fluctuation threshold value, sending out early warning information.
In this optional embodiment, a curve fluctuation distance between the test data fluctuation curve of each functional module after the data application goes online and the early warning curve corresponding to each module may be calculated according to the customized distance measurement method, and recorded as D_test.
A preset threshold is compared with D_test to determine whether to issue an early warning for each module. For example, the preset threshold may be 0.8, and the comparison rule is: issue an early warning if D_test > 0.8; otherwise, do not issue an early warning.
before the data application is on line, each function module is called based on the data sampling time point to obtain a test data set, an early warning curve of the data application is obtained according to a preset analysis method and an analysis method, the data application is early warned according to a preset threshold value, the data application is not required to be waited for error reporting, and the timeliness of early warning of the data application is improved.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 1 comprises a memory 12 and a processor 13. The memory 12 is used for storing computer readable instructions, and the processor 13 is used for executing the computer readable instructions stored in the memory to implement the artificial intelligence based data fluctuation warning method according to any one of the above embodiments.
In an alternative embodiment, the electronic device 1 further comprises a bus, a computer program stored in said memory 12 and executable on said processor 13, such as an artificial intelligence based data fluctuation warning program.
Fig. 3 shows only the electronic device 1 with the memory 12 and the processor 13, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
Referring to fig. 1, the memory 12 in the electronic device 1 stores a plurality of computer-readable instructions to implement an artificial intelligence-based data fluctuation warning method, and the processor 13 can execute the plurality of instructions to implement:
collecting data according to a preset data sampling time point to serve as test data;
dividing the test data according to a user-defined index analysis method to obtain training data;
interpolating the training data to obtain a data volatility curve;
decomposing the data volatility curve to obtain a comprehensive curve set;
calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve;
and early warning the data fluctuation based on the early warning curve and a preset fluctuation threshold value.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the instruction, which is not described herein again.
It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the electronic device 1, and does not constitute a limitation to the electronic device 1, the electronic device 1 may have a bus-type structure or a star-shaped structure, and the electronic device 1 may further include more or less hardware or software than that shown in the figure, or different component arrangements, for example, the electronic device 1 may further include an input and output device, a network access device, and the like.
It should be noted that the electronic device 1 is only an example; other existing or future electronic products that can be adapted to the present application should also fall within the scope of protection of the present application and are incorporated herein by reference.
The memory 12 includes at least one type of readable storage medium, which may be non-volatile or volatile. The readable storage medium includes flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disks, optical disks, and the like. The memory 12 may in some embodiments be an internal storage unit of the electronic device 1, e.g., a hard disk of the electronic device 1. In other embodiments the memory 12 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the electronic device 1. The memory 12 may be used not only to store application software installed in the electronic device 1 and various types of data, such as the code of the artificial intelligence-based data fluctuation early warning program, but also to temporarily store data that has been or will be output.
The processor 13 may in some embodiments be composed of a single packaged integrated circuit, or of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 13 is the control unit of the electronic device 1: it connects the components of the whole device via various interfaces and lines, and performs the functions of the electronic device 1 and processes data by running or executing the programs or modules stored in the memory 12 (for example, the artificial intelligence-based data fluctuation early warning program) and calling data stored in the memory 12.
The processor 13 executes the operating system of the electronic device 1 and the installed application programs. By executing the application programs, the processor 13 implements the steps of the above embodiments of the artificial intelligence-based data fluctuation early warning method, such as the steps shown in fig. 1.
Illustratively, the computer program may be partitioned into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to accomplish the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the electronic device 1. For example, the computer program may be divided into an acquisition unit 110, a division unit 111, an interpolation unit 112, a decomposition unit 113, a screening unit 114, an early warning unit 115.
An integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions enabling a computer device (which may be a personal computer, a computer device, or a network device, etc.) or a processor to execute parts of the artificial intelligence-based data fluctuation early warning method according to the embodiments of the present application.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the processes in the methods of the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and executed by a processor, to implement the steps of the embodiments of the methods described above.
The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), and other memory.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in the present application is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one arrow is shown in fig. 3, but this does not indicate only one bus or one type of bus. The bus is arranged to enable communication between the memory 12, the at least one processor 13, and other components.
The embodiment of the present application further provides a computer-readable storage medium (not shown), where the computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions are executed by a processor in an electronic device to implement the artificial intelligence based data fluctuation early warning method according to any of the above embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the specification may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them; although the present application is described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present application without departing from the spirit and scope of the technical solutions of the present application.

Claims (10)

1. A data fluctuation early warning method based on artificial intelligence is characterized by comprising the following steps:
collecting data according to a preset data sampling time point to serve as test data;
dividing the test data according to a user-defined index analysis method to obtain training data;
interpolating the training data to obtain a data volatility curve;
decomposing the data volatility curve to obtain a comprehensive curve set;
calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve;
and issuing an early warning for data fluctuation based on the early warning curve and a preset fluctuation threshold value.
2. The artificial intelligence based data fluctuation early warning method of claim 1, wherein dividing the test data according to a user-defined index analysis method to obtain training data comprises:
constructing a covariance matrix of the test data to obtain a volatility index of the test data;
calculating the weight of the volatility index to obtain an updated volatility index;
and dividing the test data according to the updated volatility index and a preset condition to obtain the training data.
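One plausible reading of the covariance-matrix step above (an assumption on our part; claim 2 does not spell out the mapping) is that the diagonal of the covariance matrix, i.e. each indicator's variance, serves as that indicator's volatility index:

```python
import numpy as np

def volatility_indices(test_data):
    """test_data: array of shape (n_samples, n_indicators). The
    diagonal of the covariance matrix holds each indicator's
    variance, read here as its volatility index (an illustrative
    interpretation of claim 2, not the patent's definition)."""
    cov = np.cov(np.asarray(test_data, dtype=float), rowvar=False)
    return np.diag(cov)

data = np.array([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
idx = volatility_indices(data)
# First column varies (sample variance 1.0); second is constant (0.0).
```

Indicators could then be partitioned into training data by comparing each index against the preset condition.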
3. The artificial intelligence based data fluctuation early warning method according to claim 2, wherein calculating the weight of the volatility index to obtain the updated volatility index comprises:
calculating the chaos of the volatility index, wherein the chaos is used for representing the smoothness degree of the volatility index;
calculating the weight of the volatility index based on the chaos and a custom piecewise function;
and updating the volatility index according to the weight to obtain an updated volatility index.
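The "chaos" of claim 3 behaves like an entropy measure over the volatility indices. The sketch below assumes Shannon entropy for the chaos and an illustrative piecewise weight function of our own choosing; the patent's actual piecewise function is not disclosed here:

```python
import math

def chaos(values):
    """Shannon entropy of the normalized values, used here as the
    'chaos' score (an assumed reading of claim 3's smoothness measure:
    higher entropy = less smooth)."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in probs)

def piecewise_weight(c, cutoff=1.0):
    # Illustrative piecewise function: full weight while chaos stays
    # below the cutoff, damped weight once it exceeds it.
    return 1.0 if c <= cutoff else 1.0 / c

def updated_index(values):
    """Scale the volatility indices by the chaos-derived weight."""
    w = piecewise_weight(chaos(values))
    return [v * w for v in values]
```

For two equal values the entropy is ln 2 ≈ 0.69, below the cutoff, so the indices pass through unchanged; a flatter, higher-entropy distribution gets damped.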
4. The artificial intelligence based data fluctuation early warning method according to claim 1, wherein interpolating the training data to obtain a data volatility curve comprises:
splitting the training data to obtain a set of training data intervals;
fitting data in the training data interval set according to a preset interpolation function to construct a function curve set;
and combining the function curve sets to obtain a data volatility curve.
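The interpolation step of claim 4 can be sketched as follows, using piecewise-linear interpolation as the "preset interpolation function" (a hedged choice for illustration; a cubic spline would serve equally well):

```python
import numpy as np

def volatility_curve(times, values, grid):
    """Evaluate a piecewise-linear fit through the training points
    (times, values) on a dense grid; stitching the per-interval
    segments together yields the data volatility curve."""
    return np.interp(grid, times, values)

t = [0.0, 1.0, 2.0]
v = [0.0, 2.0, 0.0]
curve = volatility_curve(t, v, np.linspace(0.0, 2.0, 5))
# Grid points 0, 0.5, 1, 1.5, 2 → values 0, 1, 2, 1, 0.
```

`np.interp` already combines the per-interval linear segments, which is what the "combining the function curve sets" step describes.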
5. The artificial intelligence based data fluctuation early warning method as claimed in claim 1, wherein decomposing the data volatility curve to obtain a comprehensive curve set comprises:
a, marking the maximum value points and minimum value points in the data volatility curve, and fitting the maximum value points and minimum value points with a preset interpolation function to obtain an upper envelope and a lower envelope;
b, calculating the mean of the upper envelope and the lower envelope to serve as a mean value curve;
c, calculating the difference between the data volatility curve and the mean value curve to serve as an updated volatility curve;
d, if the updated volatility curve meets a preset first judgment condition, taking the updated volatility curve as the data volatility curve and repeating steps a to c; otherwise, taking the updated volatility curve as a comprehensive curve and proceeding to step e;
e, calculating the difference between the data volatility curve and the comprehensive curve to obtain a data sequence;
and f, if the data sequence meets a preset second judgment condition, taking the data sequence as the data volatility curve and repeating steps a to f; otherwise, storing all obtained comprehensive curves as the comprehensive curve set.
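Steps a through c describe a sifting pass in the style of empirical mode decomposition (EMD). The sketch below is a simplified assumption, using piecewise-linear envelopes where EMD implementations typically fit cubic splines through the extrema:

```python
import numpy as np

def local_extrema(x):
    """Indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima

def sift_once(x):
    """One sifting pass (steps a-c): subtract the mean of the upper
    and lower envelopes. Envelopes are linear here; EMD proper would
    use cubic splines as the 'preset interpolation function'."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return x  # monotone residue: nothing left to sift
    # Anchor envelopes at the endpoints so interpolation spans the series.
    up = np.interp(t, [0, *maxima, len(x) - 1], [x[0], *x[maxima], x[-1]])
    lo = np.interp(t, [0, *minima, len(x) - 1], [x[0], *x[minima], x[-1]])
    return x - (up + lo) / 2.0
```

Iterating this pass until the judgment conditions of steps d and f are met would yield the comprehensive curves (the intrinsic mode functions, in EMD terms) that make up the comprehensive curve set.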
6. The artificial intelligence based data fluctuation early warning method of claim 1, wherein calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm comprises:
collecting first data points in the data volatility curve according to a preset sampling frequency;
collecting second data points in the corresponding comprehensive curve according to the preset sampling frequency;
and calculating the mean of the Euclidean distances between the first data points and the second data points to serve as the distance.
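The sampling-and-averaging of claim 6 can be sketched as follows; the `step` parameter stands in for the preset sampling frequency (an assumed representation):

```python
import numpy as np

def mean_distance(curve_a, curve_b, step=1):
    """Sample both curves at the same preset frequency (every
    `step`-th point) and return the mean point-wise Euclidean
    distance between the paired samples."""
    a = np.asarray(curve_a, dtype=float)[::step]
    b = np.asarray(curve_b, dtype=float)[::step]
    return float(np.mean(np.abs(a - b)))
```

For one-dimensional samples the point-wise Euclidean distance reduces to the absolute difference, so the mean of `|a - b|` over the sampled points is the claimed distance.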
7. The artificial intelligence based data fluctuation early warning method as claimed in claim 1, wherein issuing an early warning for data fluctuation based on the early warning curve and a preset fluctuation threshold value comprises:
calculating the distance between the real-time monitored data fluctuation curve and the corresponding early warning curve to obtain the curve fluctuation distance;
and comparing the curve fluctuation distance with a preset fluctuation threshold value, and if the curve fluctuation distance is greater than the fluctuation threshold value, sending out early warning information.
8. A data fluctuation early warning device based on artificial intelligence is characterized in that the device comprises:
the device comprises an acquisition unit, a data processing unit and a data processing unit, wherein the acquisition unit is used for acquiring data as test data according to a preset data sampling time point;
the dividing unit is used for dividing the test data according to a user-defined index analysis method to obtain training data;
the interpolation unit is used for interpolating the training data to obtain a data volatility curve;
the decomposition unit is used for decomposing the data volatility curve to obtain a comprehensive curve set;
the screening unit is used for calculating the distance between the data volatility curve and each comprehensive curve in the comprehensive curve set according to a user-defined distance algorithm, and taking the comprehensive curve corresponding to the minimum distance as an early warning curve;
and the early warning unit is used for issuing an early warning for data fluctuation based on the early warning curve and a preset fluctuation threshold value.
9. An electronic device, characterized in that the device comprises:
a memory storing computer readable instructions; and
a processor executing computer-readable instructions stored in the memory to implement the artificial intelligence based data fluctuation early warning method as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium having computer-readable instructions stored thereon, which when executed by a processor implement the artificial intelligence based data fluctuation warning method according to any one of claims 1 to 7.
CN202210641918.9A 2022-06-07 2022-06-07 Artificial intelligence-based data fluctuation early warning method and related equipment Pending CN114926082A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210641918.9A CN114926082A (en) 2022-06-07 2022-06-07 Artificial intelligence-based data fluctuation early warning method and related equipment

Publications (1)

Publication Number Publication Date
CN114926082A true CN114926082A (en) 2022-08-19

Family

ID=82813420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210641918.9A Pending CN114926082A (en) 2022-06-07 2022-06-07 Artificial intelligence-based data fluctuation early warning method and related equipment

Country Status (1)

Country Link
CN (1) CN114926082A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116592951A (en) * 2023-07-17 2023-08-15 陕西西特电缆有限公司 Intelligent cable data acquisition method and system
CN116592951B (en) * 2023-07-17 2023-09-08 陕西西特电缆有限公司 Intelligent cable data acquisition method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination