CN114118676A - Data analysis method and device based on multi-type sensing equipment and readable medium - Google Patents

Data analysis method and device based on multi-type sensing equipment and readable medium

Info

Publication number
CN114118676A
Authority
CN
China
Prior art keywords
data
time
equipment
moment
predicted value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111186763.6A
Other languages
Chinese (zh)
Inventor
于显浩
张显羽
骆晓锋
邱伟
黄文龙
游秋森
范永学
宋文志
丁磊
庄斌
王晓波
杨宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Xiamen Pumped Storage Co ltd
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Xinyuan Co Ltd
Beijing Guodiantong Network Technology Co Ltd
Original Assignee
Fujian Xiamen Pumped Storage Co ltd
State Grid Corp of China SGCC
State Grid Information and Telecommunication Co Ltd
State Grid Xinyuan Co Ltd
Beijing Guodiantong Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Xiamen Pumped Storage Co ltd, State Grid Corp of China SGCC, State Grid Information and Telecommunication Co Ltd, State Grid Xinyuan Co Ltd, Beijing Guodiantong Network Technology Co Ltd filed Critical Fujian Xiamen Pumped Storage Co ltd
Priority to CN202111186763.6A priority Critical patent/CN114118676A/en
Publication of CN114118676A publication Critical patent/CN114118676A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Educational Administration (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data analysis method, a data analysis device and a readable medium based on multi-type sensing equipment. They are applied to the analysis of data returned by multiple pieces of equipment on the infrastructure construction site of a pumped storage power station, and establish a methodology for analyzing equipment feedback data and predicting from it, enabling full life cycle monitoring and anomaly detection for any equipment. The stages of edge computing, data acquisition, data governance and model prediction are chained into a standardized flow, so that the methodology is reusable, transferable and readily interpretable. Standardized data management is achieved with big data technology, and prediction and monitoring of the equipment state are achieved by introducing a Prophet model. The invention places low demands on the data: the monitoring pipeline for an entire device can be built from only two dimensions, time and data quantity. Equipment feedback data can be utilized to the maximum extent, which further improves the accuracy of equipment state prediction.

Description

Data analysis method and device based on multi-type sensing equipment and readable medium
Technical Field
The invention relates to the field of the Internet of Things, and in particular to a data analysis method and device based on multi-type sensing equipment and a readable medium.
Background
In recent years, Internet of Things technology has become increasingly widespread and Internet of Things devices of all kinds have proliferated rapidly. In particular, the many types of sensing equipment on infrastructure construction sites now have a more advanced hardware foundation and a complete data feedback system. Because the equipment types and feedback data structures are diverse and the data volume is growing explosively, the most urgent need is a technical scheme for equipment management and control built on analysis of the data from infrastructure-site sensing equipment. A standardized process that can govern, analyze and model the data of sensing equipment on the infrastructure site therefore needs to be constructed, together with a universal method for managing the data of all sensing equipment in a uniform way.
The prior art offers no multi-device data analysis and prediction method for a complex scene such as a multi-type infrastructure site. Nor does it offer a method that fuses different types of data, is compatible with different types of equipment, and predicts the running state of all equipment with a single model. At present, prediction can only be performed for a specific scene and a single device, and the prediction accuracy is low.
Disclosure of Invention
To address problems such as the difficulty of managing multi-type equipment and of predicting its state, an embodiment of the present application aims to provide a data analysis method, an apparatus and a readable medium based on multi-type sensing devices that solve the technical problems mentioned in the Background above.
In a first aspect, an embodiment of the present application provides a data analysis method based on a multi-type sensing device, including the following steps:
S1, acquiring edge computing data of the sensing equipment collected in real time;
S2, preprocessing the edge computing data to obtain time series data;
S3, fitting the time series data to a normal distribution to obtain a fitting term;
S4, inputting the fitting term into a trained Prophet model, outputting a predicted value for a future time together with its upper and lower boundaries, and judging the running state of the sensing equipment from the actual data and the upper and lower boundaries of the predicted value.
In some embodiments, step S2 specifically includes:
S21, selecting heartbeat data, equipment running-duration data and equipment abnormal data from the edge computing data as basic indexes, and merging the basic indexes to obtain index data;
S22, performing data cleaning on the basic indexes and the index data to obtain time series data.
In some embodiments, step S21 specifically includes:
merging the basic indexes by weighted averaging, according to the formula:
$G_t = \alpha H_t + \beta D_t + \gamma A_t$
where $H_t$ is the heartbeat data; $D_t$ is the equipment running-duration data; $A_t$ is the equipment stability index obtained from the equipment abnormal data; $\alpha$, $\beta$ and $\gamma$ are the corresponding weights; and $G_t$ is the weighted feature of the equipment at time t.
In some embodiments, step S22 specifically includes:
defining grids with a time window, comprehensively evaluating the repeated data within a grid by a fuzzy synthesis method, and selecting the data with the highest score as the unique result for the repeated data;
filling the missing data in a grid with zeros;
merging the data in the grids on the basis of time grids, the time grids comprising an hour grid and a day grid, and converting the merged data into a time series data triple (device, time, accumulator), where device is the unique identifier of the equipment, time is the time grid, and accumulator is the weighted feature $G_t$ of the equipment at time t; the time series data $X_t = f(G_t)$ is obtained from the triple.
In some embodiments, step S3 specifically includes: performing binomial difference processing on the time series data at time t and time t-1 using the following formula to obtain a difference term:
[formula image BDA0003299553440000021, not reproduced in this text]
where $G_t$ is the weighted feature of the equipment at time t, $G_{t-1}$ is the weighted feature of the equipment at time t-1, and the symbol shown in image BDA0003299553440000022 denotes the difference term.
In some embodiments, the formula of the Prophet model in step S4 is:
$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$
where g(t) is the trend term, s(t) is the periodic term, h(t) is the holiday term, $\varepsilon_t$ is the error term, and the difference term (the symbol shown in image BDA0003299553440000023) is input to the Prophet model as G(t).
In some embodiments, the prediction modes of the Prophet model include:
predicting the upper and lower boundaries of the predicted value at time t+1 from the time series data at time t, comparing the actual data at time t+1 with those boundaries, and judging the running state of the sensing equipment at time t+1 to be abnormal if the actual data fall outside the boundaries and normal otherwise; or
inputting the time series data at time t+1 together with the time series data at time t into the Prophet model for retraining, checking whether the newly entered point is an abnormal point, and judging the running state of the sensing equipment at time t+1 to be abnormal if it is and normal otherwise.
In a second aspect, an embodiment of the present application provides a data analysis apparatus based on multiple types of sensing devices, including:
a data acquisition module, configured to acquire edge computing data of the sensing equipment collected in real time;
a preprocessing module, configured to preprocess the edge computing data to obtain time series data;
a fitting module, configured to fit the time series data to a normal distribution to obtain a fitting term;
a model prediction module, configured to input the fitting term into a trained Prophet model, output a predicted value for a future time together with its upper and lower boundaries, and judge the running state of the sensing equipment from the actual data and the upper and lower boundaries of the predicted value.
In a third aspect, embodiments of the present application provide an electronic device comprising one or more processors; storage means for storing one or more programs which, when executed by one or more processors, cause the one or more processors to carry out a method as described in any one of the implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the method as described in any of the implementations of the first aspect.
The invention discloses a data analysis method, a data analysis device and a readable medium based on multi-type sensing equipment. They are applied to the analysis of data returned by multiple pieces of equipment on the infrastructure construction site of a pumped storage power station, and establish a methodology for analyzing equipment feedback data and predicting from it, enabling full life cycle monitoring and anomaly detection for any equipment. The stages of edge computing, data acquisition, data governance and model prediction are chained into a standardized flow, so that the methodology is reusable, transferable and readily interpretable. Standardized data management is achieved with big data technology, and prediction and monitoring of the equipment state are achieved by introducing a Prophet model. The invention places low demands on the data: the monitoring pipeline for an entire device can be built from only two dimensions, time and data quantity. The method covers not only the construction of the model but also the preprocessing of the model's input data and the correction applied after the model has run. A user can build a general model in the simplest way and bring differing devices under monitoring in the fastest way. Equipment feedback data can be utilized to the maximum extent, which further improves the accuracy of equipment state prediction.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an exemplary device architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow chart illustrating a multi-type sensing device based data analysis method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a time grid of a multi-type sensing device based data analysis method according to an embodiment of the present invention;
FIG. 4 is a prediction result of a Prophet model of a data analysis method based on multi-type sensing devices according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a multi-type sensing device based data analysis apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer device suitable for implementing an electronic apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 illustrates an exemplary apparatus architecture 100 to which the data analysis method based on multi-type sensing devices, or the data analysis apparatus based on multi-type sensing devices, according to an embodiment of the present application may be applied.
As shown in fig. 1, the apparatus architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various applications, such as data processing type applications, file processing type applications, etc., may be installed on the terminal apparatuses 101, 102, 103.
The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No particular limitation is imposed here.
The server 105 may be a server that provides various services, such as a background data processing server that processes files or data uploaded by the terminal devices 101, 102, 103. The background data processing server can process the acquired file or data to generate a processing result.
It should be noted that the data analysis method based on multiple types of sensing devices provided in the embodiments of the present application may be executed by the server 105, or may also be executed by the terminal devices 101, 102, and 103, and accordingly, the data analysis apparatus based on multiple types of sensing devices may be disposed in the server 105, or may also be disposed in the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In the case where the processed data does not need to be acquired from a remote location, the above device architecture may not include a network, but only a server or a terminal device.
Fig. 2 illustrates a data analysis method based on a multi-type sensing device according to an embodiment of the present application, including the following steps:
S1, acquiring edge computing data of the sensing equipment collected in real time.
Specifically, the edge computing data are collected in real time by the edge computing module of the sensing equipment, which provides the data on which the whole data analysis method relies. Edge computing moves data processing, application execution and even some functional services from the central server to nodes at the edge of the network, which shortens the intermediate transmission path and gives faster, more real-time data processing. Because little data is exchanged with the cloud server, edge computing also needs less network bandwidth. Suitable data are selected from the edge computing data of the multi-type sensing equipment as model input. The selected data must carry a time attribute of the equipment, and heartbeat data are usually the first choice because they are universal across equipment.
S2, preprocessing the edge computing data to obtain time series data.
In a specific embodiment, step S2 specifically includes:
S21, selecting heartbeat data, equipment running-duration data and equipment abnormal data from the edge computing data as basic indexes, and merging the basic indexes to obtain index data;
S22, performing data cleaning on the basic indexes and the index data to obtain time series data.
In a specific embodiment, step S21 specifically includes:
merging the basic indexes by weighted averaging, according to the formula:
$G_t = \alpha H_t + \beta D_t + \gamma A_t$
where $H_t$ is the heartbeat data; $D_t$ is the equipment running-duration data; $A_t$ is the equipment stability index obtained from the equipment abnormal data; $\alpha$, $\beta$ and $\gamma$ are the corresponding weights; and $G_t$ is the weighted feature of the equipment at time t.
In one embodiment, the weights α, β and γ are set to 0.3, 0.4 and 0.3 respectively, based on experience and experiment. The equipment abnormal data may be, for example, the number of alarms. In this case, the hyper-parameters of the subsequent Prophet model are selected for the time series data according to how abnormal data behaved in the past, so a certain amount of historical state information about the equipment is needed. Where such data are missing, for example for newly added equipment, initialization can be completed with the prediction curve of other similar equipment; the equipment then enters its own closed-loop process and is continuously corrected according to the feedback data.
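The weighted merge can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the class `BaseIndicators` and the function `weighted_feature` are hypothetical names, and the three indicators are assumed to be already expressed as comparable numeric values. The default weights are the 0.3, 0.4 and 0.3 given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseIndicators:
    """Basic indexes of one device at time t (hypothetical container)."""
    heartbeat: float      # H_t, heartbeat data
    run_duration: float   # D_t, equipment running-duration data
    stability: float      # A_t, stability index derived from abnormal data


def weighted_feature(ind: BaseIndicators,
                     alpha: float = 0.3,
                     beta: float = 0.4,
                     gamma: float = 0.3) -> float:
    """Return G_t = alpha*H_t + beta*D_t + gamma*A_t."""
    return (alpha * ind.heartbeat
            + beta * ind.run_duration
            + gamma * ind.stability)


if __name__ == "__main__":
    sample = BaseIndicators(heartbeat=60.0, run_duration=55.0, stability=1.0)
    print(weighted_feature(sample))
```

Keeping the weights as parameters makes it straightforward to try other values when the empirical 0.3, 0.4 and 0.3 do not suit a given type of equipment.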
In some embodiments, step S22 specifically includes:
defining grids with a time window, comprehensively evaluating the repeated data within a grid by a fuzzy synthesis method, and selecting the data with the highest score as the unique result for the repeated data;
filling the missing data in a grid with zeros;
merging the data in the grids on the basis of time grids, the time grids comprising an hour grid and a day grid, and converting the merged data into a time series data triple (device, time, accumulator), where device is the unique identifier of the equipment, time is the time grid, and accumulator is the weighted feature $G_t$ of the equipment at time t; the time series data $X_t = f(G_t)$ is obtained from the triple.
Specifically, data cleaning includes data deduplication and zero padding. As shown in fig. 3, time windows such as minutes, hours and days are defined as grids. Within a single grid, data may be repeated or missing. For example, with a 1-minute time window, more than one item of heartbeat data may appear in a grid; this is heartbeat data repetition, which is resolved by comprehensively evaluating the repeated heartbeat data with a fuzzy synthesis method and selecting the item with the highest score as the heartbeat data. If a grid contains no heartbeat data, the data are missing and the grid is filled with 0. Data loss leaves some data in the corresponding period null, so the data can be merged by a given period, and a period with no data is recorded as 0. The grid data are merged to generate an hour grid and a day grid. The merged data are converted into a triple (device, time, accumulator), where device is the unique identifier of the equipment, time is the time grid, and accumulator is the weighted feature $G_t$ of the equipment at time t. When the hour grids are aggregated into day grids, the new grids may still have missing data; the above steps are repeated to complete the data preprocessing and obtain the time series data $X_t = f(G_t)$, where $X_t$ is the accumulator of each individual device within the time grid, that is, the feature engineering result of the feature $G_t$.
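The cleaning steps above can be sketched with the Python standard library. Several points are assumptions made for illustration: the function names are hypothetical, the `score` callable merely stands in for the fuzzy synthesis evaluation (which the text does not specify), and merging into coarser grids is done by summation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def floor_to(ts: datetime, unit: str) -> datetime:
    """Truncate a timestamp to the start of its minute, hour or day grid."""
    if unit == "minute":
        return ts.replace(second=0, microsecond=0)
    if unit == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if unit == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")


def clean_to_minute_grid(records, start, end, score):
    """Deduplicate and zero-fill records over the minute grids in [start, end).

    records: iterable of (timestamp, value) pairs.
    score:   callable ranking duplicate records (placeholder for the fuzzy
             synthesis evaluation); the highest-scoring record is kept.
    """
    best = {}
    for ts, value in records:
        cell = floor_to(ts, "minute")
        if not (start <= cell < end):
            continue
        if cell not in best or score((ts, value)) > score(best[cell]):
            best[cell] = (ts, value)
    grid = {}
    cell = start
    while cell < end:
        # A grid without any record is missing data and is filled with 0.
        grid[cell] = best[cell][1] if cell in best else 0.0
        cell += timedelta(minutes=1)
    return grid


def merge_grid(device, grid, unit):
    """Merge a fine grid into hour or day grids as (device, time, accumulator)."""
    totals = defaultdict(float)
    for cell, value in grid.items():
        totals[floor_to(cell, unit)] += value  # summation is an assumption
    return [(device, cell, total) for cell, total in sorted(totals.items())]
```

With two heartbeats in the same minute, only one survives deduplication, and an hour containing no heartbeat at all still appears in the output with an accumulator of 0.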
S3, fitting the time series data to a normal distribution to obtain a fitting term.
In a specific embodiment, step S3 specifically includes: performing binomial difference processing on the time series data at time t and time t-1 using the following formula to obtain a difference term:
[formula image BDA0003299553440000061, not reproduced in this text]
where $G_t$ is the weighted feature of the equipment at time t, $G_{t-1}$ is the weighted feature of the equipment at time t-1, and the symbol shown in image BDA0003299553440000062 denotes the difference term.
Specifically, the time series data of the equipment are processed by binomial differencing, which fits the equipment data to a normal distribution. The difference term can be used as the fitting term. Fitting to a normal distribution in this way serves two purposes. First, it balances to some extent the differing magnitudes of the three data inputs, achieving the same effect as normalization. Second, the computed data input cannot be guaranteed to be completely random, that is, to follow a normal distribution, in which case the accuracy of the overall data cannot be evaluated through the probability density formula; differencing lets the data be fitted to a normal distribution curve, which makes the hyper-parameter selection as well founded as possible. Abnormal data, which would interfere with model construction as noise, are screened out using the 95% confidence interval of the normal distribution.
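A minimal sketch of the differencing and screening step follows. The exact difference formula is given only as an image in the source, so this example assumes a plain first difference between consecutive values, and it interprets the 95% confidence interval as the sample mean plus or minus 1.96 sample standard deviations. Both function names are hypothetical.

```python
from statistics import mean, stdev


def first_difference(series):
    """Return the differences between consecutive values (assumed form)."""
    return [b - a for a, b in zip(series, series[1:])]


def screen_outliers(values, z=1.96):
    """Split values into (kept, dropped) using mean +/- z * sample std.

    z = 1.96 corresponds to a two-sided 95% interval of a normal
    distribution. Inputs too short or with zero spread are kept unchanged.
    """
    values = list(values)
    if len(values) < 2:
        return values, []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return values, []
    low, high = mu - z * sigma, mu + z * sigma
    kept = [v for v in values if low <= v <= high]
    dropped = [v for v in values if not (low <= v <= high)]
    return kept, dropped


if __name__ == "__main__":
    diffs = first_difference([10.0, 11.0, 10.5, 11.2, 10.9])
    print(screen_outliers(diffs))
```

Because a single extreme value inflates both the mean and the standard deviation, a robust alternative such as the median and median absolute deviation may be preferable when the history is short; that is a design choice outside what the source describes.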
S4, inputting the fitting term into a trained Prophet model, outputting a predicted value for a future time together with its upper and lower boundaries, and judging the running state of the sensing equipment from the actual data and the upper and lower boundaries of the predicted value.
In a specific embodiment, the formula of the Prophet model in step S4 is:
$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$
where g(t) is the trend term, s(t) is the periodic term, h(t) is the holiday term, $\varepsilon_t$ is the error term, and the difference term (the symbol shown in image BDA0003299553440000071) is input to the Prophet model as G(t). Here g(t) fits the non-periodic changes in the time series, such as piecewise linear growth or logistic growth; s(t) represents periodic changes (for example weekly or yearly seasonality); h(t) represents the irregular effects of holidays (caused by users); and $\varepsilon_t$ reflects abnormal changes not captured by the model. The model is constructed by inputting the time series data together with hyper-parameters such as periodic attributes and abnormal points. Selecting the hyper-parameters requires removing some abnormal points based on experience and past data behavior, and specifying a period and holidays for the model.
It should be noted that the embodiment of the present invention requires the capability to store and compute on the time series data generated by the sensing equipment. Prediction and anomaly judgment with the Prophet model rely on the corresponding upper and lower boundaries; the boundaries are determined by the hyper-parameters, and the hyper-parameters are determined from the past behavior of abnormal data, so a certain amount of historical state information about the equipment is needed. Training the Prophet model is a process of tuning the hyper-parameters on past data; once the hyper-parameters are determined, the trained Prophet model is obtained and can be used for prediction.
In a specific embodiment, the prediction mode of the Prophet model includes:
predicting the upper and lower boundaries of the predicted value at time t+1 from the time series data at time t, comparing the actual data at time t+1 with those boundaries, and judging the running state of the sensing equipment at time t+1 to be abnormal if the actual data fall outside the boundaries and normal otherwise; or
inputting the time series data at time t+1 together with the time series data at time t into the Prophet model for retraining, checking whether the newly entered point is an abnormal point, and judging the running state of the sensing equipment at time t+1 to be abnormal if it is and normal otherwise.
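The two judgment modes can be sketched independently of any particular forecasting library. In this illustration the function names are hypothetical, and the retraining step of the second mode is abstracted into a caller-supplied function `is_outlier_after_refit`, which is assumed to refit the model on the extended history and report whether the newest point is flagged.

```python
from typing import Callable, Sequence


def judge_by_bounds(actual: float, lower: float, upper: float) -> str:
    """Mode 1: compare the actual value at t+1 with the predicted boundaries."""
    return "normal" if lower <= actual <= upper else "abnormal"


def judge_by_retraining(
    history: Sequence[float],
    new_point: float,
    is_outlier_after_refit: Callable[[Sequence[float]], bool],
) -> str:
    """Mode 2: refit on history plus the new point and check the new point.

    is_outlier_after_refit is a placeholder for retraining the model on the
    extended series and testing whether its last element is an abnormal point.
    """
    extended = list(history) + [new_point]
    return "abnormal" if is_outlier_after_refit(extended) else "normal"


if __name__ == "__main__":
    print(judge_by_bounds(actual=12.0, lower=8.0, upper=11.5))
    # Toy stand-in for a refit: flag the last point if it exceeds twice the
    # largest earlier value. A real system would refit the forecasting model.
    toy_check = lambda s: s[-1] > 2 * max(s[:-1])
    print(judge_by_retraining([1.0, 2.0, 3.0], 10.0, toy_check))
```

Mode 1 needs only a comparison per new observation, whereas mode 2 pays the cost of a refit each time.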
Examples
Taking the monitoring of an on-site display screen device as an example, the heartbeat data of the device are first collected and then cleaned: all data that do not conform to the protocol rules are filtered out and duplicates are removed. The deduplicated data are then arranged into grids; as shown in fig. 3, minute grids, hour grids and day grids can be constructed. Summary statistics are computed by data timestamp, the grids together cover exactly the total time of the natural day, and empty grids with no data are filled with 0. This completes the extraction of the heartbeat data index.
A difference term is obtained from the difference between the values at time t and time t-1, so that the data meet the randomness requirement of a normal distribution. Abnormal data are then removed at the 95% confidence level. These are obvious outliers caused by equipment debugging, human error and the like; they occur only occasionally and do not reflect the real data distribution, so they are removed. Finally, the time series data are input into the Prophet model for model construction. The Prophet model built in this way can identify abnormal points in the existing data and give predicted values with upper and lower boundaries for a future time range. As the forecast horizon lengthens, however, the boundaries keep widening; this is a characteristic of the Prophet model, whose judgment about the future is based on the behavior of the most recent data. The Prophet model is therefore used in two prediction modes. In the first, the upper and lower boundaries of the predicted value at time t+1 are predicted from the curve fitted to the time series data at time t, the real data at time t+1 are compared with them, and the state is abnormal if the real data exceed the boundaries. In the second, the time series data at time t+1 and at time t are input into the Prophet model for retraining, and the new point is checked to see whether it is an abnormal point; a new point is the most recently generated data after the equipment enters a monitoring period. The first mode is more efficient, and the Prophet model can be updated after the result is obtained; the second mode provides some fault tolerance when data are scarce. The mode can be chosen according to the scene.
In this embodiment, the hyper-parameters are set as follows: the prediction sensitivity parameter interval_width is 0.94; the periodic variation is set to 1 week, with each natural week as one period, and the parameter is weekly_seasonality = auto; special dates are defined according to the actual data, and removing abnormal values such as holidays and shutdown days achieves a noise reduction effect, with the parameter chosen as change_point = [2021-02-05, 2021-02-06]. The overall prediction result is shown in fig. 4, where yhat is the predicted value, yhat_lower is the lower boundary of the predicted value, yhat_upper is the upper boundary of the predicted value, and y is the actual value.
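For illustration, these settings could be collected as keyword arguments for the open-source Prophet library. This is a sketch under assumptions: the embodiment names the parameter change_point, and mapping it to the library's `changepoints` argument (rather than, say, its `holidays` table) is a guess; `prophet_kwargs` and `build_model` are hypothetical names.

```python
# Hyper-parameters from the embodiment, expressed with the argument names of
# the open-source Prophet constructor. The mapping of "change_point" to
# `changepoints` is an assumption, not something the text confirms.
prophet_kwargs = {
    "interval_width": 0.94,          # width of the uncertainty interval
    "weekly_seasonality": "auto",    # one natural week as the period
    "changepoints": ["2021-02-05", "2021-02-06"],
}


def build_model(kwargs):
    """Create a Prophet model if the optional `prophet` package is installed."""
    try:
        from prophet import Prophet
    except ImportError:
        return None  # package not available; caller decides what to do
    return Prophet(**kwargs)
```

If explicit changepoint dates are used, the library requires them to fall within the range of the training data when the model is fitted, so the dates would need to match the period actually being monitored.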
The above is only a simple example. In practice a sensing device usually produces more kinds of data, which can be combined by weighted summation to further improve the accuracy of the prediction.
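A minimal sketch of such a weighted summation follows, using the three indexes named in claim 3 (heartbeat data, running duration data and a stability index). The weight values and the function name `weighted_feature` are purely illustrative assumptions; the text does not specify any weights.

```python
def weighted_feature(heartbeat, duration, stability,
                     alpha=0.5, beta=0.3, gamma=0.2):
    """Combine three per-time-step indexes into one weighted feature.

    alpha, beta and gamma are assumed example weights.
    """
    return alpha * heartbeat + beta * duration + gamma * stability


# Illustrative normalised indexes (assumed) for two time steps.
heartbeats = [1.0, 0.5]
durations = [1.0, 1.0]
stabilities = [1.0, 0.0]

features = [
    weighted_feature(h, d, a)
    for h, d, a in zip(heartbeats, durations, stabilities)
]
```

In this sketch the indexes are assumed to be on a comparable scale before weighting; otherwise the index with the largest raw magnitude would dominate regardless of the weights.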
The method provided by the invention is applicable to a wide range of devices and requires only time series data and a general-purpose model. The model has low computing power requirements, so the prediction task can be completed in a very short time. A closed loop can also be formed: the model is continuously adjusted through manual feedback, hyper-parameter tuning and the like, so that its accuracy keeps improving. The method has broad application prospects in fields such as the analysis of equipment feedback data.
With further reference to fig. 5, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of a data analysis apparatus based on multiple types of sensing devices, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be applied to various electronic devices in particular.
The embodiment of the application provides a data analysis device based on multi-type perception equipment, including:
the data acquisition module 1 is configured to acquire edge calculation data of sensing equipment acquired in real time;
the preprocessing module 2 is configured to preprocess the edge calculation data to obtain time sequence data;
the fitting module 3 is configured to perform fitting of normal distribution on the time sequence data to obtain a fitting item;
and the model prediction module 4 is configured to input the fitting item into the trained Prophet model, output a predicted value corresponding to future time and upper and lower boundaries of the predicted value, and determine the operation state of the sensing equipment according to the actual data and the upper and lower boundaries of the predicted value.
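The state judgement performed by the model prediction module can be sketched as a simple interval check of each actual value against the predicted lower and upper boundaries (yhat_lower and yhat_upper in the embodiment). The function names `judge_state` and `judge_series` and the sample numbers are hypothetical.

```python
def judge_state(actual, lower, upper):
    """Return 'normal' if lower <= actual <= upper, otherwise 'abnormal'."""
    return "normal" if lower <= actual <= upper else "abnormal"


def judge_series(actuals, bounds):
    """Judge each actual value against its (lower, upper) boundary pair."""
    return [judge_state(y, lo, hi) for y, (lo, hi) in zip(actuals, bounds)]


# Illustrative values (assumed): the third point falls above its upper bound.
actuals = [58.0, 61.5, 75.0]
bounds = [(55.0, 65.0), (55.0, 65.0), (56.0, 66.0)]
states = judge_series(actuals, bounds)
```

The boundary values themselves would come from the trained model; this sketch covers only the comparison step.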
Referring now to fig. 6, a schematic diagram of a computer device 600 suitable for use in implementing an electronic device (e.g., the server or terminal device shown in fig. 1) according to an embodiment of the present application is shown. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer device 600 includes a Central Processing Unit (CPU) 601 and a Graphics Processing Unit (GPU) 602, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 603 or a program loaded from a storage section 609 into a Random Access Memory (RAM) 604. In the RAM 604, various programs and data necessary for the operation of the computer device 600 are also stored. The CPU 601, GPU 602, ROM 603, and RAM 604 are connected to each other via a bus 605. An input/output (I/O) interface 606 is also connected to the bus 605.
The following components are connected to the I/O interface 606: an input section 607 including a keyboard, a mouse, and the like; an output section 608 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 609 including a hard disk and the like; and a communication section 610 including a network interface card such as a LAN card, a modem, or the like. The communication section 610 performs communication processing via a network such as the internet. A drive 611 may also be connected to the I/O interface 606 as needed. A removable medium 612 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 611 as necessary, so that a computer program read out therefrom is installed into the storage section 609 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such embodiments, the computer program may be downloaded and installed from a network via the communication section 610, and/or installed from the removable medium 612. The computer program, when executed by the Central Processing Unit (CPU) 601 and the Graphics Processing Unit (GPU) 602, performs the above-described functions defined in the methods of the present application.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based devices that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present application may be implemented by software or hardware. The modules described may also be provided in a processor.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments; or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring edge calculation data of sensing equipment acquired in real time; preprocessing the edge calculation data to obtain time sequence data; fitting normal distribution is carried out on the time sequence data to obtain a fitting item; inputting the fitting items into the trained Prophet model, outputting a predicted value corresponding to future time and upper and lower boundaries of the predicted value, and judging the running state of the sensing equipment according to the actual data and the upper and lower boundaries of the predicted value.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A data analysis method based on multi-type sensing equipment is characterized by comprising the following steps:
S1, acquiring edge calculation data of the sensing equipment acquired in real time;
S2, preprocessing the edge calculation data to obtain time sequence data;
S3, fitting the time sequence data in normal distribution to obtain a fitting item;
and S4, inputting the fitting items into the trained Prophet model, outputting a predicted value and a predicted value upper and lower boundaries corresponding to future time, and judging the running state of the sensing equipment according to actual data and the predicted value upper and lower boundaries.
2. The multi-type sensing device-based data analysis method according to claim 1, wherein the step S2 specifically comprises:
S21, selecting heartbeat data, equipment running duration data and equipment abnormal data from the edge calculation data as basic indexes, and combining the basic indexes to obtain index data;
and S22, performing data cleaning on the basic indexes and the index data to obtain the time sequence data.
3. The multi-type sensing device-based data analysis method according to claim 2, wherein the step S21 specifically comprises:
merging the basic indexes by weighted average, according to the formula:
$G_t = \alpha H_t + \beta D_t + \gamma A_t$
wherein $H_t$ is the heartbeat data; $D_t$ is the equipment running duration data; $A_t$ is the equipment stability index obtained from the equipment abnormal data; $\alpha$, $\beta$ and $\gamma$ are the corresponding weights; and $G_t$ is the device weighting feature at time t.
4. The multi-type sensing device-based data analysis method according to claim 3, wherein the step S22 specifically comprises:
defining grids by a time window, comprehensively evaluating the repeated data within a grid by a fuzzy synthesis method, and selecting the data with the highest score as the unique result for the repeated data;
performing zero padding on missing data in the grids;
merging the data in the grids based on time grids, wherein the time grids comprise an hour grid and a day grid, and constructing and converting the data into time series data triples (device, time, accumulator), wherein device is the unique device identifier, time is the time grid, and accumulator is the device weighting feature $G_t$ at time t; and obtaining the time series data $X_t = f(G_t)$ based on the triples.
5. The multi-type sensing device-based data analysis method according to claim 1, wherein the step S3 specifically comprises: performing binomial difference processing on the time series data at time t and time t-1 by the following formula to obtain a difference term:
[formula image FDA0003299553430000021]
wherein $G_t$ is the device weighting feature at time t, $G_{t-1}$ is the device weighting feature at time t-1, and [formula image FDA0003299553430000022] is the difference term.
6. The multi-type sensing device-based data analysis method according to claim 5, wherein the formula of the Prophet model in step S4 is as follows:
$y(t) = g(t) + s(t) + h(t) + \varepsilon_t$;
wherein $g(t)$ is the trend term, $s(t)$ is the periodic term, $h(t)$ is the holiday term, $\varepsilon_t$ is the error term, and the difference term [formula image FDA0003299553430000023] is used as G(t), the input to the Prophet model.
7. The multi-type sensing device-based data analysis method according to claim 1, wherein the prediction mode of the Prophet model comprises:
predicting the upper and lower boundaries of the predicted value at time t+1 from the time series data at time t, comparing the actual data at time t+1 with the upper and lower boundaries of the predicted value at time t+1, and judging that the operation state of the sensing equipment at time t+1 is abnormal if the actual data exceeds those boundaries, and normal otherwise; or
inputting the time series data at time t+1 and the time series data at time t into the Prophet model for retraining, and checking whether the newly entered point is an abnormal point; if so, judging that the operation state of the sensing equipment at time t+1 is abnormal, and otherwise judging that it is normal.
8. A data analysis device based on multi-type perception equipment is characterized by comprising:
the data acquisition module is configured to acquire real-time acquired edge calculation data of the sensing equipment;
the preprocessing module is configured to preprocess the edge calculation data to obtain time sequence data;
the fitting module is configured to fit the time series data in normal distribution to obtain a fitting item;
and the model prediction module is configured to input the fitting item into the trained Prophet model, output a predicted value corresponding to future time and upper and lower boundaries of the predicted value, and judge the running state of the sensing equipment according to actual data and the upper and lower boundaries of the predicted value.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202111186763.6A 2021-10-12 2021-10-12 Data analysis method and device based on multi-type sensing equipment and readable medium Pending CN114118676A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111186763.6A CN114118676A (en) 2021-10-12 2021-10-12 Data analysis method and device based on multi-type sensing equipment and readable medium

Publications (1)

Publication Number Publication Date
CN114118676A true CN114118676A (en) 2022-03-01

Family

ID=80441803

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111186763.6A Pending CN114118676A (en) 2021-10-12 2021-10-12 Data analysis method and device based on multi-type sensing equipment and readable medium

Country Status (1)

Country Link
CN (1) CN114118676A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114756604A (en) * 2022-06-13 2022-07-15 西南交通大学 Monitoring time sequence data prediction method based on Prophet combination model
CN114756604B (en) * 2022-06-13 2022-09-09 西南交通大学 Monitoring time sequence data prediction method based on Prophet combination model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination