CN116864020B - Data management system applied to EGDA generation process - Google Patents
- Publication number
- CN116864020B (CN202311132545.3A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/10—Analysis or design of chemical reactions, syntheses or processes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention relates to the field of data processing, and in particular to a data management system applied to an EGDA generation process. The system comprises the following modules. The data acquisition and preprocessing module acquires temperature time series data and residual time series data. The data analysis module obtains an aggregation degree from the differences between the coordinates of the residual values; acquires a target point and, from the target point, each extreme point; obtains each small sequence segment and its trend characteristic; obtains a characteristic trend distribution degree from the trend characteristics of the small sequence segments; obtains a suspected esterification abnormality degree from the aggregation degree and the characteristic trend distribution degree; and obtains marked data points from the suspected esterification abnormality degree. The data management module acquires abnormal points, eliminates data from the residual time series data according to the abnormal points and the marked data points, and stores and manages the residual time series data after elimination. By applying this data processing method, the invention makes abnormality detection results more accurate in the EGDA production scenario.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a data management system applied to an EGDA generation process.
Background
Ethylene glycol diacetate (EGDA) is an organic compound, also known as ethylene diacetate. It is an ester compound prepared by the reaction of ethylene glycol and acetic acid. Ethylene glycol diacetate is commonly used as an organic solvent and is widely applied in paints, inks, adhesives, cleaning agents, and similar fields. It has good solubility and volatility and can dissolve many organic and inorganic substances. In addition, owing to its lower toxicity and volatility, ethylene glycol diacetate is considered a more environmentally friendly alternative in many industrial applications.
During the production of EGDA, a large amount of data is generated and must be collected, such as reaction time, reaction temperature, reactant concentration, and catalyst concentration. Control of the reaction temperature during preparation is especially critical: an unsuitable reaction temperature not only slows the reaction but can also lead to incomplete reaction and reduced product quality, and it even carries a safety risk from temperature runaway. It is therefore important to keep the reaction temperature within the proper range during preparation. Traditional temperature anomaly detection for EGDA production generally applies STL time series decomposition and then 3σ segmentation to the resulting residuals, so that data with a large degree of outlierness are identified as anomalies.
In the industrial production of EGDA, the ambient temperature is controlled at 110-150 ℃, so the process temperature fluctuates within a certain range. The traditional approach to detecting temperature anomalies in production generally uses STL time series decomposition to obtain a residual term and regards points with excessive fluctuation amplitude, such as sudden drops or rises in temperature, as abnormal points. Direct esterification of ethylene glycol with acetic acid is the most common synthetic route to EGDA. The esterification reaction is exothermic: bond breaking absorbs heat at the start, and the subsequent combination into an ester releases a large amount of heat. EGDA generation therefore involves a small heat absorption followed by a large heat release, which makes the temperature rise or fall. The conventional STL-based approach captures the heat change produced by the esterification reaction at the moment of EGDA generation as an anomaly, so it is of limited use for detecting temperature anomalies in this scenario. The present method builds a characteristic model of the EGDA temperature change from the temperature changes in the current scenario, thereby obtaining a more accurate abnormal point detection result and enabling accurate monitoring and management of the temperature data.
Disclosure of Invention
The invention provides a data management system applied to an EGDA generation process, which aims to solve the existing problems.
The data management system applied to the EGDA generation process adopts the following technical solution:
One embodiment of the present invention provides a data management system applied to an EGDA generation process, the system including the following modules:
a data acquisition and preprocessing module, which acquires temperature time series data in the EGDA generation process and decomposes it to obtain residual time series data, wherein the residual time series data comprises a plurality of data points and each data point represents the residual value at one moment;
a data analysis module, which obtains a time series change curve of the residual from the residual time series data, obtains the coordinates of each residual value in the curve, and obtains the aggregation degree of each data point in the residual time series data from the differences between the coordinates of the residual values;
acquires the data point closest to the maximum value of the time series change curve of the residual and records it as the target point;
acquires a plurality of extreme points nearest to the target point, obtains a plurality of small sequence segments from the target point and the extreme points, obtains the trend characteristic of each small sequence segment, and obtains the characteristic trend distribution degree of the data points in the residual time series data from the trend characteristics of all the small sequence segments;
obtains the suspected esterification abnormality degree of each data point in the residual time series data from the aggregation degree of each data point and the characteristic trend distribution degree of the data points in the residual time series data;
obtains marked data points from the suspected esterification abnormality degree of each data point in the residual time series data;
and a data management module, which obtains abnormal points from the residual time series data and raises an alarm when a data point is both an abnormal point and a marked data point, thereby realizing safety management of the EGDA production process.
Further, the specific acquisition steps of the residual time sequence data are as follows:
STL time series decomposition is applied to the temperature time series data to obtain the residual time series data.
Further, the specific step of obtaining the aggregation degree of each data point in the residual time sequence data is as follows:
the formula for the aggregation level of each data point in the residual time series data is as follows:
(The formula and its symbols are not reproduced in this text.) The quantities in the formula are: the abscissa and ordinate of the i-th data point in the residual time series data; the abscissa and ordinate of the (i+1)-th data point in the residual time series data; the number of elements in the residual time series data; the hyperbolic tangent function tanh; the aggregation degree of the i-th data point in the residual time series data; and the number of data points in the neighborhood that is centered on the i-th data point in the residual sequence and has the radius defined in the formula.
Further, the acquiring the plurality of extreme points closest to the target point includes the following specific steps:
the extreme points comprise two minimum value points and two maximum value points;
acquiring the minimum point that is to the left of the target point and closest to it, and recording it as the first minimum point; acquiring the maximum point that is to the left of the target point and closest to it, and recording it as the first maximum point; acquiring the minimum point that is to the right of the target point and closest to it, and recording it as the second minimum point; and acquiring the maximum point that is to the right of the target point and closest to it, and recording it as the second maximum point.
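The search for the four extreme points described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the residual sequence is treated as a plain list of values, a local extremum is assumed to be a point strictly lower or higher than both neighbors, and the function names `local_extrema` and `nearest_extrema` are hypothetical.

```python
def local_extrema(values):
    """Return (minima, maxima) index lists.

    Assumption: a point is a local extremum when it is strictly
    lower/higher than both of its immediate neighbors.
    """
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima.append(i)
        elif values[i] > values[i - 1] and values[i] > values[i + 1]:
            maxima.append(i)
    return minima, maxima


def nearest_extrema(values, target):
    """Find the extreme points closest to index `target` on each side.

    An entry is None when no such extreme point exists on that side.
    """
    minima, maxima = local_extrema(values)
    return {
        "first_min": max((i for i in minima if i < target), default=None),
        "first_max": max((i for i in maxima if i < target), default=None),
        "second_min": min((i for i in minima if i > target), default=None),
        "second_max": min((i for i in maxima if i > target), default=None),
    }


# Illustrative residual values (made up for this sketch); index 5 is the target.
residuals = [0.0, 1.0, 0.2, 0.8, -0.5, 2.0, -0.4, 0.9, 0.1, 1.1, 0.0]
result = nearest_extrema(residuals, target=5)
print(result)
```

Returning None for a missing extreme point lets a caller detect a target point that lies too close to either end of the sequence.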
Further, the obtaining a plurality of small sequence segments according to the target point and each extreme point comprises the following specific steps:
acquiring the residual data points from the target point to the first minimum point and recording them as one small sequence segment; acquiring the residual data points from the target point to the second minimum point and recording them as one small sequence segment; acquiring the residual data points from the first minimum point to the first maximum point and recording them as one small sequence segment; and acquiring the residual data points from the second minimum point to the second maximum point and recording them as one small sequence segment.
Further, the specific acquisition steps of the trend characteristic of each small sequence segment are as follows:
for any small sequence segment, first obtain the slopes between adjacent data points in the segment; the slopes between all adjacent data points form a slope sequence. Obtain the differences between adjacent slopes in the slope sequence, recorded as first differences, and add all the first differences to obtain the trend characteristic of the segment, where a difference means the absolute value of the difference.
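The trend characteristic described above can be sketched in Python as follows. This is a minimal illustration, assuming each data point is given by a time coordinate and a residual value; the function name `trend_feature` and the sample values are hypothetical.

```python
def trend_feature(times, values):
    """Sum of absolute differences between adjacent slopes in a segment.

    A segment with fewer than three points has at most one slope and
    therefore no adjacent-slope difference; 0.0 is returned in that case
    (an assumption, since the text does not cover this case).
    """
    if len(values) < 3:
        return 0.0
    # Slopes between each pair of adjacent data points.
    slopes = [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(values) - 1)
    ]
    # "First differences": absolute differences between adjacent slopes.
    return sum(abs(slopes[k + 1] - slopes[k]) for k in range(len(slopes) - 1))


smooth = trend_feature([0, 1, 2, 3], [0.0, 1.0, 2.0, 3.0])  # constant slope
jagged = trend_feature([0, 1, 2, 3], [0.0, 2.0, 1.0, 3.0])  # slopes 2, -1, 2
print(smooth, jagged)
```

A segment that rises or falls at a constant rate scores 0, while a zigzag segment scores high, which matches the idea that a smooth trend has a small trend characteristic.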
Further, the specific step of obtaining the characteristic trend distribution degree of the data points in the residual time sequence data is as follows:
the formula of the characteristic trend distribution degree of the data points in the residual time sequence data is as follows:
(The formula and its symbols are not reproduced in this text.) The quantities in the formula are: the number of data points in each of the four small sequence segments; the trend characteristic of each of the four small sequence segments; and the characteristic trend distribution degree of the data points in the residual time series data.
Further, the specific acquisition steps of the suspected esterification abnormality degree of each data point in the residual sequence data are as follows:
the formula of the suspected esterification abnormality degree of each data point in the residual sequence data is as follows:
(The formula and its symbols are not reproduced in this text.) The quantities in the formula are: the aggregation degree of the i-th data point in the residual time series data; the characteristic trend distribution degree of the data points in the residual time series data; two preset thresholds; the exponential function with the natural constant as its base; and the suspected esterification abnormality degree of the i-th data point in the residual time series data.
Further, the specific acquisition steps of the marker data points are as follows:
Residual data points whose suspected esterification abnormality degree is higher than the classification threshold Y are recorded as marked data points.
Further, the specific acquisition steps of the abnormal points are as follows:
The mean of the residual time series data is obtained from all the data in the residual time series data, and the standard deviation of the residual time series data is likewise obtained from all the data. The data in the residual time series data whose values fall in the range defined by the mean and the standard deviation are recorded as abnormal points. (The symbols for the mean and the standard deviation and the bounds of the range are not reproduced in this text.)
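Because the bounds of the range are lost in this text, the following Python sketch falls back on the conventional 3σ rule mentioned in the Background: a residual is flagged when it lies more than k standard deviations from the mean. Both the multiplier k = 3 and the choice to flag values outside the band are assumptions, and the function name `three_sigma_outliers` is hypothetical.

```python
from statistics import fmean, pstdev


def three_sigma_outliers(residuals, k=3.0):
    """Return indices of residuals farther than k standard deviations from the mean.

    Assumption: the conventional 3-sigma rule (k = 3, population standard
    deviation); the exact bounds used in the patent are not recoverable here.
    """
    mu = fmean(residuals)
    sigma = pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r - mu) > k * sigma]


# Illustrative data: twenty residuals at zero and one large spike.
sample = [0.0] * 20 + [10.0]
outlier_indices = three_sigma_outliers(sample)
print(outlier_indices)
```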
The beneficial effect of the technical solution of the invention is as follows: the characteristics of the temperature change of the esterification reaction are used to obtain the data points whose change is caused by that temperature change, which appear as deviation points in the residual; by eliminating these data points, the temperature outliers caused by genuine anomalies are obtained.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block flow diagram of a data management system of the present invention as applied to an EGDA generation process.
Detailed Description
In order to further describe the technical means and effects adopted by the present invention to achieve the preset purposes, the following detailed description refers to the specific implementation, structure, characteristics and effects of the data management system for the EGDA generating process according to the present invention with reference to the accompanying drawings and the preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The following specifically describes a specific scheme of the data management system applied to the EGDA generating process provided by the present invention with reference to the accompanying drawings.
Referring to fig. 1, a block flow diagram of a data management system applied to an EGDA generation process according to one embodiment of the present invention is shown, where the system includes the following blocks:
Module 101: data acquisition and preprocessing module.
It should be noted that, since the temperature is controlled within a certain range during EGDA production and temperature control is critical, this embodiment analyzes the temperature change during EGDA production.
Specifically, temperature time series data over one day are collected by a temperature monitor of the industrial EGDA process. A coordinate system is constructed with time in seconds as the horizontal axis and temperature, starting from 100, as the vertical axis. The time series change curve of the temperature data is obtained in this coordinate system from the temperature time series data. STL time series decomposition is applied to the time series change curve of the temperature data, decomposing it into three terms: trend term time series data, seasonal term time series data, and residual time series data, wherein the residual time series data comprises a plurality of data points and each data point represents the residual value at one moment.
At this point, the residual time series data have been obtained.
Module 102: data analysis module.
It should be noted that the ambient temperature data in the EGDA generation process are time series data that fluctuate up and down. When the residuals obtained by STL time series decomposition are plotted on coordinate axes, most of them cluster near the central axis, and data closer to the central axis usually have a smaller degree of abnormality. However, this separates abnormal points only by their distance from the central axis and does not consider the outlying data points caused by the esterification reaction in the current scenario. Therefore, abnormal points and data points suspected to stem from the esterification reaction are distinguished by their characteristics: a suspected esterification abnormality degree evaluation model is constructed by combining the spatial aggregation characteristic formed by the point density of the temperature residuals of the esterification reaction with the trend characteristic of the point arrangement. Residual data points are distinguished and classified with this model, the abnormal residual points caused by the esterification reaction are obtained, and they are excluded from the final analysis of abnormal residual points.
(1) Acquire the aggregation degree of each data point in the residual time series data.
It should be further noted that the time series change curve of the residual can be obtained from the residual time series data. The aggregation degree of the residual data points is the aggregation characteristic of the data in the plane of the curve image, so the density aggregation of the data is analyzed. Borrowing part of the idea of DBSCAN density clustering, a search is performed in a circular area centered on each data point with a given radius; the radius is usually taken as the average distance between data points, i.e., the average density. Within the same unit of time, abnormal data points mostly show sudden changes and a large degree of outlierness, so only one point, or even no point, lies within the radius around an abnormal point. The esterification reaction is a chemical reaction process, and within the same unit of time its data points change steadily and uniformly, so two to three points should lie within the radius around such a point. In the time series change curve, a point that is an inflection point is more likely to have more adjacent points within the radius of its circular area, and hence a higher aggregation degree. Since the data points misjudged as abnormal points are usually inflection points located at the highest or lowest point, the likelihood of suspected esterification abnormality for an inflection point is increased.
Specifically, a time series change curve of the residual is obtained by fitting the residual time series data, the coordinates of each residual value in the curve are obtained, and the aggregation degree of the data points in the residual time series is obtained from the differences between the coordinates of the residual values. The formula is as follows:
(The formula and its symbols are not reproduced in this text.) The quantities in the formula are: the abscissa and ordinate of the i-th data point in the residual time series data; the abscissa and ordinate of the (i+1)-th data point in the residual time series data; the number of elements in the residual time series data; the hyperbolic tangent function tanh; the aggregation degree of the i-th data point in the residual time series data; and the number of data points in the neighborhood that is centered on the i-th data point in the residual sequence and has the radius defined in the formula.
Here, the Euclidean distance between the (i+1)-th data point and the i-th data point is the straight-line distance between the two coordinate points. The sum of these distances is taken over the n data points, and their average, i.e., the mean distance between the n adjacent data points, is the radius of the circular neighborhood. The search is performed in the circular neighborhood of this radius centered on the data point, and its result is the number of adjacent data points present in the neighborhood. The first 0.5 in the formula keeps the value within the optimal range; the relations stated in the original for the resulting value (relative to 1) and for the final normalized tanh result (relative to 0.5) are not recoverable from this text. The purpose is to make the final normalization effect better. The aggregation degree of each data point in the residual is thus obtained.
Note that the time series change curve in this embodiment is obtained by least-squares fitting of a 5th-degree polynomial.
Thus, the aggregation degree of each data point in the residual time sequence data is obtained.
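The neighborhood search behind the aggregation degree can be sketched in Python as follows. This is a partial illustration only: it computes the radius as the mean distance between adjacent data points and counts the neighbors within that radius, but it stops before the final tanh mapping because that formula is not reproduced in this text. Dividing by the number of adjacent pairs and excluding the point itself from its own count are assumptions, and the function name `neighbor_counts` is hypothetical.

```python
import math


def neighbor_counts(points):
    """Return (radius, counts) for a list of (time, residual) points.

    radius: mean Euclidean distance between adjacent points
            (assumption: averaged over the number of adjacent pairs).
    counts: for each point, how many OTHER points lie within `radius`
            (assumption: the point itself is not counted).
    """
    gaps = [math.dist(points[i], points[i + 1]) for i in range(len(points) - 1)]
    radius = sum(gaps) / len(gaps)
    counts = [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]
    return radius, counts


# Illustrative points: a flat run with one spike at t = 3.
pts = [(0, 0.0), (1, 0.0), (2, 0.0), (3, 5.0), (4, 0.0)]
radius, counts = neighbor_counts(pts)
print(radius, counts)
```

In this example the spike has no neighbors within the radius while the flat points have two or three, which mirrors the distinction the text draws between isolated abnormal points and steadily changing esterification points.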
(2) Obtain the characteristic trend distribution degree of the data points in the residual time series data.
The trend distribution characteristic is analyzed using the same time series change curve of the residual. Each cycle of the seasonal term separated from the original temperature data sequence appears in the graph as a short-period, small-amplitude pattern, whereas the exothermic process of the esterification reaction during mass production is smooth and uniform and appears in the temperature data sequence as a long-period, large-amplitude pattern. The shape of the residual obtained by subtracting the seasonal term from the part of the original sequence that corresponds to the esterification reaction can therefore be inferred: over time, it gradually moves away from the central axis and, after reaching the farthest point, gradually returns to the central axis in the same regular manner.
Further, since the esterification-reaction points detected as abnormal points are usually located at the top of the temperature change curve, the trend characteristic is detected starting from the top. First, all vertices in the residual term are found, a vertex being a maximum on the time series change curve of the residual. Starting from the vertex, the slope changes of the data on the left and right sides differ little; after the valley point is reached there is a first large slope change, and the slope changes between the following data points are again small. The total number of such regular trends is uncertain, so only the two curves around the peak point are taken to indicate that the peak point is the peak of the temperature change curve of the esterification reaction. The overall degree of trend change of a curve is reflected by the cumulative sum and the mean of the slope differences between adjacent points on the curve; a smaller mean indicates a small degree of change, i.e., a smooth trend distribution curve. This yields a trend distribution degree that reflects the trend characteristic of the esterification reaction temperature change data.
Specifically, the data point closest to the maximum value of the time series change curve of the residual is acquired and recorded as the target point. When the target point lies at the far left or far right, so that two extreme points cannot be found on each side as required later, the data point closest to the second-largest maximum may be selected as the target point instead. If two extreme points still do not exist on each side, the data point closest to the third-largest maximum is selected, and so on, until two extreme points exist on each side of the target point, at which point target point selection is complete.
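The fallback selection of the target point can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the residual values themselves stand in for the fitted curve, candidates are tried in descending order of value as a simple reading of "second-largest" and "third-largest" maximum, and a candidate is accepted once a local minimum and a local maximum exist on each side. The function names are hypothetical.

```python
def local_extrema(values):
    """Return (minima, maxima) index lists using strict neighbor comparison."""
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            minima.append(i)
        elif values[i] > values[i - 1] and values[i] > values[i + 1]:
            maxima.append(i)
    return minima, maxima


def select_target(values):
    """Pick the highest point that has a minimum and a maximum on both sides.

    Returns the index of the target point, or None if no point qualifies.
    """
    minima, maxima = local_extrema(values)
    # Try candidates from the largest value downward.
    for idx in sorted(range(len(values)), key=lambda i: values[i], reverse=True):
        left_ok = any(i < idx for i in minima) and any(i < idx for i in maxima)
        right_ok = any(i > idx for i in minima) and any(i > idx for i in maxima)
        if left_ok and right_ok:
            return idx
    return None


# Illustrative residuals: the global maximum sits at the left edge (index 0),
# so it is skipped in favor of the next-highest qualifying point.
residuals = [5.0, 0.0, 1.0, 0.2, 3.0, 0.1, 0.9, 0.0, 0.5]
target = select_target(residuals)
print(target)
```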
The characteristic trend distribution degree of the data points in the residual time series is then obtained from the differences between the slopes of the data points in the residual time series.
First, all maximum points and minimum points are obtained from the time-series change curve of the residual. Because the target point is itself a maximum on this curve, four extreme points are located relative to it: the minimum point to the left of the target point and closest to it, recorded as the first minimum point; the maximum point to the left of the target point and closest to it, recorded as the first maximum point; the minimum point to the right of the target point and closest to it, recorded as the second minimum point; and the maximum point to the right of the target point and closest to it, recorded as the second maximum point.
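The target point selection and the search for the four nearest extreme points can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `local_extrema` and `select_target` are hypothetical, extreme points are assumed to be strict interior local extrema, and candidates are tried in descending order of residual value as a stand-in for "closest to the maximum, then the second-largest value, and so on".

```python
from typing import List, Optional, Tuple


def local_extrema(y: List[float]) -> Tuple[List[int], List[int]]:
    """Return (maxima, minima) as index lists of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(y) - 1):
        if y[i] > y[i - 1] and y[i] > y[i + 1]:
            maxima.append(i)
        elif y[i] < y[i - 1] and y[i] < y[i + 1]:
            minima.append(i)
    return maxima, minima


def select_target(y: List[float]) -> Optional[Tuple[int, int, int, int, int]]:
    """Pick the highest maximum that has a minimum and a maximum on each side.

    Returns (target, first_min, first_max, second_min, second_max) as indices,
    or None when no candidate satisfies the requirement.
    """
    maxima, minima = local_extrema(y)
    # Assumption: try candidates from the largest residual value downwards.
    for t in sorted(maxima, key=lambda i: y[i], reverse=True):
        left_min = [i for i in minima if i < t]
        left_max = [i for i in maxima if i < t]
        right_min = [i for i in minima if i > t]
        right_max = [i for i in maxima if i > t]
        if left_min and left_max and right_min and right_max:
            # Nearest extreme points: last one on the left, first one on the right.
            return t, left_min[-1], left_max[-1], right_min[0], right_max[0]
    return None


residual = [0, 2, 1, 3, 0, 5, 1, 4, 2, 3, 0]  # illustrative values only
print(select_target(residual))
```

Returning `None` rather than raising leaves the caller free to decide how to handle a residual curve that has too few extreme points.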
The residual data points from the target point to the first minimum point (including both the target point and the first minimum point) are acquired and recorded as the first small sequence segment, and the number of data in this segment is recorded. The residual data points from the target point to the second minimum point (including both the target point and the second minimum point) are acquired and recorded as the second small sequence segment, and the number of data in this segment is recorded. The residual data points between the first minimum point and the first maximum point (including the first maximum point but not the first minimum point) are acquired and recorded as the third small sequence segment, and the number of data in this segment is recorded. The residual data points between the second minimum point and the second maximum point (including the second maximum point but not the second minimum point) are acquired and recorded as the fourth small sequence segment, and the number of data in this segment is recorded.
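The inclusion rules for the four small sequence segments map directly onto list slices. The sketch below is illustrative only: the function name `small_segments` and the index-based interface are assumptions, and it presumes the extreme points alternate so that the first maximum lies to the left of the first minimum and the second maximum lies to the right of the second minimum.

```python
from typing import List, Tuple


def small_segments(
    y: List[float],
    target: int,
    first_min: int,
    first_max: int,
    second_min: int,
    second_max: int,
) -> Tuple[List[float], List[float], List[float], List[float]]:
    """Cut the four small sequence segments around the target point."""
    if not (first_max < first_min < target < second_min < second_max):
        raise ValueError("extreme points are not in the expected order")
    seg1 = y[first_min:target + 1]           # first minimum .. target, both included
    seg2 = y[target:second_min + 1]          # target .. second minimum, both included
    seg3 = y[first_max:first_min]            # first maximum included, first minimum excluded
    seg4 = y[second_min + 1:second_max + 1]  # second minimum excluded, second maximum included
    return seg1, seg2, seg3, seg4


residual = [1, 3, 2, 1, 0, 2, 6, 3, 1, 2, 4, 3]  # illustrative values only
for seg in small_segments(residual, 6, 4, 1, 8, 10):
    print(len(seg), seg)
```

The length of each returned list is the "number of data" of the corresponding segment.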
Then the slopes between adjacent data points of the first small sequence segment are calculated, and the trend feature of the first small sequence segment is obtained from the differences between adjacent slopes. The trend features of the second, third and fourth small sequence segments are obtained in the same way.
The formulas for the four trend features use the following quantities: the i-th slope between data points; the (i+1)-th slope between data points; the number of data in each of the four small sequence segments; and the trend feature of each of the four small sequence segments.
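The trend feature of a single small sequence segment can be sketched as below, following the wording of claim 6: take the slopes between adjacent data points, then sum the absolute differences between adjacent slopes. The function name `trend_feature` and the optional `times` argument are assumptions; when no time stamps are supplied, unit spacing between samples is assumed.

```python
from typing import List, Optional


def trend_feature(values: List[float], times: Optional[List[float]] = None) -> float:
    """Sum of absolute differences between adjacent slopes of one segment."""
    if times is None:
        # Assumption: equally spaced samples with unit time step.
        times = [float(i) for i in range(len(values))]
    slopes = [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]
    # A segment with fewer than three points has no slope difference and yields 0.
    return float(sum(abs(slopes[i + 1] - slopes[i]) for i in range(len(slopes) - 1)))


print(trend_feature([0, 2, 6]))     # slopes 2 and 4
print(trend_feature([1, 2, 3, 4]))  # straight line, all slopes equal
```

A straight segment gives a trend feature of zero, while a segment whose slope keeps changing gives a larger value, which matches the description of a small value indicating a smooth trend.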
Based on the trend features of the four small sequence segments, the characteristic trend distribution degree of the data points in the residual time series data is obtained.
The formula for the characteristic trend distribution degree uses the number of data in each of the four small sequence segments and the trend feature of each of the four small sequence segments, and yields the characteristic trend distribution degree of the data points in the residual time series data. The slope differences obtained in each segment are accumulated and averaged; because the averaging reduces the result as the number of data points sharing the same trend increases, it amplifies the characteristic shape of the esterification reaction temperature change curve.
The characteristic trend distribution degree of the data points in the residual time series data is thus obtained.
(3) Construct a suspected esterification abnormality degree evaluation model of the data according to the aggregation degree and the characteristic trend distribution degree of the data points in the residual time series.
Specifically, two threshold values are preset. This embodiment is described using one example pair of values, which is not a specific limitation; the two values may be set according to the particular implementation. The aggregation degree and the characteristic trend distribution degree are weighted to construct the suspected esterification abnormality degree evaluation model of the data.
The model uses the following quantities: the aggregation degree of the i-th data point in the residual sequence data; the characteristic trend distribution degree of the data points in the residual time series data; the two preset threshold values; the exponential function with the natural constant as its base; and the suspected esterification abnormality degree of the i-th data point in the residual sequence data.
Thus, the suspected esterification abnormality degree of each data point in the residual sequence is obtained.
(4) Classify the residual time series data using the suspected esterification abnormality degree evaluation model.
After normalization, an aggregation degree D greater than or equal to 0.5 indicates that the aggregation density around the data point is high and that the suspected esterification abnormality degree of the data is correspondingly high. For the characteristic trend distribution degree T, the value 1 is taken as the level at which the curve essentially reflects an esterification reaction temperature change. By weight matching, the final classification threshold Y is therefore 0.44. Residual data points with a suspected esterification abnormality degree higher than 0.44 are recorded as marker data points; residual data points with a suspected esterification abnormality degree less than or equal to 0.44 are recorded as non-marker data points.
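The classification step itself is a strict threshold comparison, which can be sketched as follows. The function name `mark_points` is hypothetical, and the abnormality degrees passed in are assumed to have been computed already by the evaluation model; only the threshold value 0.44 and the strict "higher than" comparison come from the text.

```python
from typing import List


def mark_points(degrees: List[float], threshold: float = 0.44) -> List[bool]:
    """Return True for marker data points, False for non-marker data points."""
    # Strictly greater than the threshold: a degree equal to 0.44 is a non-marker point.
    return [d > threshold for d in degrees]


degrees = [0.10, 0.44, 0.45, 0.90]  # illustrative values only
print(mark_points(degrees))
```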
Module 103: data management module.
The mean of the residual time series data is obtained from all the data in the residual time series data, and the standard deviation of the residual time series data is likewise obtained from all the data. The data in the residual time series data whose values fall in the range determined by the mean and the standard deviation are marked as abnormal points. Residual data points with a suspected esterification abnormality degree higher than 0.44 are eliminated from all the abnormal points; when a data point is both a marker data point and an abnormal point, the temperature corresponding to that data point is abnormal, which indicates an abnormal condition in the EGDA generation process. A warning is required at this time, thereby realizing safety management of the EGDA production process.
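The alarm condition of the data management module can be sketched as below. Several details are assumptions rather than statements of the patent: the exact bounds of the abnormal range are not legible in this text, so the sketch uses the common rule of flagging values more than `k` standard deviations from the mean with `k = 3`; the population standard deviation is used; and the function names `outlier_flags` and `alarm_indices` are hypothetical. The alarm rule itself (a point that is both an abnormal point and a marker data point) follows claim 1.

```python
import statistics
from typing import List


def outlier_flags(residuals: List[float], k: float = 3.0) -> List[bool]:
    """Flag values farther than k standard deviations from the mean (assumed rule)."""
    mu = statistics.mean(residuals)
    sigma = statistics.pstdev(residuals)  # population standard deviation
    return [abs(r - mu) > k * sigma for r in residuals]


def alarm_indices(residuals: List[float], marked: List[bool], k: float = 3.0) -> List[int]:
    """Indices of data points that are both abnormal points and marker data points."""
    flags = outlier_flags(residuals, k)
    return [i for i, (is_out, is_marked) in enumerate(zip(flags, marked)) if is_out and is_marked]


residuals = [0.0] * 19 + [10.0]   # illustrative values only
marked = [False] * 19 + [True]    # output of the classification step
print(alarm_indices(residuals, marked))
```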
This embodiment is completed.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.
Claims (10)
1. A data management system for use in an EGDA generation process, the system comprising:
the data acquisition and preprocessing module acquires temperature time sequence data in the EGDA generation process, and decomposes and acquires residual time sequence data, wherein the residual time sequence data comprises a plurality of data points, and each data point represents a residual value at each moment;
the data analysis module acquires a time sequence change curve of the residual according to the residual time sequence data, acquires coordinates of each residual value in the curve, and acquires the aggregation degree of each data point in the residual time sequence data according to the difference between the coordinates of each residual value;
acquiring a data point closest to the maximum value according to the time sequence change curve of the residual error, and marking the data point as a target point;
acquiring a plurality of extreme points nearest to a target point, acquiring a plurality of small sequence fragments according to the target point and each extreme point, acquiring trend characteristics of each small sequence fragment, and acquiring characteristic trend distribution degrees of data points in residual time sequence data according to the trend characteristics of all the small sequence fragments;
obtaining suspected esterification abnormality degree of each data point in the residual sequence data according to aggregation degree of each data point in the residual sequence data and characteristic trend distribution degree of the data points in the residual sequence data;
obtaining a marked data point according to the suspected esterification abnormality degree of each data point in the residual sequence data;
and the data management module is used for obtaining abnormal points according to the residual time sequence data, alarming when the data points are both abnormal points and marked data points, and realizing the safety management of the EGDA production process.
2. The data management system applied to the EGDA generating process according to claim 1, wherein the specific acquisition steps of the residual time series data are as follows:
applying STL time-series decomposition to the temperature time series data to obtain the residual time series data.
3. The data management system applied to the EGDA generating process according to claim 1, wherein the specific obtaining step of the aggregation degree of each data point in the residual time series data is as follows:
the formula for the aggregation level of each data point in the residual time series data is as follows:
wherein the formula uses: the abscissa and ordinate of the i-th data point in the residual time series data; the abscissa and ordinate of the (i+1)-th data point in the residual time series data; the number of elements in the residual time series data; the hyperbolic tangent function; the aggregation degree of the i-th data point in the residual sequence data; and the number of data points in a neighborhood of given radius centered on the i-th data point in the residual sequence.
4. The data management system applied to the EGDA generating process according to claim 1, wherein the acquiring the plurality of extreme points closest to the target point comprises the following specific steps:
the extreme points comprise two minimum value points and two maximum value points;
acquiring a minimum point which is left of the target point and closest to the target point, and marking the minimum point as a first minimum point; acquiring a maximum point which is left of the target point and closest to the target point, and marking the maximum point as a first maximum point; acquiring a minimum point which is right of the target point and closest to the target point, and marking the minimum point as a second minimum point; and acquiring a maximum point which is right to the target point and closest to the target point, and recording the maximum point as a second maximum point.
5. The data management system applied to the EGDA generating process according to claim 4, wherein the obtaining a plurality of small sequence segments according to the target point and each extreme point comprises the following specific steps:
acquiring the residual data points from the target point to the first minimum point, and recording them as the first small sequence segment; acquiring the residual data points from the target point to the second minimum point, and recording them as the second small sequence segment; acquiring the residual data points from the first minimum point to the first maximum point, and recording them as the third small sequence segment; acquiring the residual data points from the second minimum point to the second maximum point, and recording them as the fourth small sequence segment.
6. The data management system for use in an EGDA generation process according to claim 1, wherein the specific acquisition steps of the trend feature of each small sequence segment are as follows:
for any small sequence segment, firstly acquiring slopes between adjacent data points in the small sequence segment, forming a slope sequence by slopes between all adjacent data points, acquiring differences between adjacent slopes in the slope sequence, marking the differences as first differences, and adding all the first differences to obtain trend characteristics of each small sequence segment; wherein the difference represents the absolute value of the difference.
7. The data management system applied to the EGDA generating process according to claim 5, wherein the specific obtaining step of the characteristic trend distribution degree of the data points in the residual time series data is as follows:
the formula of the characteristic trend distribution degree of the data points in the residual time sequence data is as follows:
wherein the formula uses: the number of data in each of the four small sequence segments; the trend feature of each of the four small sequence segments; and the characteristic trend distribution degree of the data points in the residual time series data.
8. The data management system applied to the EGDA generating process according to claim 1, wherein the specific obtaining step of the suspected esterification abnormality degree of each data point in the residual sequence data is as follows:
the formula of the suspected esterification abnormality degree of each data point in the residual sequence data is as follows:
wherein the formula uses: the aggregation degree of the i-th data point in the residual sequence data; the characteristic trend distribution degree of the data points in the residual time series data; two preset threshold values; the exponential function with the natural constant as its base; and the suspected esterification abnormality degree of the i-th data point in the residual sequence data.
9. The data management system for use in an EGDA generation process according to claim 1, wherein the specific acquisition steps of the marker data points are as follows:
and marking residual data points with suspected esterification abnormality degrees higher than the classification threshold Y as marker data points.
10. The data management system applied to the EGDA generation process according to claim 1, wherein the specific acquisition steps of the outlier are as follows:
obtaining the mean of the residual time series data from all the data in the residual time series data; obtaining the standard deviation of the residual time series data from all the data in the residual time series data; and acquiring the data in the residual time series data whose values fall in the range determined by the mean and the standard deviation, and recording these data as abnormal points.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311132545.3A CN116864020B (en) | 2023-09-05 | 2023-09-05 | Data management system applied to EGDA generation process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116864020A (en) | 2023-10-10 |
CN116864020B (en) | 2023-11-03 |
Family
ID=88219415
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311132545.3A Active CN116864020B (en) | 2023-09-05 | 2023-09-05 | Data management system applied to EGDA generation process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116864020B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||