CN118094107B - Abnormal data detection method, abnormal data diagnostic device, and radiation thickness gauge

Info

Publication number
CN118094107B
CN118094107B CN202410411049.XA CN202410411049A CN118094107B CN 118094107 B CN118094107 B CN 118094107B CN 202410411049 A CN202410411049 A CN 202410411049A CN 118094107 B CN118094107 B CN 118094107B
Authority
CN
China
Prior art keywords
data
detection
abnormal
point
anomaly
Legal status
Active
Application number
CN202410411049.XA
Other languages
Chinese (zh)
Other versions
CN118094107A (en)
Inventor
曲海波
赵永峰
赵楠楠
Current Assignee
Beijing Hualixing Sci Tech Development Co Ltd
Original Assignee
Beijing Hualixing Sci Tech Development Co Ltd
Application filed by Beijing Hualixing Sci Tech Development Co Ltd
Priority to CN202410411049.XA
Publication of CN118094107A
Application granted
Publication of CN118094107B

Landscapes

  • Length-Measuring Devices Using Wave Or Particle Radiation (AREA)

Abstract

The invention provides an abnormal data detection method, an abnormal data diagnostic device, and a radiation thickness gauge. The abnormal data detection method comprises the following steps: collecting detection data from an X-ray thickness measuring device in real time; performing anomaly detection on the detection data to find abnormal detection data information; storing the abnormal detection data information; outputting the abnormal detection data information; and generating a diagnostic report that includes the detailed results of the anomaly detection and a fault cause analysis. The method automatically identifies anomalies from the feature data, improving sensitivity and accuracy with respect to abnormal conditions. It does not assume in advance that the data follow any specific distribution, so it is more flexible in variable practical applications. It identifies anomalies by evaluating how easily data points are isolated, and so adapts dynamically to changes in data patterns. It enables real-time monitoring and rapid fault diagnosis of the X-ray thickness measuring device, so that potential equipment faults are discovered in time, technicians are helped to analyze fault causes, maintenance strategies are optimized, and the reliability and maintenance efficiency of the equipment are effectively improved.

Description

Abnormal data detection method, abnormal data diagnostic device, and radiation thickness gauge
Technical Field
The invention relates to the technical field of thickness measurement of a plate and strip production line; more specifically, the present invention relates to an abnormal data detection method, an abnormal data diagnostic apparatus, and a radiation thickness gauge.
Background
In the technical field of plate and strip production lines such as steel plates, X-ray thickness measurement is a key technology for accurately measuring the thickness of a plate and strip. The accuracy of the X-ray thickness measurement is critical to ensure product quality and production efficiency.
Currently, with the development of plate and strip production technology and rising production requirements, existing X-ray thickness measurement systems face new challenges and requirements. Patent document CN111051812B discloses an X-ray thickness measurement system using a precursor server and proposes an abnormal state diagnosis method based on precursor data. The method collects measurement information, including parameters such as a driving voltage value, a driving current value, a tube voltage value, a tube current value, a detection signal, and a plate and strip thickness, and generates two parts of precursor data from it. The first part calculates the standard deviation of the measurement information and compares it with a preset first threshold; the second part extracts the detection signal from the measurement information, calculates the product of its variance and kurtosis, and compares the product with a preset second threshold. The method thus analyzes the measurement information statistically to generate precursor data for diagnosing abnormal states of the equipment.
However, although the existing method can diagnose abnormal states of the device to some extent, it still has several limitations, such as:
1. The statistical tools it relies on are relatively basic and limited, and cannot capture complex patterns or hidden associations in the data.
2. The thresholds used for comparison are static and fixed; they lack dynamic, adaptive capability and are not suitable for a rapidly changing data environment.
3. The analysis is mainly descriptive statistics; it lacks predictive capability, and its efficiency and accuracy are limited when processing large volumes of data or high-dimensional data.
In summary, the existing X-ray thickness measurement system and the abnormality diagnosis method still have obvious limitations and disadvantages in terms of processing complex data, adapting to new situations, providing deep insight and prediction, and the like.
Therefore, there is an urgent need to develop a more efficient and intelligent anomaly detection algorithm and anomaly data diagnostic system to improve the reliability and maintenance efficiency of the radiation thickness gauge and associated equipment.
Disclosure of Invention
In view of the above, an object of the present invention is to propose an efficient anomaly detection algorithm, a diagnostic device for executing the algorithm, and a radiation thickness gauge using the diagnostic device. The algorithm focuses on identifying abnormal data without depending on a preset statistical threshold, so as to improve sensitivity and accuracy with respect to anomalies. It does not assume in advance that the data follow any specific distribution, so it is more flexible in variable practical applications. It identifies anomalies by evaluating how easily data points are isolated, and so adapts dynamically to changes in data patterns. It enables real-time monitoring and rapid fault diagnosis of the X-ray thickness measuring device, so that equipment faults or performance degradation can be found in advance, technicians are helped to analyze fault causes, maintenance strategies are optimized, and the reliability and maintenance efficiency of the equipment are improved.
The invention provides an abnormal data detection method, which comprises the following steps:
S1, acquiring detection data from an X-ray thickness measuring device in real time; the detection data comprises at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
S2, carrying out anomaly detection on the detection data, and finding out anomaly detection data information; the method for detecting the abnormality of the detection data comprises the following steps:
S21, carrying out data preprocessing on the detection data;
S22, performing anomaly detection on the preprocessed data, wherein the anomaly detection comprises the following steps:
S221, constructing a detection data set; the data characteristics of the detection data set comprise at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
S222, randomly selecting one feature from all features of the detection data set; randomly selecting a segmentation point on the selected feature; the data set is segmented into two subsets according to the selected features and segmentation points: subset 1, subset 2;
for example, a segmentation point is randomly selected on the selected feature (e.g., the "tube voltage value"), within the range between the minimum value and the maximum value of that feature; for example, if the "tube voltage value" ranges from 80 kV to 150 kV, 120 kV might be randomly selected as the segmentation point;
in an embodiment of the invention, subset 1 contains all data points whose value on the selected feature ("tube voltage value") is less than or equal to the segmentation point (120 kV);
subset 2 contains all data points whose value on the selected feature ("tube voltage value") is greater than the segmentation point (120 kV);
S223, regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of the step S222 on each subset for iterative segmentation, and carrying out random segmentation on the data set by recursion in the iterative process to isolate abnormal points step by step;
Specifically, one feature is randomly selected from all features of the subset; randomly selecting a segmentation point on the selected feature; dividing the subset into two smaller subsets according to the selected features and the dividing points;
the segmentation process is performed recursively, and more subsets are generated for each segmentation to form a multi-level segmentation structure.
Each segmentation aims to further isolate data points, especially outliers. By repeated random partitioning, outliers can be more quickly isolated into small subsets because their values on certain features differ significantly from common points.
For example, assume that the "tube voltage value" feature is selected for the first segmentation, with 120 kV as the segmentation point. On the resulting subset, the "current detection signal" feature is selected for the second segmentation. If an outlier has values on both features that differ significantly from those of normal points, it is likely to have been isolated into a small subset containing only itself after two segmentations.
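Why an outlier tends to be isolated early can be illustrated with a one-feature estimate. The symbols m, u, v and the numbers below are illustrative assumptions, not values from the invention: suppose the outlier holds the largest value v on the selected feature, u is the largest value among the remaining points, and m is the minimum value of the feature.

```latex
% Split point p drawn uniformly from [m, v]; subset 2 holds the points with value > p.
% The outlier is alone in subset 2 exactly when u \le p < v.
P(\text{isolated by one split}) = \frac{v - u}{v - m}

% Hypothetical numbers: m = 80\,\text{kV},\; u = 125\,\text{kV},\; v = 150\,\text{kV}
P = \frac{150 - 125}{150 - 80} = \frac{25}{70} \approx 0.36
```

A point inside a dense cluster has close neighbors on both sides, so the corresponding gap, and with it the chance of being isolated by any single split, is much smaller.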
In the iterative process of the step S223, the method for gradually isolating the outliers by recursively randomly dividing the data set includes:
Creating a decision tree, wherein each partition corresponds to an internal node of the decision tree until the data points are completely isolated (i.e., cannot be further partitioned) or the maximum depth of the decision tree is reached;
For example, assume that the "tube voltage value" feature is selected for a certain division, and the division point is 120kV. The corresponding node then stores the information of this feature and the cut point. All data points with the tube voltage value of less than or equal to 120kV are divided into a left subtree, and data points with the tube voltage value of more than 120kV are divided into a right subtree;
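The recursive random segmentation described above can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the dictionary-based node layout, the names `build_tree` and `leaf_sizes`, the feature names, and the default `max_depth` are all assumptions made for the example. The sketch also restricts the random feature choice to features that still vary within the subset, which the text does not specify.

```python
import random

def build_tree(points, depth=0, max_depth=8, rng=random):
    """Recursively split `points` (dicts mapping feature name -> value) at random."""
    # Stop when the subset cannot be split further or the depth limit is reached.
    if len(points) <= 1 or depth >= max_depth:
        return {"size": len(points)}
    # Assumption of this sketch: only features that still vary are candidates,
    # so that every split has a non-degenerate range to draw from.
    candidates = [f for f in points[0]
                  if min(p[f] for p in points) < max(p[f] for p in points)]
    if not candidates:
        return {"size": len(points)}
    feature = rng.choice(candidates)
    lo = min(p[feature] for p in points)
    hi = max(p[feature] for p in points)
    split = rng.uniform(lo, hi)  # segmentation point between min and max
    left = [p for p in points if p[feature] <= split]   # subset 1
    right = [p for p in points if p[feature] > split]   # subset 2
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }

def leaf_sizes(node):
    """Collect the number of points in every leaf, left to right."""
    if "feature" not in node:
        return [node["size"]]
    return leaf_sizes(node["left"]) + leaf_sizes(node["right"])
```

Each internal node stores the selected feature and segmentation point, and each leaf stores only how many points ended up there. A forest would repeat `build_tree` many times, typically on random subsamples.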
S224, judging the abnormality degree of each data point through a decision tree; for each data point, defining the number of edges from the root node to the leaf node where the data point is located as the path length of the data point; calculating the anomaly score of each data point based on a scoring mechanism of the path length, and taking the anomaly score as a basis for judging whether the data point is anomaly;
The anomaly score of each data point is calculated from the path length according to the following formula:
s(x, n) = 2^{-E(h(x)) / c(n)} ;
wherein x is a data point;
E(h(x)) is the average path length of x over all trees;
c(n) is a normalization factor, namely the average unsuccessful search depth given the number of data points n;
E(h(x)) / c(n) is the normalized path length, such that the anomaly score is between 0 and 1;
Outliers are typically more easily isolated and therefore have shorter path lengths in the decision tree;
If the anomaly score of a data point is close to 1, its average path length is very short and it is very likely an anomaly; if the anomaly score is close to 0, its average path length is long and it is very likely a normal point;
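The scoring step can be sketched as below, assuming the standard isolation-forest form s = 2^(-E(h(x))/c(n)). The closed form used for c(n), based on the harmonic-number approximation H(i) ≈ ln(i) + 0.5772, comes from the isolation-forest literature and is an assumption here, since the text only describes c(n) as the average unsuccessful search depth. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximates H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(avg_path_length, n):
    """Map an average path length to a score in (0, 1]; requires n >= 2."""
    return 2.0 ** (-avg_path_length / c(n))
```

A point whose average path length equals c(n) scores exactly 0.5; shorter paths push the score toward 1 and longer paths push it toward 0.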
S225, according to the calculated anomaly scores, identifying the data points whose scores are higher than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; and inferring the fault cause according to the anomaly pattern and the operating rules of the equipment;
Through the detailed analysis flow, abnormal data in the X-ray thickness measuring device can be effectively identified, and accurate references are provided for subsequent maintenance and fault elimination;
S3, storing abnormal detection data information, so that historical data can be traced and further analyzed conveniently;
S4, outputting abnormal detection data information; a diagnostic report is generated that includes detailed results of the anomaly detection and the fault cause analysis. The diagnostic report may be presented visually through an interface or may be exported as a document for further analysis by a technician.
Further, the data preprocessing in the step S21 includes: data cleaning and data standardization;
Wherein the data cleaning comprises: defining abnormal values, handling missing values, handling erroneous data, handling duplicate data, and data conversion; the detection data are checked to remove any erroneous or incomplete records;
The defining outliers includes: setting a normal operating range of data characteristics of the detection data set, and regarding any value exceeding the range as abnormal; for example, setting the normal operating range of the tube voltage value to be between 80kV and 165kV, any value outside this range will be considered abnormal; the normal range of the tube current value is set to 9mA to 11mA, and exceeding this range will be regarded as abnormal; the normal range of temperatures is set to 20 ℃ to 28 ℃, and temperatures outside this range will be regarded as abnormal.
The strategy for handling missing values comprises: if the proportion of missing values is lower than a set proportion, deleting the records containing missing values; if the proportion is higher than the set proportion, filling the missing values with the mean for continuous variables or with the mode for categorical variables; for example, if fewer than 5% of the records contain missing values, the records containing missing values are deleted.
The processing of the error data comprises: any data that deviates significantly from the normal physical range (e.g., drive voltage values less than 0 or extremely high values) will be considered erroneous data and removed from the dataset.
The processing of the repeated data comprises the following steps: if duplicate records are found to be completely consistent, one record will be kept, and the rest of duplicate items will be deleted; this is based on the assumption that repeated items do not carry additional information.
The data conversion comprises: all data keep their original measurement units, and no unit conversion is needed; however, for the subsequent anomaly detection analysis, all continuous variables (e.g., voltage values, current values, temperature) are Z-score normalized.
Data cleaning is performed according to these settings. Assuming that only the three features of tube voltage, tube current, and temperature are cleaned:
tube voltage value (kV): only values in the range of 95 kV to 105 kV are retained, and values outside this range are marked as abnormal;
tube current value (mA): values between 9 mA and 11 mA are considered normal, and other values are marked as abnormal;
temperature (°C): values in the range of 20 °C to 28 °C are normal, and values outside this range are marked as abnormal.
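The cleaning rules above can be sketched in plain Python as follows. The record layout, the feature names `tube_kv`, `tube_ma`, and `temp_c`, and the function name `clean` are assumptions for illustration; the ranges and the 5% figure are the example values from the text. The sketch handles continuous features only and assumes each feature has at least one observed value when filling.

```python
NORMAL_RANGES = {"tube_kv": (95.0, 105.0), "tube_ma": (9.0, 11.0), "temp_c": (20.0, 28.0)}

def clean(records, ranges=NORMAL_RANGES, max_missing_ratio=0.05):
    """Return (kept, flagged): in-range records and records marked as abnormal."""
    # Duplicate data: keep the first of any fully identical records.
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    if not unique:
        return [], []

    # Missing values: delete when rare, otherwise fill with the feature mean.
    incomplete = [r for r in unique if any(r.get(f) is None for f in ranges)]
    if len(incomplete) / len(unique) < max_missing_ratio:
        complete = [r for r in unique if r not in incomplete]
    else:
        means = {}
        for f in ranges:
            present = [r[f] for r in unique if r.get(f) is not None]
            means[f] = sum(present) / len(present)
        complete = [{**r, **{f: means[f] for f in ranges if r.get(f) is None}}
                    for r in unique]

    # Abnormal values: anything outside the configured normal range is flagged.
    kept, flagged = [], []
    for r in complete:
        in_range = all(lo <= r[f] <= hi for f, (lo, hi) in ranges.items())
        (kept if in_range else flagged).append(r)
    return kept, flagged
```

Flagged records are returned rather than discarded so that they remain available for the later diagnostic report.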
The data normalization includes: all detection data are converted into a unified format and range.
The invention adopts Z-score normalization, which is suitable for continuous variables such as the tube voltage value, tube current value, and temperature. The mean (μ) and standard deviation (σ) are calculated: the mean (μ) is the average of all data points, and the standard deviation (σ) measures the spread of the data points, representing their degree of deviation from the mean.
The calculation formula for Z-score normalization is:
Z = (X − μ) / σ ;
where X is the raw data point, μ is the mean, and σ is the standard deviation.
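As a small illustration of the Z-score formula, the following sketch normalizes one feature column. The function name `z_scores` and the sample values are hypothetical. The sketch uses the population standard deviation, which is an assumption, since the text does not say whether the population or sample form is intended.

```python
import statistics

def z_scores(values):
    """Return (x - mean) / std for each value of one continuous feature."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    if sigma == 0:
        # A constant feature carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical tube voltage readings in kV.
tube_kv_normalized = z_scores([98.0, 100.0, 102.0])
```

After normalization each feature has mean 0 and unit standard deviation, so features measured in kV, mA, and °C become comparable in scale.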
Further, in step S223, each partition corresponding to an internal node of the decision tree comprises the following:
each node stores the feature and the segmentation point selected by segmentation;
the left subtree of the node contains data points with values on the feature less than or equal to the cut point;
The right subtree of the node contains data points with values on the feature that are greater than the cut point.
Further, in step S224, the path length on which the anomaly score is based is determined as follows:
if a data point is isolated at the first split, then the path length of the data point is 1;
if a data point is isolated at the second split, then the path length of the data point is 2;
and so on until a leaf node is reached.
Outliers typically have shorter path lengths: because their values on certain features differ significantly from those of normal points, they are more easily separated.
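The path-length rule can be illustrated by walking a point down a small tree and counting edges. The tree below is hand-built for the example, and its node layout (`feature`, `split`, `left`, `right`, `size`), feature names, and split values are assumptions rather than data from the invention.

```python
def path_length(node, point):
    """Count the edges from the root to the leaf that `point` falls into."""
    edges = 0
    while "feature" in node:  # internal nodes carry a feature and a split
        go_left = point[node["feature"]] <= node["split"]
        node = node["left"] if go_left else node["right"]
        edges += 1
    return edges

# Hypothetical tree: the first split on tube voltage already isolates the outlier.
tree = {
    "feature": "tube_kv", "split": 120.0,
    "left": {
        "feature": "tube_ma", "split": 10.1,
        "left": {"feature": "tube_kv", "split": 99.5,
                 "left": {"size": 1}, "right": {"size": 1}},
        "right": {"size": 1},
    },
    "right": {"size": 1},
}

outlier = {"tube_kv": 150.0, "tube_ma": 14.0}
normal = {"tube_kv": 99.0, "tube_ma": 10.0}
```

Here `path_length(tree, outlier)` is 1 because the point is isolated at the first split, while `path_length(tree, normal)` is 3. Averaging this value over many trees gives the quantity used in the anomaly score.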
The present invention also provides an abnormal data diagnostic apparatus for performing the abnormal data detection method as described above, comprising:
an acquisition unit: used for collecting detection data from the X-ray thickness measuring device in real time; the detection data comprises at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
an analysis unit: used for performing anomaly detection on the detection data and finding abnormal detection data information;
a storage unit: used for storing the abnormal detection data information, so that historical data can be traced and further analyzed;
an output unit: used for outputting the abnormal detection data information and generating a diagnostic report that includes the detailed results of the anomaly detection and the fault cause analysis.
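How the four units might fit together can be sketched as a small pipeline. Everything named here (`DiagnosticDevice`, `run_once`, `report`, the record layout) is hypothetical. In the usage shown in the note after the code, the analysis callable is a simple range check standing in for the anomaly detection method, purely to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, float]

@dataclass
class DiagnosticDevice:
    acquire: Callable[[], List[Record]]              # acquisition unit
    analyze: Callable[[List[Record]], List[Record]]  # analysis unit
    history: List[Record] = field(default_factory=list)  # storage unit

    def run_once(self) -> str:
        """Collect one batch, detect anomalies, store them, and return a report."""
        records = self.acquire()
        anomalies = self.analyze(records)
        self.history.extend(anomalies)  # kept for tracing and later analysis
        return self.report(len(records), anomalies)

    @staticmethod
    def report(total: int, anomalies: List[Record]) -> str:  # output unit
        lines = [f"records checked: {total}", f"anomalies found: {len(anomalies)}"]
        lines += [f"  {a}" for a in anomalies]
        return "\n".join(lines)
```

For instance, constructing the device with `acquire=lambda: [{"tube_kv": 100.0}, {"tube_kv": 150.0}]` and `analyze=lambda recs: [r for r in recs if r["tube_kv"] > 105.0]` and calling `run_once()` reports one anomaly and appends it to `history`.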
Further, the analysis unit includes:
a data preprocessing module: used for performing data preprocessing on the detection data;
an abnormality detection module: used for performing anomaly detection on the preprocessed data.
Further, the abnormality detection module includes:
A detection data set unit: for constructing a detection dataset; the data characteristics of the detection data set comprise at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
Random segmentation unit: for randomly selecting a feature from all features of the detection dataset; randomly selecting a segmentation point on the selected feature; the data set is segmented into two subsets according to the selected features and segmentation points: subset 1, subset 2;
Iterative segmentation unit: used for regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of step S222 on each subset for iterative segmentation, and recursively and randomly segmenting the data set during the iteration to isolate abnormal points step by step; during the iteration, recursively and randomly segmenting the data set to isolate abnormal points step by step comprises: creating a decision tree, wherein each partition corresponds to one internal node of the decision tree, until the data points are completely isolated or the maximum depth of the decision tree is reached;
Scoring unit: used for judging the degree of abnormality of each data point through the decision tree; for each data point, the number of edges from the root node to the leaf node where the data point is located is defined as the path length of the data point; the anomaly score of each data point is calculated by a scoring mechanism based on the path length and is taken as the basis for judging whether the data point is abnormal; if the anomaly score of a data point is close to 1, its average path length is very short and it is very likely an anomaly; if the anomaly score is close to 0, its average path length is long and it is very likely a normal point;
Abnormal point identification unit: used for identifying, according to the calculated anomaly scores, the data points whose scores are higher than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; and inferring the fault cause according to the anomaly pattern and the operating rules of the equipment.
The invention also provides a radiation thickness gauge using the anomaly data diagnostic device as described above.
The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the abnormal data detection method as described above.
The present invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the abnormal data detection method as described above when executing the program.
Compared with the prior art, the invention has the beneficial effects that:
The method can automatically identify the abnormality from the characteristic data without depending on a preset statistical threshold, and improves the sensitivity and accuracy to the abnormal condition; the invention is different from a statistical method depending on specific distribution, and the data is not required to follow certain specific distribution in advance, so that the invention is more flexible in changeable practical application; the abnormal state is identified by evaluating the isolation easiness of the data points, so that the data points are more dynamic and can adapt to the change of various data modes; the X-ray thickness measuring device can be monitored in real time and rapidly subjected to fault diagnosis, potential faults of equipment can be found in time, technicians can be helped to analyze fault reasons, maintenance strategies are optimized, and reliability and maintenance efficiency of the equipment are effectively improved.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention.
In the drawings:
FIG. 1 is a schematic diagram showing the components of an anomaly data diagnostic device according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an abnormal data detection method according to an embodiment of the present invention;
FIG. 3 is a flowchart showing the process of the analysis unit according to the embodiment of the present invention;
FIG. 4 is a flowchart of an embodiment of an abnormal data detection method according to the present invention;
FIG. 5 is a flow chart of an abnormal data detection method of the present invention;
FIG. 6 is a flow chart of a method for anomaly detection of detected data according to the present invention;
FIG. 7 is a flow chart of anomaly detection for preprocessed data according to the present invention;
FIG. 8 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and products consistent with some aspects of the disclosure as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
The embodiment of the invention provides an abnormal data detection method which, as shown in FIG. 5, comprises the following steps:
S1, acquiring detection data from an X-ray thickness measuring device in real time; the detection data comprises at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
the detection data collected in this example are shown in table 1:
TABLE 1
S2, carrying out anomaly detection on the detection data, and finding out anomaly detection data information; the method for anomaly detection of the detection data includes the following steps (see FIG. 6):
S21, carrying out data preprocessing on the detection data; the data preprocessing comprises the following steps: data cleaning and data standardization;
Wherein the data cleaning comprises: defining abnormal values, handling missing values, handling erroneous data, handling duplicate data, and data conversion; the detection data are checked to remove any erroneous or incomplete records;
The defining outliers includes: setting a normal operating range of data characteristics of the detection data set, and regarding any value exceeding the range as abnormal; for example, setting the normal operating range of the tube voltage value to be between 80kV and 165kV, any value outside this range will be considered abnormal; the normal range of the tube current value is set to 9mA to 11mA, and exceeding this range will be regarded as abnormal; the normal range of temperatures is set to 20 ℃ to 28 ℃, and temperatures outside this range will be regarded as abnormal.
The strategy for handling missing values comprises: if the proportion of missing values is lower than a set proportion, deleting the records containing missing values; if the proportion is higher than the set proportion, filling the missing values with the mean for continuous variables or with the mode for categorical variables; for example, if fewer than 5% of the records contain missing values, the records containing missing values are deleted.
The processing of the error data comprises: any data that deviates significantly from the normal physical range (e.g., drive voltage values less than 0 or extremely high values) will be considered erroneous data and removed from the dataset.
The processing of the repeated data comprises the following steps: if duplicate records are found to be completely consistent, one record will be kept, and the rest of duplicate items will be deleted; this is based on the assumption that repeated items do not carry additional information.
The data conversion comprises: all data keep their original measurement units, and no unit conversion is needed; however, for the subsequent anomaly detection analysis, all continuous variables (e.g., voltage values, current values, temperature) are Z-score normalized.
Data cleaning is performed according to these settings. Assuming that only the three features of tube voltage, tube current, and temperature are cleaned:
tube voltage value (kV): only values in the range of 95 kV to 105 kV are retained, and values outside this range are marked as abnormal;
tube current value (mA): values between 9 mA and 11 mA are considered normal, and other values are marked as abnormal;
temperature (°C): values in the range of 20 °C to 28 °C are normal, and values outside this range are marked as abnormal.
The data normalization includes: all detection data are converted into a unified format and range.
In this example, Z-score normalization is used, which is suitable for the continuous variables of tube voltage, tube current, and temperature.
The mean (μ) and standard deviation (σ) are calculated: the mean (μ) is the average of all data points, and the standard deviation (σ) measures the spread of the data points, representing their degree of deviation from the mean.
The mean and standard deviation were calculated for three characteristics of tube voltage, tube current, and temperature, as shown in table 2:
TABLE 2
The calculation formula for Z-score normalization is:
Z = (X − μ) / σ ;
where X is the raw data point, μ is the mean, and σ is the standard deviation.
The normalized results for tube voltage, tube current, and temperature in this example are shown in Table 3:
TABLE 3
S22, performing anomaly detection on the preprocessed data, wherein the anomaly detection comprises the following steps (see FIG. 7):
S221, constructing a detection data set; the data characteristics of the detection data set comprise at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
S222, randomly selecting one feature from all features of the detection data set; randomly selecting a segmentation point on the selected feature; the data set is segmented into two subsets according to the selected features and segmentation points: subset 1, subset 2;
In this embodiment, a segmentation point is randomly selected on the selected feature (such as the "tube voltage value"), and the segmentation point lies between the minimum value and the maximum value of the feature; for example, if the "tube voltage value" is in the range of 80 kV to 150 kV, 120 kV may be randomly selected as the segmentation point;
Subset 1 contains all data points whose value on the selected feature ("tube voltage value") is less than or equal to the segmentation point (120 kV);
Subset 2 contains all data points whose value on the selected feature ("tube voltage value") is greater than the segmentation point (120 kV);
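A single random segmentation as in step S222 might look like the sketch below. The function name `random_split`, the feature keys and the readings are hypothetical; a seeded `random.Random` is used only to make the run repeatable.

```python
import random

def random_split(points, rng):
    """Split records on one randomly chosen feature at a random segmentation point."""
    feature = rng.choice(sorted(points[0]))  # pick one feature at random
    values = [p[feature] for p in points]
    # The segmentation point lies between the feature's minimum and maximum.
    split_point = rng.uniform(min(values), max(values))
    subset1 = [p for p in points if p[feature] <= split_point]
    subset2 = [p for p in points if p[feature] > split_point]
    return feature, split_point, subset1, subset2

# Hypothetical readings; the last one is deliberately far from the rest.
points = [
    {"tube_voltage_kv": 99.0, "tube_current_ma": 10.0},
    {"tube_voltage_kv": 101.0, "tube_current_ma": 10.2},
    {"tube_voltage_kv": 100.0, "tube_current_ma": 9.9},
    {"tube_voltage_kv": 148.0, "tube_current_ma": 10.1},
]
feature, split_point, subset1, subset2 = random_split(points, random.Random(0))
print(feature, round(split_point, 2), len(subset1), len(subset2))
```

When the tube voltage is chosen, most segmentation points between 99 kV and 148 kV separate the 148 kV reading from the other three, which is why such a point tends to be isolated early.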
S223, regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of the step S222 on each subset for iterative segmentation, and carrying out random segmentation on the data set by recursion in the iterative process to isolate abnormal points step by step;
Randomly selecting a feature from all features of the subset; randomly selecting a segmentation point on the selected feature; dividing the subset into two smaller subsets according to the selected features and the dividing points;
the segmentation process is performed recursively, and more subsets are generated for each segmentation to form a multi-level segmentation structure.
Each segmentation aims to further isolate data points, especially outliers. By repeated random partitioning, outliers can be more quickly isolated into small subsets because their values on certain features differ significantly from common points.
For example, assume that a "tube voltage value" feature is selected at the time of the first division, and 120kV is taken as a division point. On the resulting subset, the "current sense signal" feature is selected for the second segmentation. If an outlier has a significantly different value on both features than the normal point, it is likely to have been isolated to a small subset containing only itself after two divisions.
In the iterative process of the step S223, the method for gradually isolating the outliers by recursively randomly dividing the data set includes:
Creating a decision tree, wherein each partition corresponds to an internal node of the decision tree until the data points are completely isolated (i.e., cannot be further partitioned) or the maximum depth of the decision tree is reached;
For example, assume that the "tube voltage value" feature is selected for a certain division, and the segmentation point is 120 kV. The corresponding node then stores this feature and the segmentation point. All data points with a tube voltage value ≤ 120 kV are assigned to the left subtree, while data points with a tube voltage value > 120 kV are assigned to the right subtree.
The method for mapping each partition to an internal node of the decision tree comprises the steps of:
each node stores the feature and the segmentation point selected by segmentation;
the left subtree of the node contains data points with values on the feature less than or equal to the cut point;
The right subtree of the node contains data points with values on the feature that are greater than the cut point.
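The recursive construction and the node layout described above can be sketched as follows. This is an illustrative simplification, not the patented implementation: points are plain lists of floats, the nested-dict node layout and the names `build_tree` and `count_points` are hypothetical, and a subset is also turned into a leaf when the chosen feature has no spread.

```python
import random

def build_tree(points, rng, depth=0, max_depth=8):
    """Recursively partition points (lists of floats) into a nested-dict tree."""
    # Stop when the subset is isolated or the maximum depth is reached.
    if len(points) <= 1 or depth >= max_depth:
        return {"leaf": True, "size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    low, high = min(values), max(values)
    if low == high:  # simplification: no split is possible on this feature
        return {"leaf": True, "size": len(points)}
    split_point = rng.uniform(low, high)
    left = [p for p in points if p[feature] <= split_point]
    right = [p for p in points if p[feature] > split_point]
    # Each internal node stores the feature and segmentation point it used.
    return {
        "leaf": False,
        "feature": feature,
        "split_point": split_point,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }

def count_points(tree):
    """Sum the leaf sizes; every input point ends up in exactly one leaf."""
    if tree["leaf"]:
        return tree["size"]
    return count_points(tree["left"]) + count_points(tree["right"])

# Hypothetical [tube voltage (kV), tube current (mA)] readings.
data = [[99.0, 10.0], [101.0, 10.2], [100.0, 9.9], [148.0, 10.1]]
tree = build_tree(data, random.Random(1))
print(count_points(tree))  # 4
```

The depth limit bounds the recursion even in the rare case where a segmentation point leaves one side empty.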
S224, judging the abnormality degree of each data point through a decision tree; for each data point, defining the number of edges from the root node to the leaf node where the data point is located as the path length of the data point; calculating the anomaly score of each data point based on a scoring mechanism of the path length, and taking the anomaly score as a basis for judging whether the data point is anomaly;
The step of taking the anomaly score as a basis for judging whether the data point is anomalous comprises the following steps:
if a data point is isolated at the first split, then the path length of the data point is 1;
if a data point is isolated at the second split, then the path length of the data point is 2;
and so on until a leaf node is reached.
Outliers typically have shorter path lengths: because their values on certain features differ significantly from those of common points, they are separated after fewer splits.
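The path-length definition can be illustrated on a small hand-built tree. The nested-dict node layout, the feature keys and the second segmentation point (10.5 mA) are assumptions made for this sketch; the 120 kV segmentation point follows the example in the text.

```python
def path_length(tree, point):
    """Count the edges from the root to the leaf that `point` falls into."""
    edges = 0
    node = tree
    while not node["leaf"]:
        if point[node["feature"]] <= node["split_point"]:
            node = node["left"]
        else:
            node = node["right"]
        edges += 1
    return edges

# First split on tube voltage at 120 kV, second split on tube current at 10.5 mA.
tree = {
    "leaf": False, "feature": "tube_voltage_kv", "split_point": 120.0,
    "left": {
        "leaf": False, "feature": "tube_current_ma", "split_point": 10.5,
        "left": {"leaf": True},
        "right": {"leaf": True},
    },
    "right": {"leaf": True},
}
print(path_length(tree, {"tube_voltage_kv": 140.0, "tube_current_ma": 10.0}))  # 1
print(path_length(tree, {"tube_voltage_kv": 100.0, "tube_current_ma": 10.0}))  # 2
```

The 140 kV point is isolated at the first split and has path length 1, while the point near the bulk of the data needs a second split.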
According to the path length, the anomaly score of each data point is calculated by the following formula:
s(x, n) = 2^(−E(h(x)) / c(n));
wherein x is a data point;
E(h(x)) is the average path length of x over all trees;
c(n) is a normalization factor, namely the average unsuccessful search depth given the number of data points n;
E(h(x)) / c(n) is the normalized path length, such that the anomaly score is between 0 and 1;
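As a hedged illustration of the scoring step, the sketch below assumes the standard isolation-forest form s(x, n) = 2^(−E(h(x)) / c(n)) with c(n) = 2H(n − 1) − 2(n − 1)/n, where H is the harmonic number. The function names `c` and `anomaly_score` and the sample values are hypothetical.

```python
def c(n):
    """Average path length of an unsuccessful binary-search-tree search over n points."""
    if n <= 1:
        return 0.0
    harmonic = sum(1.0 / i for i in range(1, n))  # H(n - 1), computed exactly
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(avg_path_length, n):
    """Score in (0, 1]; values close to 1 correspond to short average paths."""
    if n < 2:
        raise ValueError("n must be at least 2 for the normalization to be defined")
    return 2.0 ** (-avg_path_length / c(n))

n = 256  # hypothetical number of data points per tree
print(round(anomaly_score(3.0, n), 3))   # short average path, score nearer 1
print(round(anomaly_score(12.0, n), 3))  # long average path, score nearer 0
```

A point whose average path length equals c(n) scores exactly 0.5, which is why scores well above 0.5 suggest anomalies under this convention.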
The normalized results of this example are shown in table 4:
TABLE 4
In Table 4, data points with a value of approximately 1 are considered normal, while data points with a value of approximately −1 are considered anomalous. According to this rule, data points 1, 2 and 8 are detected as outliers, and the other data points are considered normal.
Outliers are typically more easily isolated and therefore have shorter path lengths in the decision tree;
If the anomaly score of a data point is close to 1, the average path length of the data point is short, and the data point is very likely an anomaly; if the anomaly score of a data point is close to 0, the average path length of the data point is long, and the data point is very likely a common point;
S225, according to the calculated anomaly scores, identifying the data points with scores lower than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; inferring the fault cause according to the anomaly pattern and the operating rules of the equipment;
Through the detailed analysis flow, abnormal data in the X-ray thickness measuring device can be effectively identified, and accurate references are provided for subsequent maintenance and fault elimination;
The algorithm identifies data points with shorter path lengths as anomalies. In the example, data points 1, 2 and 8 have values close to −1 in Table 4, indicating that they are more anomalous than the other data points. This means that, in a combined analysis of tube voltage, tube current and temperature, these points exhibit characteristics different from most data points.
For the identified abnormal data points, their patterns may be further analyzed. For example, if all outliers are abnormally high or abnormally low in the tube voltage dimension, this may indicate that a particular factor related to the tube voltage causes these anomalies. Similarly, if the outliers are concentrated mainly in a particular tube current or temperature range, this also provides important pattern information.
Based on the anomaly pattern and the operating rules of the device, possible causes of the fault can be further inferred. For example, if the anomaly is mainly due to an abnormal tube voltage value, this may be related to an unstable power supply; if the tube current is abnormal, the internal circuit may be faulty; and temperature anomalies may point to a failure of the heat dissipation system. In this way, not only can the anomaly be identified, but its root cause can also be inferred, providing a basis for maintenance and troubleshooting.
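The pattern analysis above can be sketched as a simple count of which features deviate in the anomalous points, followed by a lookup of candidate causes. This is only one possible reading: the feature keys, the Z-score limit of 2.0, the helper names and the sample values are all hypothetical, and the cause strings paraphrase the examples in the text.

```python
from collections import Counter

# Candidate causes paraphrased from the text; keys are hypothetical feature names.
CANDIDATE_CAUSES = {
    "tube_voltage": "possibly unstable power supply",
    "tube_current": "possible internal circuit problem",
    "temperature": "possible heat dissipation system failure",
}

def summarize_anomaly_pattern(anomalies_z, z_limit=2.0):
    """Count, per feature, how many anomalous points deviate beyond z_limit."""
    counts = Counter()
    for point in anomalies_z:
        for feature, z in point.items():
            if abs(z) > z_limit:
                counts[feature] += 1
    return counts

def suggest_causes(counts):
    """List candidate causes, most frequently deviating feature first."""
    return [CANDIDATE_CAUSES[f] for f, _ in counts.most_common() if f in CANDIDATE_CAUSES]

# Hypothetical Z-scores of two points already identified as anomalous.
anomalies = [
    {"tube_voltage": 3.1, "tube_current": 0.2, "temperature": -0.4},
    {"tube_voltage": -2.8, "tube_current": 0.1, "temperature": 2.5},
]
counts = summarize_anomaly_pattern(anomalies)
print(suggest_causes(counts))
```

The output is a ranked list of hypotheses for a technician to check, not a confirmed diagnosis.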
S3, storing abnormal detection data information, so that historical data can be traced and further analyzed conveniently;
S4, outputting abnormal detection data information; a diagnostic report is generated that includes detailed results of the anomaly detection and the fault cause analysis.
The diagnostic report may be presented visually through an interface or exported as a document for further analysis by a technician. Such an analysis process goes beyond surface-level anomaly detection to give a deeper understanding of the patterns and possible causes behind the anomalies, providing more comprehensive support for subsequent decisions.
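One way to produce an exportable report is to collect the detection results in a dictionary and serialize it as JSON. The text does not specify a report format, so the field names, the helper `build_diagnostic_report` and the sample values below are assumptions.

```python
import json
from datetime import datetime, timezone

def build_diagnostic_report(anomalies, inferred_causes):
    """Assemble anomaly results and inferred causes into one serializable record."""
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "anomaly_count": len(anomalies),
        "anomalies": anomalies,
        "inferred_causes": inferred_causes,
    }

# Hypothetical detection result for a single abnormal data point.
report = build_diagnostic_report(
    anomalies=[{"index": 8, "tube_voltage_kv": 107.2, "anomaly_score": 0.71}],
    inferred_causes=["possibly unstable power supply"],
)
report_text = json.dumps(report, ensure_ascii=False, indent=2)
print(report_text)
```

The same text can be written to a file for storage (step S3) or rendered by a display interface (step S4).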
Fig. 4 shows a specific implementation flow of the abnormal data detection method of the present embodiment.
The embodiment of the present invention also provides an abnormal data diagnostic apparatus for performing the abnormal data detection method as described above, including (see fig. 1):
An acquisition unit: used for collecting detection data from the X-ray thickness measuring device in real time; the detection data comprise a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
An analysis unit: used for performing abnormality detection (see Fig. 3) on the detection data and finding abnormality detection data information;
A storage unit: used for storing the abnormality detection data information, so that historical data can be traced and further analyzed;
An output unit: used for outputting the abnormality detection data information (see Fig. 2) and generating a diagnostic report that includes detailed results of the anomaly detection and the fault cause analysis.
The analysis unit includes:
A data preprocessing module: used for preprocessing the detection data;
An abnormality detection module: used for performing abnormality detection on the preprocessed data.
The abnormality detection module includes:
A detection data set unit: used for constructing a detection data set; the data features of the detection data set comprise a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
A random segmentation unit: used for randomly selecting one feature from all features of the detection data set; randomly selecting a segmentation point on the selected feature; and segmenting the data set into two subsets, subset 1 and subset 2, according to the selected feature and segmentation point;
An iterative segmentation unit: used for regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of step S222 on each subset for iterative segmentation, and recursively segmenting the data set at random during the iteration to isolate abnormal points step by step; in the iterative process, the method for gradually isolating the outliers comprises: creating a decision tree, wherein each partition corresponds to one internal node of the decision tree, until the data points are completely isolated or the maximum depth of the decision tree is reached;
A scoring unit: used for judging the degree of abnormality of each data point through the decision tree; for each data point, the number of edges from the root node to the leaf node where the data point is located is defined as the path length of the data point; the anomaly score of each data point is calculated by a scoring mechanism based on the path length and taken as the basis for judging whether the data point is abnormal; if the anomaly score of a data point is close to 1, the average path length of the data point is very short and the data point is very likely an anomaly; if the anomaly score of a data point is close to 0, the average path length of the data point is long and the data point is very likely a common point;
An abnormal point identification unit: used for identifying, according to the calculated anomaly scores, the data points with scores lower than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; and inferring the fault cause according to the anomaly pattern and the operating rules of the equipment.
The embodiment of the invention also provides a radiation thickness gauge which uses the abnormal data diagnostic device.
The embodiment of the invention also provides a computer device, and Fig. 8 is a schematic structural diagram of the computer device provided by the embodiment of the invention. Referring to Fig. 8, the computer device includes: an input device 23, an output device 24, a memory 22 and a processor 21; the memory 22 is configured to store one or more programs; when the one or more programs are executed by the one or more processors 21, the one or more processors 21 implement the abnormal data detection method provided in the above embodiment; the input device 23, the output device 24, the memory 22 and the processor 21 may be connected by a bus or in another manner, and a bus connection is taken as an example in Fig. 8.
The memory 22 is used as a readable storage medium of a computing device, and can be used for storing a software program and a computer executable program, such as program instructions corresponding to the abnormal data detection method according to the embodiment of the present invention; the memory 22 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the device, etc.; in addition, memory 22 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device; in some examples, memory 22 may further comprise memory located remotely from processor 21, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 23 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function control of the device; the output device 24 may include a display device such as a display screen.
The processor 21 executes various functional applications of the apparatus and data processing, that is, implements the abnormal data detection method described above, by running software programs, instructions, and modules stored in the memory 22.
The computer device provided by the above embodiment can be used for executing the abnormal data detection method provided by the above embodiment, and has corresponding functions and beneficial effects.
The embodiment of the present invention also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform the abnormal data detection method provided in the above embodiment. The storage medium may be any of various types of memory devices or storage devices, including: installation media such as CD-ROM, floppy disk or tape devices; computer system memory or random access memory, such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, or the like; nonvolatile memory such as flash memory or magnetic media (e.g., hard disk), or optical storage; registers or other similar types of memory elements. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or in a second, different computer system connected to the first computer system through a network (such as the internet); the second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that reside in different locations (e.g., in different computer systems connected by a network). The storage medium may store program instructions (e.g., embodied as a computer program) executable by one or more processors.
Of course, the computer-executable instructions contained in the storage medium provided in the embodiments of the present invention are not limited to the operations of the abnormal data detection method described above, and may also perform related operations in the abnormal data detection method provided in any embodiment of the present invention.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing description is only of the preferred embodiments of the invention and is not intended to limit the invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The abnormal data detection method is characterized by comprising the following steps:
S1, acquiring detection data from an X-ray thickness measuring device in real time; the detection data comprises at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
S2, carrying out anomaly detection on the detection data, and finding out anomaly detection data information; the method for detecting the abnormality of the detection data comprises the following steps:
S21, carrying out data preprocessing on the detection data;
S22, performing anomaly detection on the preprocessed data, wherein the anomaly detection comprises the following steps:
S221, constructing a detection data set; the data characteristics of the detection data set comprise at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
S222, randomly selecting one feature from all features of the detection data set; randomly selecting a segmentation point on the selected feature; the data set is segmented into two subsets according to the selected features and segmentation points: subset 1, subset 2;
S223, regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of the step S222 on each subset for iterative segmentation, and carrying out random segmentation on the data set by recursion in the iterative process to isolate abnormal points step by step;
in the iterative process of the step S223, the method for gradually isolating the outliers by recursively randomly dividing the data set includes:
Creating a decision tree, wherein each partition corresponds to one internal node of the decision tree until the data points are completely isolated or the maximum depth of the decision tree is reached;
S224, judging the abnormality degree of each data point through a decision tree; for each data point, defining the number of edges from the root node to the leaf node where the data point is located as the path length of the data point; calculating the anomaly score of each data point based on a scoring mechanism of the path length, and taking the anomaly score as a basis for judging whether the data point is anomaly;
According to the path length, the anomaly score of each data point is calculated by the following formula:
s(x, n) = 2^(−E(h(x)) / c(n));
wherein x is a data point;
E(h(x)) is the average path length of x over all trees;
c(n) is a normalization factor, namely the average unsuccessful search depth given the number of data points n;
E(h(x)) / c(n) is the normalized path length, such that the anomaly score is between 0 and 1;
If the anomaly score of a data point is close to 1, the average path length of the data point is short, and the data point is very likely an anomaly; if the anomaly score of a data point is close to 0, the average path length of the data point is long, and the data point is very likely a common point;
S225, according to the calculated anomaly scores, identifying the data points with scores lower than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; inferring the fault cause according to the anomaly pattern and the operating rules of the equipment;
S3, storing abnormal detection data information, so that historical data can be traced and further analyzed conveniently;
S4, outputting abnormal detection data information; a diagnostic report is generated that includes detailed results of the anomaly detection and the fault cause analysis.
2. The abnormal data detection method according to claim 1, wherein the data preprocessing of the S21 step includes: data cleaning and data standardization;
Wherein the data cleansing comprises: defining abnormal values, strategies for processing missing values, processing error data, processing repeated data and converting data; checking the detected data to remove any erroneous or incomplete records;
the defining outliers includes: setting a normal operating range of data characteristics of the detection data set, and regarding any value exceeding the range as abnormal;
The strategy for processing missing values comprises: if the proportion of missing values is lower than a set proportion, deleting the records containing missing values; if the proportion of missing values is higher than the set proportion, filling the missing values with the mean for continuous variables or with the mode for categorical variables;
The processing of the error data comprises: any data that deviates significantly from the normal physical range will be considered erroneous data and removed from the dataset;
the processing of the repeated data comprises the following steps: if duplicate records are found to be completely consistent, one record will be kept, and the rest of duplicate items will be deleted;
The data conversion comprises: all data keep their original measurement units, and no unit conversion is needed;
The data normalization includes: all detection data are converted into a unified format and range.
3. The abnormal data detection method according to claim 1, wherein the method of S223 that each partition corresponds to one internal node of the decision tree comprises the steps of:
each node stores the feature and the segmentation point selected by segmentation;
the left subtree of the node contains data points with values on the feature less than or equal to the cut point;
The right subtree of the node contains data points with values on the feature that are greater than the cut point.
4. The abnormal data detection method according to claim 1, wherein the step S224 of using the abnormality score as a basis for determining whether the data point is abnormal comprises:
if a data point is isolated at the first split, then the path length of the data point is 1;
if a data point is isolated at the second split, then the path length of the data point is 2;
and so on until a leaf node is reached.
5. An abnormal data diagnostic apparatus for performing the abnormal data detection method according to any one of claims 1 to 4, characterized by comprising:
an acquisition unit: used for collecting detection data from the X-ray thickness measuring device in real time; the detection data comprises at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness, a temperature and related time data;
an analysis unit: used for performing anomaly detection on the detection data and finding anomaly detection data information;
a storage unit: used for storing the anomaly detection data information, so that historical data can be traced and further analyzed;
an output unit: used for outputting the anomaly detection data information and generating a diagnostic report that includes detailed results of the anomaly detection and the fault cause analysis.
6. The abnormal data diagnostic apparatus according to claim 5, wherein the analysis section includes:
a data preprocessing module: used for preprocessing the detection data;
an abnormality detection module: used for performing abnormality detection on the preprocessed data.
7. The anomaly data diagnostic device of claim 6, wherein the anomaly detection module comprises:
a detection data set unit: used for constructing a detection data set; the data features of the detection data set comprise at least one of a driving voltage value, a driving current value, a tube voltage value, a tube current value, a voltage detection signal or a current detection signal, a plate and strip thickness and a temperature;
a random segmentation unit: used for randomly selecting one feature from all features of the detection data set; randomly selecting a segmentation point on the selected feature; and segmenting the data set into two subsets, subset 1 and subset 2, according to the selected feature and segmentation point;
an iterative segmentation unit: used for regarding each of the two subsets obtained by segmentation as a new data set, repeating the random segmentation process of step S222 on each subset for iterative segmentation, and recursively segmenting the data set at random during the iteration to isolate abnormal points step by step; in the iterative process, the method for gradually isolating the outliers comprises: creating a decision tree, wherein each partition corresponds to one internal node of the decision tree, until the data points are completely isolated or the maximum depth of the decision tree is reached;
a scoring unit: used for judging the degree of abnormality of each data point through the decision tree; for each data point, the number of edges from the root node to the leaf node where the data point is located is defined as the path length of the data point; the anomaly score of each data point is calculated by a scoring mechanism based on the path length and taken as the basis for judging whether the data point is abnormal; if the anomaly score of a data point is close to 1, the average path length of the data point is very short and the data point is very likely an anomaly; if the anomaly score of a data point is close to 0, the average path length of the data point is long and the data point is very likely a common point;
an abnormal point identification unit: used for identifying, according to the calculated anomaly scores, the data points with scores lower than a set threshold as abnormal data points; performing pattern analysis on the identified abnormal data points; and inferring the fault cause according to the anomaly pattern and the operating rules of the equipment.
8. A radiation thickness gauge, characterized in that an abnormality data diagnostic device according to any one of claims 5-7 is used.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the abnormal data detection method according to any one of claims 1-4.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the abnormal data detection method according to any one of claims 1-4 when the program is executed by the processor.
CN202410411049.XA 2024-04-08 2024-04-08 Abnormal data detection method, abnormal data diagnostic device, and radiation thickness gauge Active CN118094107B (en)

Publications (2)

Publication Number Publication Date
CN118094107A CN118094107A (en) 2024-05-28
CN118094107B true CN118094107B (en) 2024-06-21

CN117194049B (en) Cloud host intelligent behavior analysis method and system based on machine learning algorithm
US20230409421A1 (en) Anomaly detection in computer systems
JP2017204107A (en) Data analytic method, and system and device therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant