CN116992391B - Hard carbon process environment-friendly monitoring data acquisition and processing method - Google Patents

Hard carbon process environment-friendly monitoring data acquisition and processing method

Info

Publication number
CN116992391B
Authority
CN
China
Prior art keywords
data
value
initial
updated
suspected abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311253322.2A
Other languages
Chinese (zh)
Other versions
CN116992391A (en
Inventor
杨黎军
于淼淼
司洪宇
孙康
胡涵
田其帅
杨坤
高洪超
罗静文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Guanbaolin Activated Carbon Co ltd
Original Assignee
Qingdao Guanbaolin Activated Carbon Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Guanbaolin Activated Carbon Co ltd filed Critical Qingdao Guanbaolin Activated Carbon Co ltd
Priority to CN202311253322.2A priority Critical patent/CN116992391B/en
Publication of CN116992391A publication Critical patent/CN116992391A/en
Application granted granted Critical
Publication of CN116992391B publication Critical patent/CN116992391B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/0004 Gaseous mixtures, e.g. polluted air
    • G01N33/0009 General constructional details of gas analysers, e.g. portable test equipment
    • G01N33/0062 General constructional details of gas analysers, e.g. portable test equipment concerning the measuring method, e.g. intermittent, or the display, e.g. digital

Abstract

The invention relates to the technical field of environmental pollution data processing, in particular to a hard carbon process environmental protection monitoring data acquisition and processing method. The method comprises the steps of obtaining exhaust gas concentration data; setting an initial k value of an outlier detection algorithm, acquiring an initial outlier factor of each exhaust gas concentration data, and determining initial normal data and suspected abnormal data; updating the initial k value, acquiring real abnormal data under each updated k value according to the distribution condition of the initial normal data and the suspected abnormal data in the update neighborhood under each updated k value, and determining the optimal updated k value; and acquiring abnormal exhaust gas concentration data according to the optimal updated k value. According to the method, the optimal k value of the outlier detection algorithm is obtained through the number of the real outliers under each updated k value, so that the concentration of the abnormal waste gas is accurately monitored, and the damage caused by the waste gas is timely prevented.

Description

Hard carbon process environment-friendly monitoring data acquisition and processing method
Technical Field
The invention relates to the technical field of environmental pollution data processing, in particular to a hard carbon process environmental protection monitoring data acquisition and processing method.
Background
The hard carbon process is a process for converting natural organic matters such as coal, petroleum and the like into high-value carbon materials, and has important economic value and industrial application prospect. Harmful waste gas is generated in the hard carbon process production process, and serious irreversible damage is caused to the environment and human health.
To detect abnormal exhaust gas concentration data in real time and prevent the damage caused by harmful exhaust gas, existing methods run an outlier detection algorithm over the exhaust gas concentration data and check whether each datum is abnormal. However, the k value of the outlier detection algorithm is set manually from experience, which easily lowers the detection accuracy: normal exhaust gas concentration data may be mistakenly detected as abnormal, so the monitoring of abnormal exhaust gas concentration data is inaccurate and damage caused by exhaust gas cannot be prevented in time.
Disclosure of Invention
In order to solve the technical problem of inaccurate k value in an outlier detection algorithm and inaccurate monitoring of abnormal exhaust gas concentration data, the invention aims to provide a hard carbon process environment-friendly monitoring data acquisition and processing method, which adopts the following specific technical scheme:
the invention provides a hard carbon process environment-friendly monitoring data acquisition and processing method, which comprises the following steps:
acquiring exhaust gas concentration data at different moments in a set time period;
setting an initial k value of an outlier detection algorithm, and acquiring a local outlier factor of each exhaust gas concentration data under the initial k value as an initial outlier factor; performing anomaly detection on the exhaust gas concentration data according to the initial outlier factors, and determining initial normal data and suspected abnormal data according to the abnormal distribution condition of the data in the anomaly detection results;
updating the initial k value to obtain at least two updated k values, obtaining updated outlier factors of each waste gas concentration data under each updated k value, and screening out suspected abnormal areas under each updated k value according to the distribution condition of the initial normal data and the suspected abnormal data in the updated neighborhood of each updated k value of each initial normal data;
obtaining a normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value according to the updated outlier factor and the position distribution condition of each suspected abnormal data in the suspected abnormal region under each updated k value; obtaining real abnormal data under each updated k value according to the difference of the initial outlier factors and the difference of the updated outlier factors between the suspected abnormal data under each updated k value and each element in the corresponding normal data set, and determining an optimal updated k value;
and acquiring abnormal exhaust gas concentration data according to the optimal updated k value.
Further, the method for abnormality detection of the exhaust gas concentration data according to the initial outlier factor comprises the following steps:
acquiring the average value of the initial outlier factors as an initial target average value;
when the initial outlier factor is larger than or equal to the initial target mean value, the corresponding exhaust gas concentration data is used as first abnormal data;
when the initial outlier factor is smaller than the initial target mean, the corresponding exhaust gas concentration data is used as first normal data.
Further, the method for determining the initial normal data and the suspected abnormal data according to the abnormal distribution condition of the data in the abnormal detection result comprises the following steps:
acquiring an initial neighborhood of each first normal data under an initial k value as an initial normal neighborhood;
when each neighborhood data in the initial normal neighborhood is first normal data, taking the central first normal data of the initial normal neighborhood as initial normal data;
when at least one first abnormal data exists among the neighborhood data in the initial normal neighborhood, taking the central first normal data of the initial normal neighborhood as suspected abnormal data; wherein the first abnormal data is also used as suspected abnormal data.
Further, the method for acquiring the suspected abnormal region comprises the following steps:
and for any initial normal data, acquiring newly added neighborhood data of the initial normal data in an update neighborhood of each update k value, and when at least one of the newly added neighborhood data is suspected abnormal data, taking the update neighborhood of the initial normal data in the corresponding update k value as a suspected abnormal region.
Further, the method for obtaining the normal data set of each piece of suspected abnormal data in the suspected abnormal area under each updated k value according to the updated outlier factor and the position distribution condition of each piece of suspected abnormal data in the suspected abnormal area under each updated k value comprises the following steps:
arbitrarily selecting one updated k value as a target k value, and, for any suspected abnormal region under the target k value, sorting the suspected abnormal data in the suspected abnormal region in descending order of update outlier factor to obtain a suspected abnormal data sequence of the suspected abnormal region;
regarding the ith suspected abnormal data in the suspected abnormal data sequence, taking each suspected abnormal region containing the ith suspected abnormal data under the target k value as a target region;
taking a set formed by the central initial normal data of each target region as the overall normal data set of the ith suspected abnormal data;
taking the overall normal data set of each piece of suspected abnormal data positioned before the ith piece of suspected abnormal data in the suspected abnormal data sequence as a comparison set;
and removing from the overall normal data set of the ith suspected abnormal data the initial normal data that also appear in any comparison set, the overall normal data set remaining after removal being used as the normal data set of the ith suspected abnormal data.
Further, the method for acquiring the real abnormal data comprises the following steps:
acquiring the difference of initial outlier factors between each suspected abnormal data and each element in the corresponding normal data set under each updated k value as a first difference; wherein each element in the normal data set is initial normal data;
acquiring the average value of the first difference as a first average value;
acquiring differences of updating outlier factors between each suspected abnormal data and each element in the corresponding normal data set under each updated k value as second differences;
acquiring the average value of the second difference as a second average value;
obtaining a difference value between a first mean value and a second mean value of each suspected abnormal data as a reference value of the corresponding suspected abnormal data;
when the reference value is larger than a preset reference threshold value, the corresponding suspected abnormal data are real abnormal data.
Further, the method for obtaining the optimal updated k value comprises the following steps:
acquiring the number of real abnormal data under each updated k value as the abnormal number;
and taking the updated k value corresponding to the maximum abnormal number as the optimal updated k value.
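The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout (updated k value mapped to the real abnormal data found under it), and the tie-breaking rule (the smallest k wins when several updated k values give the same abnormal number) are assumptions that the text does not specify.

```python
def optimal_updated_k(real_abnormal_by_k):
    """Pick the updated k value with the largest number of real abnormal data.

    real_abnormal_by_k maps each updated k value to the collection of data
    judged to be real abnormal data under that k value (hypothetical layout).
    Ties are resolved in favour of the smallest k, which is an assumption.
    """
    return max(sorted(real_abnormal_by_k),
               key=lambda k: len(real_abnormal_by_k[k]))


# Made-up counts for three updated k values.
best_k = optimal_updated_k({3: [4], 4: [4, 7], 5: [4, 7]})
```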
Further, the method for acquiring the abnormal exhaust gas concentration data according to the optimal updated k value comprises the following steps:
based on the optimal updated k value, acquiring a local outlier factor of each exhaust gas concentration data by an outlier detection algorithm as an optimal outlier factor;
acquiring the average value of the optimal outlier factors as an optimal average value;
and marking the exhaust gas concentration data corresponding to the optimal outlier factor larger than the optimal average value as abnormal exhaust gas concentration data.
Further, the process of updating the initial k value is as follows: and sequentially increasing the initial k value according to a preset step length, wherein the result of each increase is an updated k value.
Further, the method for acquiring the updated outlier factor comprises the following steps:
and obtaining the local outlier factor of each exhaust gas concentration data at each updated k value as an updated outlier factor.
The invention has the following beneficial effects:
An initial k value of the outlier detection algorithm is set and initial normal data and suspected abnormal data are determined, which provides the basis for accurately obtaining the optimal k value of the outlier detection algorithm. The initial k value is then updated according to a preset step length to obtain each updated k value, and the suspected abnormal regions under each updated k value are screened out according to the distribution of initial normal data and suspected abnormal data in the update neighborhood of each initial normal data under that updated k value; because only the suspected abnormal data in the suspected abnormal regions are analyzed, the efficiency of obtaining the real abnormal data is improved. Further, a normal data set of each suspected abnormal data in the suspected abnormal regions under each updated k value is obtained according to the update outlier factor and the position distribution of that suspected abnormal data, so that whether each suspected abnormal data is real abnormal data can be determined. Finally, the real abnormal data under each updated k value are accurately determined from the differences of the initial outlier factors and of the update outlier factors between each suspected abnormal data and each element of its normal data set, and the optimal updated k value of the outlier detection algorithm is determined from them. Abnormal exhaust gas concentration data are thus accurately monitored and analyzed, the risk brought by exhaust gas is reduced, and abnormal exhaust gas concentration is prevented in time.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for collecting and processing environmental monitoring data of a hard carbon process according to an embodiment of the invention.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following detailed description refers to the specific implementation, structure, characteristics and effects of the method for collecting and processing environmental protection monitoring data of hard carbon process according to the invention by combining the accompanying drawings and the preferred embodiment. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a specific scheme of a hard carbon process environment-friendly monitoring data acquisition and processing method, which is specifically described below with reference to the accompanying drawings.
Referring to fig. 1, a flow chart of a method for collecting and processing environmental monitoring data of a hard carbon process according to an embodiment of the invention is shown, the method comprises the following steps:
step S1: and acquiring exhaust gas concentration data at different moments in a set time period.
Specifically, the hard carbon process is a burning process, and a large amount of exhaust gas is generated during production. To monitor the concentration of the discharged exhaust gas in real time, the embodiment of the invention sets the set time period to one hour, that is, the period between the current time and the time one hour away from it; the implementer can set the duration according to the actual situation, and it is not limited herein. A sensor monitors, in real time, the exhaust gas concentration data at each acquisition time within the set time period. To improve the efficiency of monitoring abnormal exhaust gas concentration data, the embodiment of the invention collects exhaust gas concentration data every 5 minutes; the implementer can set the acquisition interval according to the actual situation, and it is not limited herein.
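As a small illustration of the acquisition schedule in step S1, the sketch below lists the acquisition times for a one-hour period sampled every 5 minutes. It assumes that the period ends at the current time and that the last sample is taken at the current time; the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta


def acquisition_times(now, period=timedelta(hours=1),
                      interval=timedelta(minutes=5)):
    """Return the acquisition times inside the period that ends at `now`.

    Assumption: the last sample is taken at `now`, so one hour at a
    5-minute interval yields 12 acquisition times.
    """
    count = int(period / interval)
    return [now - interval * (count - 1 - n) for n in range(count)]


# Arbitrary reference time, used only to show the spacing of the samples.
times = acquisition_times(datetime(2023, 9, 27, 12, 0))
```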
The aim of the embodiment of the invention is as follows: the k value of the outlier detection algorithm is continuously updated, the local outlier factor of each waste gas concentration data under each updated k value is obtained, the number of the true abnormal data screened out under each updated k value is determined, the optimal k value in the outlier detection algorithm is adaptively obtained, the abnormal waste gas concentration data is accurately obtained, the abnormal waste gas concentration data is timely processed, and the damage caused by waste gas is reduced. The outlier detection algorithm is a known technique, and will not be described herein.
Step S2: setting an initial k value of an outlier detection algorithm, and acquiring a local outlier factor of each exhaust gas concentration data under the initial k value as an initial outlier factor; and carrying out anomaly detection on the exhaust gas concentration data according to the initial outlier factors, and determining initial normal data and suspected abnormal data according to the abnormal distribution condition of the data in the anomaly detection results.
Specifically, in order to more accurately acquire the number of real abnormal data under each updated k value of the outlier detection algorithm, the embodiment of the invention sets the initial k value of the outlier detection algorithm to 2, takes the local outlier factor of each exhaust gas concentration data under the initial k value as a reference, and determines the real abnormal data under each updated k value on the basis. The magnitude of the initial k value may be set by the practitioner according to the actual situation, and is not limited herein. Obtaining a local outlier factor of each waste gas concentration data under an initial k value, taking the local outlier factor as an initial outlier factor, and obtaining a mean value of the initial outlier factor as an initial target mean value; when the initial outlier factor is larger than or equal to the initial target mean value, the corresponding exhaust gas concentration data is used as first abnormal data; when the initial outlier factor is smaller than the initial target mean, the corresponding exhaust gas concentration data is used as first normal data. Because the initial k value is small, the acquired first abnormal data and first normal data may contain both normal exhaust gas concentration data and abnormal exhaust gas concentration data. 
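The initial detection described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes one-dimensional concentration values, uses a simplified local outlier factor in which every neighborhood holds exactly k points (ties broken by index), and all function names and sample readings are hypothetical.

```python
def nearest_neighbors(values, i, k):
    """Indices of the k points closest to values[i], excluding i itself."""
    others = [j for j in range(len(values)) if j != i]
    others.sort(key=lambda j: (abs(values[j] - values[i]), j))
    return others[:k]


def local_outlier_factors(values, k):
    """Simplified local outlier factor of every value (requires k < len(values))."""
    n = len(values)
    neigh = [nearest_neighbors(values, i, k) for i in range(n)]
    # k-distance: distance from a point to its k-th nearest neighbour.
    k_dist = [abs(values[neigh[i][-1]] - values[i]) for i in range(n)]

    def reach(i, j):
        # Reachability distance of point i with respect to point j.
        return max(k_dist[j], abs(values[i] - values[j]))

    # Local reachability density; the small floor avoids division by zero
    # when several readings are identical.
    lrd = [1.0 / max(sum(reach(i, j) for j in neigh[i]) / k, 1e-12)
           for i in range(n)]
    return [sum(lrd[j] for j in neigh[i]) / (k * lrd[i]) for i in range(n)]


def split_by_mean(factors):
    """First normal / first abnormal indices using the mean factor as threshold."""
    mean = sum(factors) / len(factors)
    normal = [i for i, f in enumerate(factors) if f < mean]
    abnormal = [i for i, f in enumerate(factors) if f >= mean]
    return normal, abnormal


readings = [10.0, 10.2, 10.1, 10.3, 25.0, 10.2, 10.4, 10.1]  # made-up values
initial_factors = local_outlier_factors(readings, k=2)       # initial k value of 2
first_normal, first_abnormal = split_by_mean(initial_factors)
```

With these made-up readings, only the isolated value 25.0 has an initial outlier factor at or above the mean, so it is the only first abnormal data.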
The exhaust gas concentration data under the initial k value are then preliminarily divided into initial normal data, which represent only normal exhaust gas concentration data, and suspected abnormal data, which may represent either normal or abnormal exhaust gas concentration data. To this end, the embodiment of the invention obtains the initial neighborhood of each first normal data under the initial k value as an initial normal neighborhood. When every neighborhood data in the initial normal neighborhood is first normal data, the central first normal data of that initial normal neighborhood is normal exhaust gas concentration data, and it is taken as initial normal data. When at least one first abnormal data exists among the neighborhood data in the initial normal neighborhood, abnormal data gather around the central first normal data, which may therefore itself be abnormal exhaust gas concentration data, and it is taken as suspected abnormal data. Since the first abnormal data may contain both normal and abnormal exhaust gas concentration data, they are likewise taken as suspected abnormal data. So far, the exhaust gas concentration data under the initial k value have been divided into initial normal data and suspected abnormal data.
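The division into initial normal data and suspected abnormal data can be sketched as below. The sketch assumes one-dimensional values and a neighborhood made of exactly the k nearest points; the function names and the sample data are hypothetical.

```python
def nearest_neighbors(values, i, k):
    """Indices of the k points closest to values[i], excluding i itself."""
    others = [j for j in range(len(values)) if j != i]
    others.sort(key=lambda j: (abs(values[j] - values[i]), j))
    return others[:k]


def split_initial(values, first_normal, k):
    """Divide all indices into initial normal data and suspected abnormal data.

    A first normal datum stays normal only if every datum in its initial
    neighborhood is also first normal; every other datum, including all
    first abnormal data, becomes suspected abnormal data.
    """
    first_normal = set(first_normal)
    initial_normal, suspected = [], []
    for i in range(len(values)):
        neighborhood = nearest_neighbors(values, i, k)
        if i in first_normal and all(j in first_normal for j in neighborhood):
            initial_normal.append(i)
        else:
            suspected.append(i)
    return initial_normal, suspected


values = [10.0, 10.1, 10.2, 20.0, 25.0]  # made-up readings
# Suppose the initial detection marked only index 4 as first abnormal data.
initial_normal, suspected = split_initial(values, first_normal=[0, 1, 2, 3], k=2)
```

Here index 3 is first normal data, but its initial neighborhood contains the first abnormal data at index 4, so it is moved to the suspected abnormal data.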
Only the newly added neighborhood data of each initial normal data are analyzed. No neighborhood is built for the suspected abnormal data as the k value changes; that is, the suspected abnormal data are treated as scattered points without neighborhoods of their own. As the k value is updated, suspected abnormal data are added to the update neighborhoods of the corresponding initial normal data, and updating of the initial k value stops once no isolated suspected abnormal data remains.
Step S3: updating the initial k value to obtain at least two updated k values, obtaining updated outlier factors of each exhaust gas concentration data under each updated k value, and screening out suspected abnormal areas under each updated k value according to distribution conditions of initial normal data and suspected abnormal data in an updated neighborhood of each updated k value of each initial normal data.
Specifically, the embodiment of the invention updates the initial k value according to a preset step length, which is set to 1; the implementer can set the preset step length according to the actual situation, and it is not limited herein. The initial k value is increased successively by the preset step length, and the result of each increase is an updated k value. For example, when the initial k value is 2, the first updated k value is 2 + 1 = 3, the second updated k value is 3 + 1 = 4, and so on; updating stops when no isolated suspected abnormal data remains. The local outlier factor of each exhaust gas concentration data under each updated k value is obtained as an update outlier factor, which makes it convenient to acquire the real abnormal data under each updated k value. To obtain the real abnormal data under each updated k value efficiently, the embodiment of the invention screens out the suspected abnormal regions under each updated k value according to the distribution of initial normal data and suspected abnormal data in the update neighborhood of each initial normal data under that updated k value, and analyzes those suspected abnormal regions to determine the real abnormal data accurately and efficiently.
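The update rule and its stopping condition can be sketched as follows. The sketch reads "no isolated suspected abnormal data" as "every suspected abnormal datum lies in the update neighborhood of at least one initial normal datum", which is an interpretation; it also caps k below the number of data points. The function names and sample values are hypothetical.

```python
def nearest_neighbors(values, i, k):
    """Indices of the k points closest to values[i], excluding i itself."""
    others = [j for j in range(len(values)) if j != i]
    others.sort(key=lambda j: (abs(values[j] - values[i]), j))
    return others[:k]


def updated_k_values(values, initial_normal, suspected, initial_k=2, step=1):
    """Increase k by `step` until no suspected abnormal datum is left isolated."""
    ks = []
    k = initial_k
    remaining = set(suspected)
    # A point has at most len(values) - 1 neighbours, so k stays below that size.
    while remaining and k + step < len(values):
        k += step
        ks.append(k)
        covered = set()
        for center in initial_normal:
            covered.update(nearest_neighbors(values, center, k))
        remaining = set(suspected) - covered
    return ks


values = [10.0, 10.1, 10.2, 20.0, 25.0]  # made-up readings
ks = updated_k_values(values, initial_normal=[0, 1, 2], suspected=[3, 4])
```

For these readings the first updated k value (3) brings index 3 into the update neighborhoods, and the second (4) brings in index 4, at which point updating stops.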
Preferably, the method for acquiring the suspected abnormal region comprises the following steps: for any initial normal data, acquiring the newly added neighborhood data of the initial normal data in the update neighborhood under each updated k value, and when at least one of the newly added neighborhood data is suspected abnormal data, taking the update neighborhood of the initial normal data under the corresponding updated k value as a suspected abnormal region.
As an example, take the first updated k value, i.e., k=3. When k=3, the update neighborhood of each initial normal data contains at least one newly added neighborhood data. When all the newly added neighborhood data in the update neighborhood of an initial normal data are initial normal data, this further confirms that the initial normal data in that update neighborhood are all normal exhaust gas concentration data. When at least one of the newly added neighborhood data in the update neighborhood of an initial normal data is suspected abnormal data, the update neighborhood of that initial normal data under k=3 is taken as a suspected abnormal region. The suspected abnormal regions under k=3 are thus acquired.
And acquiring the suspected abnormal region under each updated k value according to the method for acquiring the suspected abnormal region under k=3.
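The screening of suspected abnormal regions can be sketched as below. The sketch assumes that "newly added neighborhood data" means the data present in the update neighborhood but absent from the neighborhood under the initial k value, and that a region is keyed by its central initial normal data; these choices, the function names, and the sample values are assumptions.

```python
def nearest_neighbors(values, i, k):
    """Indices of the k points closest to values[i], excluding i itself."""
    others = [j for j in range(len(values)) if j != i]
    others.sort(key=lambda j: (abs(values[j] - values[i]), j))
    return others[:k]


def suspected_regions(values, initial_normal, suspected, initial_k, updated_k):
    """Map each central initial normal datum to its suspected abnormal region."""
    suspected = set(suspected)
    regions = {}
    for center in initial_normal:
        before = set(nearest_neighbors(values, center, initial_k))
        update_neighborhood = nearest_neighbors(values, center, updated_k)
        newly_added = [j for j in update_neighborhood if j not in before]
        # One suspected abnormal datum among the new neighbours is enough.
        if any(j in suspected for j in newly_added):
            regions[center] = update_neighborhood
    return regions


values = [10.0, 10.1, 10.2, 20.0, 25.0]  # made-up readings
regions = suspected_regions(values, initial_normal=[0, 1, 2],
                            suspected=[3, 4], initial_k=2, updated_k=3)
```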
Step S4: obtaining a normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value according to the updated outlier factor and the position distribution condition of each suspected abnormal data in the suspected abnormal region under each updated k value; and obtaining real abnormal data under each updated k value according to the difference of the initial outlier factors and the difference of the updated outlier factors between the suspected abnormal data under each updated k value and each element in the corresponding normal data set, and determining the optimal updated k value.
Specifically, in order to obtain an optimal updated k value, the embodiment of the invention analyzes suspected abnormal data in a suspected abnormal region under each updated k value, obtains a normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value, and obtains real abnormal data under each updated k value according to the difference of initial outliers and the difference of updated outliers between the suspected abnormal data under each updated k value and each element in the corresponding normal data set, thereby determining the real abnormal data under each updated k value, and accurately determining the k value in an outlier detection algorithm, namely the optimal updated k value. The method for determining the optimal updated k value is as follows:
(1) A normal data set is obtained.
Preferably, the method for acquiring the normal data set is as follows: optionally selecting an updated k value as a target k value, and sequencing each suspected abnormal data in the suspected abnormal region according to the sequence from the large to the small of updating outlier factors for any suspected abnormal region under the target k value to obtain a suspected abnormal data sequence of the suspected abnormal region; regarding the ith suspected abnormal data in the suspected abnormal data sequence, taking a suspected abnormal region containing the ith suspected abnormal data under the target k value as a target region; taking a set formed by the central initial normal data of each target area as the whole normal data set of the ith suspected abnormal data; taking the whole normal data set of each piece of suspected abnormal data positioned before the ith piece of suspected abnormal data in the suspected abnormal data sequence as a comparison set; and removing the repeated initial normal data in the integral normal data set of the ith suspected abnormal data and all the comparison sets to obtain the integral normal data set of the removed ith suspected abnormal data, wherein the integral normal data set of the ith suspected abnormal data is used as the normal data set of the ith suspected abnormal data.
Taking the first updated k value in step S3, i.e. k=3 as an example, taking the first updated k value as a target k value, and for any suspected abnormal region under the target k value, ordering each suspected abnormal data in the suspected abnormal region according to the order of updating outliers from large to small to obtain a suspected abnormal data sequence of the suspected abnormal region; regarding the ith suspected abnormal data in the suspected abnormal data sequence, taking a suspected abnormal region containing the ith suspected abnormal data under k=3 as a target region; taking a set formed by the central initial normal data of each target area as the whole normal data set of the ith suspected abnormal data; according to the method for acquiring the integral normal data set of the ith suspected abnormal data, acquiring the integral normal data set of each element, namely initial normal data, in the suspected abnormal data sequence, and taking the integral normal data set of each suspected abnormal data, which is positioned before the ith suspected abnormal data, in the suspected abnormal data sequence as a comparison set; and removing repeated initial normal points in the integral normal data set of the ith suspected abnormal data and all comparison sets to obtain the integral normal data set of the removed ith suspected abnormal data, wherein the integral normal data set of the ith suspected abnormal data is used as the normal data set of the ith suspected abnormal data. 
For example, taking a Y-th suspected abnormal region as an example, if two pieces of suspected abnormal data exist in the Y-th suspected abnormal region, namely suspected abnormal data a and suspected abnormal data b, acquiring update outliers of the suspected abnormal data a and the suspected abnormal data b, and sequencing the suspected abnormal data a and the suspected abnormal data b according to the order from the big to the small of the update outliers to obtain a suspected abnormal data sequence of the Y-th suspected abnormal region. If the update outlier factor of the suspected abnormal data a is greater than or equal to the update outlier factor of the suspected abnormal data b, the first element in the obtained suspected abnormal data sequence is the suspected abnormal data a. The suspected abnormal data a and the suspected abnormal data b can be not only newly added neighborhood points of the Y-th suspected abnormal region, but also newly added neighborhood points of other suspected abnormal regions at the same time, so that the suspected abnormal region containing the suspected abnormal data a under k=3 is taken as a target region in the embodiment of the invention; the set formed by the central initial normal data of each target area is used as the whole normal data set of the suspected abnormal data a; and acquiring the whole normal data set of the suspected abnormal data b according to the method for acquiring the whole normal data set of the suspected abnormal data a. Because the update outlier factor of the suspected abnormal data a is greater than or equal to the update outlier factor of the suspected abnormal data b, the more the suspected abnormal data a deviates, the total number of elements in the overall normal data set of the suspected abnormal data a must be less than or equal to the total number of elements in the overall normal data set of the suspected abnormal data b. 
To prevent suspected abnormal data a from interfering with the analysis of suspected abnormal data b, the embodiment of the invention takes the whole normal data set of suspected abnormal data a as the comparison set and removes from the whole normal data set of suspected abnormal data b every element that also appears in the comparison set, i.e., every repeated initial normal point; the set that remains is the normal data set of suspected abnormal data b. Because a is the first element of the sequence and has no comparison set, the whole normal data set of suspected abnormal data a is itself the normal data set of a. If every element of the whole normal data set of b also appears in the comparison set, the updated outlier factor of a is equal to that of b, and in that case only suspected abnormal data a is analyzed: if a is real abnormal data, b is also real abnormal data; if a is not real abnormal data, neither is b.
And according to the method for acquiring the normal data set of the suspected abnormal data a and the normal data set of the suspected abnormal data b under k=3, acquiring the normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value.
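The construction of the normal data sets for one suspected abnormal region can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `regions`, `update_lof`, `whole_normal_set`, and `normal_sets_for_region` are hypothetical, each suspected abnormal region is represented as a mapping from its central initial normal data to the set of points in its update neighborhood, and ties in the updated outlier factor are broken by point identifier only to keep the ordering deterministic.

```python
from typing import Dict, List, Set


def whole_normal_set(point: str, regions: Dict[str, Set[str]]) -> Set[str]:
    """Centers (initial normal data) of every suspected abnormal region containing `point`."""
    return {center for center, members in regions.items() if point in members}


def normal_sets_for_region(
    suspected_in_region: List[str],
    regions: Dict[str, Set[str]],
    update_lof: Dict[str, float],
) -> Dict[str, Set[str]]:
    """Normal data set of each suspected abnormal data in one region at the target k."""
    # Descending updated outlier factor; the identifier tie-break is an assumption.
    sequence = sorted(suspected_in_region, key=lambda p: (-update_lof[p], p))
    seen: Set[str] = set()  # union of all comparison sets so far
    result: Dict[str, Set[str]] = {}
    for point in sequence:
        whole = whole_normal_set(point, regions)
        result[point] = whole - seen  # drop initial normal points already used
        seen |= whole
    return result


# Hypothetical data: region centered on n1 contains a and b, region n2 contains only b.
regions = {"n1": {"a", "b"}, "n2": {"b"}}
update_lof = {"a": 2.0, "b": 1.5}
result = normal_sets_for_region(["a", "b"], regions, update_lof)
print(result)
```

In this toy case a has the larger updated outlier factor, so it keeps its whole normal data set `{"n1"}`, while b loses the shared center and is left with `{"n2"}`.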
(2) Acquiring real abnormal data.
Preferably, the real abnormal data are acquired as follows: under each updated k value, acquire the difference in initial outlier factor between each suspected abnormal data and each element of its corresponding normal data set as a first difference, where each element of the normal data set is initial normal data; acquire the mean of the first differences as a first mean value; acquire the difference in updated outlier factor between each suspected abnormal data and each element of its corresponding normal data set as a second difference; acquire the mean of the second differences as a second mean value; take the difference between the first mean value and the second mean value of each suspected abnormal data as the reference value of that suspected abnormal data; when the reference value is greater than a preset reference threshold, the corresponding suspected abnormal data is real abnormal data.
As an example, take suspected abnormal data a in a suspected abnormal region under the first updated k value, i.e., k=3. Whether suspected abnormal data a is real abnormal data is determined as follows: for each element (i.e., each initial normal data) of the normal data set of a, obtain the absolute value of the difference between its initial outlier factor and that of a as a first difference, and take the mean of the first differences as the first mean value; likewise, obtain the absolute value of the difference between the updated outlier factor of each element and that of a as a second difference, and take the mean of the second differences as the second mean value. If suspected abnormal data a is not an abnormal point, its initial outlier factor is very close to its updated outlier factor, and the first mean value and the second mean value are also very close. If suspected abnormal data a is real abnormal data, its initial outlier factor is large while its updated outlier factor is comparatively small, so the difference between the two is large, and the difference between the first mean value and the second mean value is also large. The difference between the first mean value and the second mean value of suspected abnormal data a is therefore taken as the reference value of a.
The larger the reference value of suspected abnormal data a, the further the first mean value exceeds the second mean value, and the more likely a is an abnormal point; here the first mean value is greater than or equal to the second mean value. In the embodiment of the invention the preset reference threshold is set to 0.1; the implementer may set the reference threshold according to the actual situation, and it is not limited herein. When the reference value of suspected abnormal data a is greater than the preset reference threshold, a is real abnormal data. When the reference value of suspected abnormal data a is less than or equal to the preset reference threshold, a is normal exhaust gas concentration data.
Following the method for determining whether suspected abnormal data a is real abnormal data, each suspected abnormal data in each suspected abnormal region under k=3 is analyzed to determine whether it is real abnormal data, which gives the number of real abnormal data under k=3.
According to the method of acquiring the number of real abnormal data at k=3, the number of real abnormal data at each updated k value is acquired.
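The reference value test described above can be sketched as follows. This is an illustrative sketch only: the function names and the dictionaries `initial_lof` and `update_lof` (point identifier to local outlier factor) are assumptions, and the outlier factor values are made up. An empty normal data set is rejected here, since in that case the text has the point inherit the verdict of the earlier point in the sequence.

```python
from statistics import fmean
from typing import Dict, Set


def reference_value(
    point: str,
    normal_set: Set[str],
    initial_lof: Dict[str, float],
    update_lof: Dict[str, float],
) -> float:
    """First mean value minus second mean value for one suspected abnormal data."""
    if not normal_set:
        raise ValueError("empty normal data set: reuse the verdict of the earlier point")
    first_mean = fmean([abs(initial_lof[point] - initial_lof[n]) for n in normal_set])
    second_mean = fmean([abs(update_lof[point] - update_lof[n]) for n in normal_set])
    return first_mean - second_mean


def is_real_abnormal(point, normal_set, initial_lof, update_lof, threshold=0.1):
    """True when the reference value exceeds the preset reference threshold (0.1 in the embodiment)."""
    return reference_value(point, normal_set, initial_lof, update_lof) > threshold


# Hypothetical outlier factors at the initial k and at the updated k.
initial_lof = {"a": 3.0, "c": 1.05, "n1": 1.0, "n2": 1.2}
update_lof = {"a": 1.6, "c": 1.04, "n1": 1.0, "n2": 1.1}

ref_a = reference_value("a", {"n1", "n2"}, initial_lof, update_lof)
flag_a = is_real_abnormal("a", {"n1", "n2"}, initial_lof, update_lof)
flag_c = is_real_abnormal("c", {"n1"}, initial_lof, update_lof)
print(ref_a, flag_a, flag_c)
```

For point a the first mean value is about 1.9 and the second about 0.55, so the reference value is well above 0.1; for point c both means are nearly equal and the point is treated as normal.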
(3) Obtaining the optimal updated k value.
Specifically, the more real abnormal data are found under an updated k value, the more accurately abnormal exhaust gas concentration data are identified under that updated k value. The embodiment of the invention therefore takes the updated k value with the largest number of real abnormal data as the optimal updated k value of the outlier detection algorithm. If at least two updated k values share the largest number of real abnormal data, the largest of those updated k values is selected as the optimal updated k value of the outlier detection algorithm.
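The selection rule, including the tie-break toward the largest k, can be written in a few lines. The function name `optimal_updated_k` and the counts below are hypothetical and serve only to illustrate the rule.

```python
from typing import Dict


def optimal_updated_k(real_counts: Dict[int, int]) -> int:
    """Pick the updated k with the most real abnormal data; ties go to the largest k."""
    best_count = max(real_counts.values())
    return max(k for k, count in real_counts.items() if count == best_count)


# Hypothetical number of real abnormal data found under each updated k value.
real_counts = {3: 4, 4: 6, 5: 6, 6: 2}
best_k = optimal_updated_k(real_counts)
print(best_k)
```

Here k=4 and k=5 both yield six real abnormal data, so the larger value, 5, is chosen.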
Step S5: acquiring abnormal exhaust gas concentration data according to the optimal updated k value.
Specifically, based on the optimal updated k value, the local outlier factor of each exhaust gas concentration data is acquired through the outlier detection algorithm and taken as the optimal outlier factor; the mean of the optimal outlier factors is acquired as the optimal mean value; and the exhaust gas concentration data whose optimal outlier factor is greater than the optimal mean value are marked as abnormal exhaust gas concentration data. At this point the abnormal exhaust gas concentration data at the current moment have been accurately identified, so that staff can handle them in time and avoid harm to the environment and human health from abnormal exhaust gas concentrations.
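Step S5 can be illustrated with a small, self-contained local outlier factor (LOF) computation on one-dimensional concentration values. This sketch assumes the outlier detection algorithm is the standard LOF definition (k-distance, reachability distance, local reachability density), uses absolute difference as the distance, and assumes the readings contain no more than k identical values so that no density becomes infinite. The function names and the sample readings are hypothetical.

```python
from statistics import fmean
from typing import List


def local_outlier_factors(values: List[float], k: int) -> List[float]:
    """Standard LOF for 1-D data; assumes fewer than k+1 identical readings."""
    n = len(values)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of readings")
    k_dist: List[float] = []
    neighbors: List[List[int]] = []
    for i in range(n):
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        kd = dists[k - 1][0]  # distance to the k-th nearest neighbor
        k_dist.append(kd)
        neighbors.append([j for d, j in dists if d <= kd])
    lrd: List[float] = []
    for i in range(n):
        # Reachability distance of i from each neighbor j.
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    return [fmean([lrd[j] for j in neighbors[i]]) / lrd[i] for i in range(n)]


def abnormal_readings(values: List[float], k: int) -> List[float]:
    """Readings whose LOF at the given k is greater than the mean LOF."""
    lof = local_outlier_factors(values, k)
    mean_lof = fmean(lof)
    return [v for v, f in zip(values, lof) if f > mean_lof]


# Hypothetical exhaust gas concentration readings and a hypothetical optimal k.
readings = [10.0, 10.2, 9.9, 10.1, 10.3, 25.0]
flagged = abnormal_readings(readings, k=3)
print(flagged)
```

The five clustered readings have outlier factors close to 1, while the isolated reading has a much larger factor and is the only one above the mean.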
This completes the method of the present invention.
In summary, the embodiment of the present invention acquires the exhaust gas concentration data; sets an initial k value of the outlier detection algorithm, acquires the initial outlier factor of each exhaust gas concentration data, and determines the initial normal data and the suspected abnormal data; updates the initial k value, acquires the real abnormal data under each updated k value according to the distribution of the initial normal data and the suspected abnormal data in the update neighborhood under each updated k value, and determines the optimal updated k value; and acquires the abnormal exhaust gas concentration data according to the optimal updated k value. By choosing the optimal k value of the outlier detection algorithm from the number of real abnormal data under each updated k value, the method monitors abnormal exhaust gas concentrations accurately so that harm caused by the exhaust gas can be prevented in time.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments.

Claims (10)

1. The method for collecting and processing the environmental protection monitoring data of the hard carbon process is characterized by comprising the following steps of:
acquiring exhaust gas concentration data at different moments in a set time period;
setting an initial k value of an outlier detection algorithm, and acquiring a local outlier factor of each exhaust gas concentration data under the initial k value as an initial outlier factor; performing anomaly detection on the exhaust gas concentration data according to the initial outlier factors, and determining initial normal data and suspected abnormal data according to the abnormal distribution condition of the data in the anomaly detection results;
updating the initial k value to obtain at least two updated k values, obtaining updated outlier factors of each waste gas concentration data under each updated k value, and screening out suspected abnormal areas under each updated k value according to the distribution condition of the initial normal data and the suspected abnormal data in the updated neighborhood of each updated k value of each initial normal data;
obtaining a normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value according to the updated outlier factor and the position distribution condition of each suspected abnormal data in the suspected abnormal region under each updated k value; obtaining real abnormal data under each updated k value according to the difference of the initial outlier factors and the difference of the updated outlier factors between the suspected abnormal data under each updated k value and each element in the corresponding normal data set, and determining an optimal updated k value;
and acquiring abnormal exhaust gas concentration data according to the optimal updated k value.
2. The method for collecting and processing environmental monitoring data of hard carbon process according to claim 1, wherein the method for detecting the abnormality of the exhaust gas concentration data according to the initial outlier factor comprises the following steps:
acquiring the average value of the initial outlier factors as an initial target average value;
when the initial outlier factor is larger than or equal to the initial target mean value, the corresponding exhaust gas concentration data is used as first abnormal data;
when the initial outlier factor is smaller than the initial target mean, the corresponding exhaust gas concentration data is used as first normal data.
3. The method for collecting and processing environmental protection monitoring data of hard carbon process according to claim 2, wherein the method for determining initial normal data and suspected abnormal data according to abnormal distribution conditions of data in abnormal detection results is as follows:
acquiring an initial neighborhood of each first normal data under an initial k value as an initial normal neighborhood;
when each neighborhood data in the initial normal neighborhood is first normal data, taking the central first normal data of the initial normal neighborhood as initial normal data;
when at least one first abnormal data exists among the neighborhood data in the initial normal neighborhood, taking the central first normal data of the initial normal neighborhood as suspected abnormal data; wherein the first abnormal data is also used as suspected abnormal data.
4. The method for acquiring and processing the environmental protection monitoring data of the hard carbon process according to claim 1, wherein the method for acquiring the suspected abnormal region is as follows:
and for any initial normal data, acquiring newly added neighborhood data of the initial normal data in an update neighborhood of each update k value, and when at least one of the newly added neighborhood data is suspected abnormal data, taking the update neighborhood of the initial normal data in the corresponding update k value as a suspected abnormal region.
5. The method for acquiring and processing the environmental protection monitoring data of the hard carbon process, wherein the method for obtaining the normal data set of each suspected abnormal data in the suspected abnormal region under each updated k value, according to the updated outlier factor and the position distribution condition of each suspected abnormal data in the suspected abnormal region under each updated k value, is characterized by comprising the following steps:
selecting any updated k value as a target k value, and, for any suspected abnormal region under the target k value, sorting the suspected abnormal data in the suspected abnormal region in descending order of updated outlier factor to obtain a suspected abnormal data sequence of the suspected abnormal region;
regarding the ith suspected abnormal data in the suspected abnormal data sequence, taking a suspected abnormal region containing the ith suspected abnormal data under the target k value as a target region;
taking a set formed by the central initial normal data of each target area as the whole normal data set of the ith suspected abnormal data;
taking the whole normal data set of each piece of suspected abnormal data positioned before the ith piece of suspected abnormal data in the suspected abnormal data sequence as a comparison set;
and removing from the whole normal data set of the ith suspected abnormal data the initial normal data that also appear in any of the comparison sets, and taking the remaining set as the normal data set of the ith suspected abnormal data.
6. The method for acquiring and processing the environmental protection monitoring data of the hard carbon process according to claim 5, wherein the method for acquiring the real abnormal data is as follows:
acquiring the difference of initial outlier factors between each suspected abnormal data and each element in the corresponding normal data set under each updated k value as a first difference; wherein each element in the normal data set is initial normal data;
acquiring the average value of the first difference as a first average value;
acquiring differences of updating outlier factors between each suspected abnormal data and each element in the corresponding normal data set under each updated k value as second differences;
acquiring the average value of the second difference as a second average value;
obtaining a difference value between a first mean value and a second mean value of each suspected abnormal data as a reference value of the corresponding suspected abnormal data;
when the reference value is larger than a preset reference threshold value, the corresponding suspected abnormal data are real abnormal data.
7. The method for acquiring and processing the environmental protection monitoring data of the hard carbon process according to claim 1, wherein the method for acquiring the optimal updated k value is as follows:
acquiring the number of real abnormal data under each updated k value as the abnormal number;
and taking the updated k value corresponding to the maximum abnormal number as the optimal updated k value.
8. The method for acquiring and processing environmental protection monitoring data of hard carbon process according to claim 1, wherein the method for acquiring abnormal exhaust gas concentration data according to the optimal updated k value is as follows:
based on the optimal updated k value, acquiring a local outlier factor of each exhaust gas concentration data by an outlier detection algorithm as an optimal outlier factor;
acquiring the average value of the optimal outlier factors as an optimal average value;
and marking the exhaust gas concentration data corresponding to the optimal outlier factor larger than the optimal average value as abnormal exhaust gas concentration data.
9. The method for collecting and processing environmental protection monitoring data of hard carbon process as claimed in claim 1, wherein the process of updating the initial k value is as follows: and sequentially increasing the initial k value according to a preset step length, wherein the result of each increase is an updated k value.
10. The method for acquiring and processing the environmental protection monitoring data of the hard carbon process according to claim 1, wherein the method for acquiring the updated outlier factor is as follows:
and obtaining the local outlier factor of each exhaust gas concentration data at each updated k value as an updated outlier factor.
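The earlier steps recited in claims 2 to 4 and 9 can be sketched as follows. This is an illustrative sketch under assumptions, not the claimed implementation: outlier factors and neighbor lists are taken as precomputed inputs, all names are hypothetical, and "newly added neighborhood data" is interpreted as the points present in the update neighborhood but absent from the initial neighborhood.

```python
from statistics import fmean
from typing import Dict, List, Set, Tuple


def classify_initial(
    initial_lof: Dict[str, float], initial_neighbors: Dict[str, List[str]]
) -> Tuple[Set[str], Set[str]]:
    """Split points into initial normal data and suspected abnormal data."""
    mean_lof = fmean(initial_lof.values())
    first_abnormal = {p for p, f in initial_lof.items() if f >= mean_lof}
    # A first normal point stays normal only if its whole neighborhood is normal.
    initial_normal = {
        p for p in initial_lof
        if p not in first_abnormal and not (set(initial_neighbors[p]) & first_abnormal)
    }
    return initial_normal, set(initial_lof) - initial_normal


def updated_k_values(initial_k: int, step: int, count: int) -> List[int]:
    """Increase the initial k by a preset step; each result is an updated k."""
    return [initial_k + step * i for i in range(1, count + 1)]


def suspected_regions(
    initial_normal: Set[str],
    suspected: Set[str],
    initial_neighbors: Dict[str, List[str]],
    updated_neighbors: Dict[str, List[str]],
) -> Dict[str, Set[str]]:
    """Update neighborhoods of initial normal data that gained a suspected point."""
    regions: Dict[str, Set[str]] = {}
    for center in initial_normal:
        added = set(updated_neighbors[center]) - set(initial_neighbors[center])
        if added & suspected:
            regions[center] = set(updated_neighbors[center])
    return regions


# Hypothetical outlier factors and neighbor lists for four readings.
initial_lof = {"p1": 1.0, "p2": 1.0, "p3": 1.1, "p4": 3.0}
initial_neighbors = {"p1": ["p2", "p3"], "p2": ["p1", "p3"],
                     "p3": ["p2", "p4"], "p4": ["p3", "p2"]}
updated_neighbors = {"p1": ["p2", "p3", "p4"], "p2": ["p1", "p3", "p4"],
                     "p3": ["p1", "p2", "p4"], "p4": ["p1", "p2", "p3"]}

initial_normal, suspected = classify_initial(initial_lof, initial_neighbors)
regions = suspected_regions(initial_normal, suspected, initial_neighbors, updated_neighbors)
print(initial_normal, suspected, sorted(regions), updated_k_values(2, 1, 3))
```

In this toy data p4 is above the mean outlier factor, p3 has p4 in its initial neighborhood, and both p1 and p2 gain p4 as a new neighbor, so their update neighborhoods become suspected abnormal regions.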
CN202311253322.2A 2023-09-27 2023-09-27 Hard carbon process environment-friendly monitoring data acquisition and processing method Active CN116992391B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311253322.2A CN116992391B (en) 2023-09-27 2023-09-27 Hard carbon process environment-friendly monitoring data acquisition and processing method


Publications (2)

Publication Number Publication Date
CN116992391A CN116992391A (en) 2023-11-03
CN116992391B true CN116992391B (en) 2023-12-15

Family

ID=88525202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311253322.2A Active CN116992391B (en) 2023-09-27 2023-09-27 Hard carbon process environment-friendly monitoring data acquisition and processing method

Country Status (1)

Country Link
CN (1) CN116992391B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117314020B (en) * 2023-11-28 2024-02-27 生态环境部华南环境科学研究所(生态环境部生态环境应急研究所) Wetland carbon sink data monitoring system of plankton

Also Published As

Publication number Publication date
CN116992391A (en) 2023-11-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant