CN116304573A - Abnormal data cleaning method and device, electronic equipment and storage medium - Google Patents
- Publication number
- CN116304573A (application CN202310313337.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- power curve
- wind power
- image
- processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E10/00—Energy generation through renewable energy sources
- Y02E10/70—Wind energy
- Y02E10/72—Wind turbines with rotation axis in wind direction
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Wind Motors (AREA)
Abstract
The invention discloses an abnormal data cleaning method and device, an electronic device, and a storage medium. The method comprises: acquiring original operation data of a wind turbine generator, and generating an original wind power curve image based on the original operation data; marking and filtering first-type abnormal data and/or second-type abnormal data included in the original operation data, based on the original operation data and the original wind power curve image, to obtain operation data to be processed; generating a wind power curve image to be processed based on the operation data to be processed; and marking and filtering third-type abnormal data included in the operation data to be processed, based on the wind power curve image to be processed, to obtain target operation data corresponding to the wind turbine generator. This technical scheme accurately detects and marks different types of abnormal data while preserving the smoothness of the resulting wind power curve as far as possible.
Description
Technical Field
The present invention relates to the field of power engineering technologies, and in particular, to a method and apparatus for cleaning abnormal data, an electronic device, and a storage medium.
Background
Wind energy is a renewable energy source that has recently attracted attention and been widely adopted in countries around the world. Unlike conventional power generation, wind power generation reduces environmental pollution and is consistent with sustainable development strategies. A supervisory control and data acquisition (SCADA) system collects data from the wind turbine generator that indicate its operating condition. However, the various types of abnormal data contained in SCADA records pose challenges for the operation and maintenance of wind turbine generators. Detecting and cleaning abnormal data from wind turbine generators is therefore critical to the operation of a wind farm.
Existing methods for detecting and cleaning abnormal data generally fall into statistical methods and wind power curve (WPC) modeling methods. Statistical methods rely on differences between the statistical characteristics of abnormal data and normal data, including clustering characteristics and distributions. WPC modeling methods use WPC features extracted from a large amount of SCADA data to clean abnormal data, and can be divided into numerical-data-based methods and image-based methods.
However, statistical methods perform acceptably on scattered abnormal data but produce large errors on stacked abnormal data, mainly because they overlook part of the abnormal data and mistakenly delete normal data. The abnormal data recognition performance of WPC modeling methods depends on reference images and prior knowledge.
Disclosure of Invention
The invention provides an abnormal data cleaning method and device, an electronic device, and a storage medium, which can accurately detect and mark different types of abnormal data, preserve the smoothness of the resulting wind power curve as far as possible, and improve the efficiency and accuracy of abnormal data cleaning.
According to an aspect of the present invention, there is provided an abnormal data cleaning method, the method comprising:
acquiring original operation data of a wind turbine generator; generating an original wind power curve image based on the original operation data;
marking and filtering first-type abnormal data and/or second-type abnormal data included in the original operation data based on the original operation data and the original wind power curve image to obtain operation data to be processed;
generating a wind power curve image to be processed based on the operation data to be processed, and marking and filtering third-type abnormal data included in the operation data to be processed based on the wind power curve image to be processed to obtain target operation data corresponding to the wind turbine generator.
According to another aspect of the present invention, there is provided an abnormal data cleaning apparatus comprising:
The data acquisition module is used for acquiring the original operation data of the wind turbine generator; generating an original wind power curve image based on the original operation data;
the data filtering module is used for marking and filtering first-type abnormal data and/or second-type abnormal data included in the original operation data based on the original operation data and the original wind power curve image to obtain operation data to be processed;
the target operation data determining module is used for generating a wind power curve image to be processed based on the operation data to be processed, and for marking and filtering third-type abnormal data included in the operation data to be processed based on the wind power curve image to be processed to obtain target operation data corresponding to the wind turbine generator.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the abnormal data cleaning method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute the method for cleaning abnormal data according to any one of the embodiments of the present invention.
According to the technical scheme, the original operation data of the wind turbine generator are obtained, and an original wind power curve image is generated based on the original operation data. The first-type abnormal data and/or second-type abnormal data included in the original operation data are then marked and filtered, based on the original operation data and the original wind power curve image, to obtain the operation data to be processed. Finally, a wind power curve image to be processed is generated based on the operation data to be processed, and the third-type abnormal data included in the operation data to be processed are marked and filtered based on that image to obtain the target operation data corresponding to the wind turbine generator. This solves the problems of the prior art, namely that large errors arise when stacked abnormal data are processed, that part of the abnormal data is overlooked, that normal data are mistakenly deleted, and that reference images and prior knowledge are relied upon. The scheme accurately detects and marks different types of abnormal data, preserves the smoothness of the resulting wind power curve as far as possible, and improves the efficiency and accuracy of abnormal data cleaning. It also distinguishes multiple types of abnormal data without relying on a reference image and with a reduced requirement for prior knowledge.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an abnormal data cleaning method according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of an original wind power curve image according to a first embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an abnormal data cleaning apparatus according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device implementing the method for cleaning abnormal data according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of an abnormal data cleaning method provided in the first embodiment of the present invention. This embodiment is applicable to identifying and filtering different types of abnormal operation data in the operation data of a wind turbine generator. The method may be performed by an abnormal data cleaning device, which may be implemented in hardware and/or software and may be configured in a terminal and/or a server. As shown in Fig. 1, the method includes:
S110, acquiring original operation data of a wind turbine generator; and generating an original wind power curve image based on the original operation data.
The wind turbine generator refers to a system or device that converts the kinetic energy of wind into electric energy, and comprises a rotor and a generator. The original operation data may be data representing the operating condition of the wind turbine generator, and may be used to characterize the numerical relationships among the various data produced by the wind turbine generator during operation. The original operation data may include features such as wind speed, yaw angle, rotational speed, yaw misalignment error, wind direction, and wind power. The original wind power curve image may be a three-dimensional image obtained by mapping the original operation data into a three-dimensional image space, and characterizes the distribution of wind power against wind speed. The color of each pixel point in the original wind power curve image can represent how frequently the corresponding data occur in the original operation data.
In practical application, the original operation data may include normal data and/or abnormal data. When the wind turbine generator operates normally, the normal data are distributed along the main wind speed-wind power curve. However, factors such as sensor faults, manual instructions, or wind curtailment commands issued for a given region may cause abnormal operation data. Therefore, to analyze the operation of the wind turbine generator accurately, the original operation data may be cleaned to obtain the final target operation data.
It should be noted that the original operation data may be obtained from a wind turbine generator operation data acquisition system, from a local database, or through other data acquisition methods, which is not limited in this embodiment. For example, the original operation data of the wind turbine generator may be obtained through a supervisory control and data acquisition (SCADA) system.
In practical application, after the original operation data of the wind turbine generator are obtained, they can be mapped into a three-dimensional image space. Based on the functional relationship between wind speed and wind power, a three-dimensional image is established whose abscissa is wind speed, whose ordinate is wind power, and whose third (vertical) axis is the frequency of data points in the original operation data; this image can be used as the original wind power curve image. Fig. 2 is a schematic diagram of an original wind power curve image.
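The mapping from operation data to a frequency image can be illustrated with a minimal sketch. It assumes the data are available as (wind speed, wind power) pairs and bins them on a regular grid; the function name `build_frequency_grid`, the axis limits, and the bin count are hypothetical choices for illustration and are not specified by this embodiment. A color map applied to the resulting counts would then give an image in which color represents frequency.

```python
from collections import Counter

def build_frequency_grid(samples, v_max, p_max, p_min=0.0, n_bins=50):
    """Count how many (wind speed, power) samples fall into each grid cell.

    samples: iterable of (v, p) pairs.
    v_max, p_min, p_max: assumed axis limits; samples outside them are skipped.
    Returns a Counter mapping (col, row) -> frequency, where col indexes
    wind speed and row indexes wind power.
    """
    grid = Counter()
    for v, p in samples:
        if not (0 <= v <= v_max and p_min <= p <= p_max):
            continue
        # Clamp so that a sample exactly on the upper limit lands in the last bin.
        col = min(int(v * n_bins / v_max), n_bins - 1)
        row = min(int((p - p_min) * n_bins / (p_max - p_min)), n_bins - 1)
        grid[(col, row)] += 1
    return grid

# Hypothetical samples: three near the main curve in one cell, two elsewhere.
samples = [(5.0, 300.0), (5.1, 310.0), (5.05, 305.0), (12.0, 1500.0), (8.0, 20.0)]
grid = build_frequency_grid(samples, v_max=25.0, p_max=2000.0)
```

Cells with high counts correspond to the main body of the wind power curve, while isolated cells with low counts are candidates for scattered abnormal data.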
S120, marking and filtering first-type abnormal data and/or second-type abnormal data included in the original operation data based on the original operation data and the original wind power curve image to obtain operation data to be processed.
In this embodiment, the first type of abnormal data may be shutdown abnormal data, that is, data generated while the wind turbine generator is shut down for maintenance, a fault, or similar reasons. The first type of abnormal data usually overlaps at the bottom of the wind power curve, with wind power values concentrated near 0. For example, the first type of abnormal data may be data for which the wind power is close to 0 or negative while the wind speed is greater than the cut-in wind speed; when the wind turbine generator draws power from the grid, the wind power value is negative or close to 0. The first type of abnormal data can be expressed as $\{(v, p) \mid v > v_{\text{cut-in}} \text{ and } p \le 0\}$, where $v$ denotes wind speed, $p$ denotes wind power, and $v_{\text{cut-in}}$ denotes the cut-in wind speed.
In this embodiment, the second type of abnormal data may be discrete abnormal data, which exhibits a certain randomness in numerical value, and is specifically expressed as deviating from the wind power curve. In the original wind power curve image, the second type of abnormal data are scattered outside the wind power curve main body, and meanwhile, the frequency of each data point in the second type of abnormal data is obviously lower than that of each data point in the normal operation data in the data frequency dimension. In general, the second type of abnormal data is data generated due to a sensor failure, a communication failure, or the like.
In practical application, in order to accurately analyze the operation condition of the wind turbine generator, after the original operation data are obtained, the abnormal data included in the original operation data can be cleaned and filtered, specifically, the first type abnormal data and/or the second type abnormal data included in the original operation data can be determined according to the numerical characteristics of the original operation data and the distribution characteristics of the corresponding original wind power curve image, and further, the abnormal data are marked and filtered to obtain the filtered original operation data, and the filtered original operation data can be used as the operation data to be processed.
In practical application, the original operation data may include only the first type of abnormal data or only the second type of abnormal data, or may include both the first type of abnormal data and the second type of abnormal data, and if the original operation data includes both the first type of abnormal data and the second type of abnormal data, the data filtering step sequence of the two types of abnormal data is not specifically limited, that is, the first type of abnormal data may be marked and filtered first, and then the second type of abnormal data may be marked and filtered according to the filtered original operation data; the second type of abnormal data can be marked and filtered firstly, and then the first type of abnormal data is marked and filtered according to the filtered original operation data; the first type of abnormal data and the second type of abnormal data can be marked and filtered at the same time, so that the operation data to be processed is obtained.
In practical applications, different types of abnormal data may correspond to different cleaning methods; the cleaning methods for the first type and the second type of abnormal data are described below.
Optionally, for the first type of abnormal data, marking and filtering the first type of abnormal data included in the original operation data based on the original operation data and the original wind power curve image, including:
Marking and filtering the first type of abnormal data included in the original operation data based on the original operation data and a predetermined first type of abnormal data marking condition.
In this embodiment, the first-type abnormal data marking condition may be an abnormal data marking condition determined according to a numerical characteristic of the first-type abnormal data. For example, the first type of abnormal data marking condition may be that a wind speed is greater than a cut-in wind speed of the wind turbine generator, and wind power is less than or equal to 0.
In practical application, after the original operation data are obtained, the values of wind speed and wind power of each group in the original operation data can be compared with the predetermined first type of abnormal data marking conditions, each group of operation data meeting the first type of abnormal data marking conditions is screened out, the operation data are used as first type of abnormal data, the first type of abnormal data are marked, the first type of abnormal data are filtered from the original operation data, and meanwhile, the first type of abnormal data can be removed from the original wind power curve image.
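The first-type marking condition (wind speed above the cut-in wind speed and wind power less than or equal to 0) can be sketched as a simple filter. The function name `split_first_type` and the cut-in wind speed of 3.0 are assumptions made for illustration only; the actual cut-in wind speed depends on the wind turbine generator.

```python
def split_first_type(samples, v_cut_in):
    """Separate records meeting the first-type marking condition from the rest.

    A record (v, p) is flagged when v > v_cut_in and p <= 0.
    Returns (flagged, kept), both as lists of (v, p) pairs in input order.
    """
    flagged, kept = [], []
    for v, p in samples:
        if v > v_cut_in and p <= 0:
            flagged.append((v, p))
        else:
            kept.append((v, p))
    return flagged, kept

# Hypothetical records; v_cut_in=3.0 is an assumed value.
records = [(2.0, 0.0), (6.0, -5.0), (7.5, 0.0), (9.0, 850.0)]
flagged, kept = split_first_type(records, v_cut_in=3.0)
```

The record (2.0, 0.0) is kept because zero power below the cut-in wind speed is expected behavior rather than a shutdown anomaly.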
Optionally, for the second type of abnormal data, marking and filtering the second type of abnormal data included in the original operation data based on the original operation data and the original wind power curve image, including: performing color space transformation on the original wind power curve image to obtain an image to be processed; and marking and filtering second-class abnormal data included in the original operation data based on the image to be processed.
The color space transformation may be an image processing method in which an image is converted from its current color space to another color space. As those skilled in the art will understand, a color can generally be described by three independent attributes; representing each attribute by its own variable forms a spatial coordinate system, and the space corresponding to that coordinate system is a color space. Optionally, the color space may include an RGB color space, a YUV color space, an HSV color space, an HSI color space, and so on.
Typically, a color image is made up of three channels, red (R), green (G), and blue (B), and each channel consists of pixel values in the interval [0, 255]. In practical application, the original wind power curve image is an RGB image composed of the pixel values of the R, G, and B channels. In an RGB image, hue, brightness, and saturation are expressed jointly, which makes it difficult to analyze the image by any one of these features alone. The original wind power curve image can therefore be transformed from the RGB color space into another color space. In this embodiment, the original wind power curve image may be converted into an image in the HSV color space, represented by the pixel values of the H, S, and V channels. HSV represents the points of the RGB color space in an inverted cone. HSV stands for hue, saturation, and value; it may also be referred to as HSB, where B stands for brightness. Hue is the basic attribute of a color, that is, the name of the color, such as red, blue, or yellow. Saturation refers to the purity of the color: the higher the saturation, the purer the color, and the lower the saturation, the closer the color is to gray; its value range can be 0 to 100%. Value (brightness) refers to how bright the color is and can range from 0 to max. The HSV color space can be described by a conical model: at the apex of the cone, V = 0 and H and S are undefined, representing black; at the center of the top surface of the cone, V = max, S = 0, and H is undefined, representing white.
In practical application, after an original wind power curve image is obtained, color space transformation processing can be performed on the image so as to convert the image from an RGB color space to an HSV color space, and further, a corresponding HSV image is obtained, and the HSV image can be used as an image to be processed.
For example, the process of converting an original wind power curve image from the RGB color space to the HSV color space may be expressed based on the following formula:
$V = C_{\max}$
where $C_{\max} = \max\{R, G, B\}$ and $C_{\min} = \min\{R, G, B\}$.
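As a hedged illustration of the conversion, the sketch below uses the standard RGB-to-HSV mapping from Python's `colorsys` module, in which V equals the largest of the three channel values, consistent with the definition above. The 8-bit scaling (H in [0, 180), S and V in [0, 255]) and the function name `rgb_to_hsv_8bit` are assumptions for illustration; this embodiment does not state which scaling it uses.

```python
import colorsys

def rgb_to_hsv_8bit(r, g, b):
    """Convert 8-bit RGB to HSV scaled as H in [0, 180), S and V in [0, 255].

    colorsys works on floats in [0, 1]; the output scaling here is an
    assumed convention, not one stated in the text.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))

blue = rgb_to_hsv_8bit(0, 0, 255)
white = rgb_to_hsv_8bit(255, 255, 255)
black = rgb_to_hsv_8bit(0, 0, 0)
```

For white and black the hue is mathematically undefined; `colorsys` reports it as 0 in those cases.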
Further, after the image to be processed is obtained, the image to be processed may be analyzed, so as to determine the second type of abnormal data included in the original operation data based on the pixel point numerical characteristics represented in the image to be processed and the numerical characteristics corresponding to the second type of abnormal data, so that the second type of abnormal data may be labeled and filtered.
Optionally, marking and filtering the second type of abnormal data included in the original operation data based on the image to be processed includes: comparing the pixel value of each pixel point in the image to be processed with a preset pixel value interval to determine at least one target pixel point; and mapping each target pixel point into an original wind power curve image, marking, and taking marked original operation data as second-class abnormal data and filtering.
In this embodiment, the pixel value of each pixel point in the image to be processed may be a value composed of the H, S, and V channels, that is, an HSV value. The preset pixel value interval may be a preset value interval used to select the pixel points corresponding to abnormal data of the relevant type. Because the pixel value of each pixel point in the image to be processed is composed of three channels, the preset pixel value interval likewise consists of one value interval per channel. For example, the preset pixel value interval may be $H \in [100, 124]$, $S \in [43, 255]$, $V \in [46, 255]$.
In practical application, the pixel value of each pixel point in the image to be processed can be compared with the preset pixel value interval. If the pixel value falls within the preset pixel value interval, the pixel point is selected as a target pixel point. After every pixel point has been compared and at least one target pixel point has been selected, each target pixel point can be mapped into the original wind power curve image so that its position is marked there. The original operation data corresponding to the marked positions can be taken as second-type abnormal data and filtered out, and the second-type abnormal data can also be removed from the original wind power curve image.
Illustratively, assume that the pixel value of a pixel point in the image to be processed is $HSV_i = (H_i, S_i, V_i)$ and that the preset pixel value interval is $H \in [100, 124]$, $S \in [43, 255]$, $V \in [46, 255]$. $HSV_i$ is compared with the preset pixel value interval, and if $HSV_i$ falls within the interval, the pixel point is taken as a target pixel point.
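The comparison and mapping steps can be sketched as follows. The sketch assumes the HSV image is held as a dictionary from pixel coordinates to HSV triples and that a function `cell_of` (hypothetical) maps each (wind speed, wind power) record to its pixel; the grid resolution and sample values are illustrative only.

```python
# Preset interval quoted in the text: H in [100, 124], S in [43, 255], V in [46, 255].
H_RANGE, S_RANGE, V_RANGE = (100, 124), (43, 255), (46, 255)

def in_preset_interval(hsv):
    """Return True when every channel lies inside its preset interval."""
    h, s, v = hsv
    return (H_RANGE[0] <= h <= H_RANGE[1]
            and S_RANGE[0] <= s <= S_RANGE[1]
            and V_RANGE[0] <= v <= V_RANGE[1])

def find_target_pixels(hsv_image):
    """hsv_image: dict mapping (col, row) -> (H, S, V). Returns the target pixels."""
    return {cell for cell, hsv in hsv_image.items() if in_preset_interval(hsv)}

def split_second_type(samples, cell_of, targets):
    """Flag the records whose pixel is a target pixel; keep the others."""
    flagged = [s for s in samples if cell_of(s) in targets]
    kept = [s for s in samples if cell_of(s) not in targets]
    return flagged, kept

# Hypothetical 50 x 50 grid over wind speed 0-25 and wind power 0-2000.
def cell_of(sample):
    v, p = sample
    return (int(v * 50 / 25.0), int(p * 50 / 2000.0))

hsv_image = {(10, 7): (15, 200, 230), (16, 0): (110, 180, 200), (24, 37): (60, 40, 255)}
targets = find_target_pixels(hsv_image)
samples = [(5.0, 300.0), (8.0, 20.0), (12.0, 1500.0)]
flagged, kept = split_second_type(samples, cell_of, targets)
```

Only the pixel (16, 0) has all three channels inside the preset interval, so only the record that maps to it is flagged.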
S130, generating a wind power curve image to be processed based on the operation data to be processed, and marking and filtering third abnormal data included in the operation data to be processed based on the wind power curve image to be processed to obtain target operation data corresponding to the wind turbine generator.
In this embodiment, the wind power curve image to be processed may be a three-dimensional image in which the operation data to be processed is mapped into a three-dimensional image space for characterizing a distribution characteristic between wind speed and wind power. The color of each pixel point in the wind power curve image to be processed can represent the frequency of occurrence of data in the running data to be processed. The third type of anomaly data may be stacked anomaly data that generally occurs over a continuous period of time and is aligned in parallel on the wind power curve, which is easily recognized by the system as normal data when the number of stacked data points is large. The third type of abnormal data is usually generated by wind power generation of the wind power generation set being lower than rated power due to a limit instruction or grid connection limitation of the wind power generation set. Because the instruction can last for a certain time, the abnormal data can appear in a continuous period of time, appear as a certain frequency in the original operation data, and are overlapped into a horizontal straight line on the wind power curve.
In practical application, after obtaining the operation data to be processed which does not include the first type of abnormal data and the second type of abnormal data, mapping the operation data to be processed into a three-dimensional image space, establishing a three-dimensional image with an abscissa of wind speed, an ordinate of wind power and an ordinate of data point frequency in the operation data to be processed according to a functional relation between wind speed and wind power, taking the three-dimensional image as a wind power curve image to be processed, further, processing the wind power curve image to be processed according to numerical characteristics of the third type of abnormal data and distribution characteristics of the third type of abnormal data on the image based on a corresponding algorithm, thereby determining the third type of abnormal data included in the operation data to be processed, and marking and filtering the third type of abnormal data.
Optionally, marking and filtering the third type of abnormal data included in the operation data to be processed based on the wind power curve image to be processed includes: processing the wind power curve image to be processed according to an edge detection algorithm to obtain a wind power curve mask image; determining a data positioning range of the third type of abnormal data in the power curve image to be processed according to the wind power curve mask image; and determining the third type of abnormal data based on the data positioning range, and marking and filtering it.
The edge detection algorithm may be an algorithm for extracting the boundary line between an object and the background in an image. Optionally, the edge detection algorithm may include the Roberts algorithm, the Sobel algorithm, the Laplacian algorithm, the Canny algorithm, and the like. In this embodiment, the edge detection algorithm may be any algorithm capable of implementing edge detection; preferably, the Canny algorithm may be used. It should be noted that adopting the Canny algorithm as the edge detection algorithm has the following advantages: the edge detection error rate is low, edge points are located accurately, and image noise can be filtered while edge characteristics are preserved.
The wind power curve mask image may be an image representing an approximate outline of a wind power curve in the wind power curve image to be processed. It should be understood by those skilled in the art that the mask image is a binary image composed of pixel values of 0 and 1, and in the field of image processing, the mask image may be used to extract a region of interest, adjust the pixel value of the region of interest to 1, and adjust the pixel value of the image outside the region of interest to 0, so as to obtain a mask image obtained by segmenting the region of interest in the image.
In practical application, after the wind power curve image to be processed is obtained, it can be processed based on an edge detection algorithm to extract the edge of the wind power curve in the image, thereby obtaining a wind power curve mask image that represents the outline of the wind power curve.
For example, taking the Canny algorithm as the edge detection algorithm, in order to improve the accuracy of edge recognition, a second-order Sobel operator is selected as the edge detection operator in this embodiment, and the processing of the wind power curve image to be processed based on the edge detection algorithm can be expressed by the following formulas:
\[ M_x[i,j] = S_x * I(i,j), \qquad M_y[i,j] = S_y * I(i,j) \]
\[ \psi[i,j] = \arctan\!\left( \frac{M_x[i,j]^2}{M_y[i,j]^2} \right) \]
wherein the subscripts x and y denote the x-axis direction and the y-axis direction respectively, S denotes the Sobel operator, * denotes convolution, I(i,j) denotes the pixel value at pixel [i,j], and M denotes the gradient magnitude.
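The directional gradients in the formulas above can be illustrated with a short sketch. It makes several assumptions that are not taken from the text: it uses the common 3x3 first-derivative Sobel kernels rather than the second-order operator mentioned above, it thresholds the usual magnitude sqrt(M_x^2 + M_y^2) to get a binary mask, and it omits the smoothing, non-maximum suppression, and hysteresis stages of a full Canny detector. The names `apply_kernel` and `edge_mask` are hypothetical.

```python
import math

# Common 3x3 Sobel kernels (an assumption; the text names a second-order operator).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over the image; border pixels are left at 0.

    This is cross-correlation. True convolution flips the kernel, which for
    Sobel only changes the sign of the result, not its magnitude.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            out[i][j] = sum(kernel[a][b] * image[i + a - 1][j + b - 1]
                            for a in range(3) for b in range(3))
    return out


def edge_mask(image, threshold):
    """Return a 0/1 mask marking pixels whose gradient magnitude reaches threshold."""
    mx = apply_kernel(image, SOBEL_X)
    my = apply_kernel(image, SOBEL_Y)
    return [[1 if math.hypot(mx[i][j], my[i][j]) >= threshold else 0
             for j in range(len(image[0]))] for i in range(len(image))]


# A 5x5 image with a horizontal step edge between rows 1 and 2.
step = [[0] * 5, [0] * 5, [1] * 5, [1] * 5, [1] * 5]
mask = edge_mask(step, threshold=4.0)
```

In the example, only the interior pixels of the two rows adjacent to the step respond, which is the kind of thin outline the mask image is meant to capture.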
Further, the wind power curve mask image is processed: according to the distribution characteristics of the third type of abnormal data in the image, the straight-line features present in the wind power curve mask image are detected based on a corresponding algorithm, so as to determine the data positioning range of the third type of abnormal data in the power curve image to be processed.
Optionally, determining a data positioning range of the third type of abnormal data in the power curve image to be processed according to the wind power curve mask image includes: processing the wind power curve mask image based on Hough transformation, and determining a target horizontal straight line included in the wind power curve mask image; determining a power curve boundary in the wind power curve mask image according to the wind power curve mask image and a pre-constructed edge polynomial function; and determining the data positioning range of the third type of abnormal data in the to-be-processed power curve image based on the target horizontal straight line and the power curve boundary.
The Hough transform is a method for detecting boundary shapes formed by discontinuous points; it fits straight lines and curves by transforming the image coordinate space into a parameter space. The Hough transform maps the image from the Cartesian coordinate system into a polar (ρ, θ) parameter space, discretizes that space, and thereby converts the detection problem in the image into a statistical problem of counting votes. In the Hough transform, each pixel in the Cartesian coordinate system is mapped to a curve in the polar parameter space.
For example, for any pixel in the wind power curve mask image, the Hough transform can be expressed by the following formula:
\[ \rho = x\cos\theta + y\sin\theta \]
wherein (x, y) is the coordinate of the pixel in the Cartesian coordinate system, and (ρ, θ) is the coordinate of the corresponding curve in the polar coordinate system.
Further, after the Hough transform has been performed on all pixels in the wind power curve mask image, all horizontal straight lines are extracted according to the horizontal distribution characteristics of the third type of abnormal data, that is, all lines whose inclination angle is 0 (which corresponds to a normal angle θ of 90° in the formula above), and the extracted horizontal straight lines can be used as the target horizontal straight lines.
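The voting procedure can be sketched in a few lines. This is an illustrative accumulator, not the patented implementation; `hough_accumulate`, `horizontal_lines`, `n_theta`, and `min_votes` are names and parameters assumed here. Under the normal form ρ = x cos θ + y sin θ, a horizontal line y = c has its normal at θ = 90° and ρ = c, so the sketch reads the horizontal votes from that angle bin.

```python
import math
from collections import Counter


def hough_accumulate(mask, n_theta=180):
    """Vote in discretized (rho, theta) space for every nonzero mask pixel."""
    votes = Counter()
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if not value:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, t)] += 1
    return votes


def horizontal_lines(mask, min_votes, n_theta=180):
    """Return the row positions (rho) of horizontal lines with enough votes."""
    votes = hough_accumulate(mask, n_theta)
    t_horizontal = n_theta // 2  # theta = 90 degrees: normal points along y
    return sorted(rho for (rho, t), count in votes.items()
                  if t == t_horizontal and count >= min_votes)


# A 6x8 mask with one full horizontal line (row 2) and one stray pixel.
mask = [[0] * 8 for _ in range(6)]
mask[2] = [1] * 8
mask[4][1] = 1
lines = horizontal_lines(mask, min_votes=5)
```

The vote threshold is what separates a long stacked line from isolated edge pixels; a suitable value would depend on image resolution and is left open here.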
In practical application, an edge curve of the wind power curve is needed so that the data area of the third type of abnormal data has a clear limit, which avoids filtering normal operation data by mistake when the third type of abnormal data is filtered. Therefore, once the wind power curve mask image is obtained, it can be processed according to a pre-constructed edge polynomial function to determine the power curve boundary in the wind power curve mask image.
Illustratively, the construction process of the edge polynomial function may be: selecting a plurality of edge pixel points (x, y) from the wind power curve mask image and establishing a piecewise first-degree interpolation polynomial; the edge polynomial function between two interpolation nodes (x_{n-1}, y_{n-1}) and (x_n, y_n) can be expressed as:
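As a hedged reconstruction, assuming the standard first-degree (linear) interpolation polynomial through the two nodes named above, with f_n used here as a hypothetical symbol for the segment on [x_{n-1}, x_n], the expression takes the form:

```latex
\[
f_n(x) = y_{n-1}\,\frac{x - x_n}{x_{n-1} - x_n}
       + y_n\,\frac{x - x_{n-1}}{x_n - x_{n-1}},
\qquad x_{n-1} \le x \le x_n
\]
```

This form passes through both nodes, since f_n(x_{n-1}) = y_{n-1} and f_n(x_n) = y_n, and joining the segments over consecutive node pairs gives a continuous piecewise linear boundary.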
Further, the data positioning range of the third type of abnormal data in the power curve image to be processed can be determined according to the target horizontal straight line and the power curve boundary. The data positioning range may be the data area occupied by the third type of abnormal data in the power curve image to be processed. Specifically, the area enclosed by the target horizontal straight line and the power curve boundary can be used as the data positioning range of the third type of abnormal data in the power curve image to be processed.
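One way to picture the enclosed area is to combine a piecewise linear boundary with a detected horizontal line. The sketch below works directly on (wind speed, power) pairs rather than pixels, and its membership rule is an assumption made for illustration: a point is flagged when its power lies within a tolerance of the line and strictly below the boundary at its wind speed. The names `interpolate`, `flag_stacked`, `line_power`, and `tolerance` are hypothetical.

```python
def interpolate(nodes, x):
    """Evaluate the piecewise linear interpolant through sorted (x, y) nodes."""
    if x <= nodes[0][0]:
        return nodes[0][1]
    for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return nodes[-1][1]  # beyond the last node, hold the last value


def flag_stacked(samples, boundary_nodes, line_power, tolerance):
    """Flag samples lying on the horizontal line and below the curve boundary.

    Assumption: 'enclosed by the line and the boundary' is read as
    |power - line_power| <= tolerance and power < boundary(speed).
    """
    flags = []
    for speed, power in samples:
        on_line = abs(power - line_power) <= tolerance
        below_boundary = power < interpolate(boundary_nodes, speed)
        flags.append(on_line and below_boundary)
    return flags


# Hypothetical boundary nodes and a horizontal line detected at 1000 kW.
boundary = [(3.0, 0.0), (8.0, 1000.0), (12.0, 2000.0)]
samples = [(8.0, 1000.0), (10.0, 1005.0), (11.0, 990.0), (10.0, 1450.0)]
flags = flag_stacked(samples, boundary, line_power=1000.0, tolerance=20.0)
```

The first sample sits on the boundary itself and is kept as normal, which reflects the stated purpose of the boundary: preventing normal operation data from being filtered along with the stacked points.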
Finally, the third type of abnormal data included in the operation data to be processed is determined according to the rule by which data points are mapped to the image, and is marked and filtered. The filtered data obtained at this point is taken as the target operation data, and the data positioning range corresponding to the third type of abnormal data is cleared from the power curve image to be processed.
In practical application, the target operation data of the wind turbine generator can be applied in various scenarios that analyze the operating condition of the wind turbine generator, so after the target operation data is obtained, it can be processed according to different application requirements.
Optionally, on the basis of the above technical solutions, the method further includes: and storing the target operation data to predict the wind power of the wind turbine generator within a preset time period after the current moment based on the target operation data.
In practical application, after the target operation data of the wind turbine generator is obtained, it can be stored in a corresponding storage space as historical operation data of the wind turbine generator. If the wind power of the wind turbine generator within a preset time period after the current moment is to be predicted, so that the scheduling of the power system can be planned reasonably and electric energy fully utilized, a training sample data set can be constructed from the pre-stored historical operation data. A model to be trained is then trained on the training sample data set and the label data corresponding to it to obtain a wind power prediction model, and the wind power of the wind turbine generator within the preset time period after the current moment can be predicted with that model.
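Constructing a training sample data set with labels from stored history can be sketched with a sliding window. The text does not specify the model or the sample layout, so everything here is an assumption for illustration: `make_training_samples`, `window` (number of past readings per sample), and `horizon` (how many steps ahead the label lies) are hypothetical.

```python
def make_training_samples(power_series, window, horizon):
    """Build (features, labels) pairs from a cleaned power time series.

    Each feature vector holds `window` consecutive readings; its label is
    the reading `horizon` steps after the end of that window.
    """
    features, labels = [], []
    for start in range(len(power_series) - window - horizon + 1):
        features.append(power_series[start:start + window])
        labels.append(power_series[start + window + horizon - 1])
    return features, labels


# Hypothetical cleaned history of wind power readings.
history = [10, 20, 30, 40, 50, 60]
features, labels = make_training_samples(history, window=3, horizon=1)
```

Any regression model could then be fitted to these pairs; the choice of model is outside what the text states.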
According to the technical scheme of this embodiment, original operation data of the wind turbine generator is acquired, and an original wind power curve image is generated based on the original operation data. The first type and/or second type of abnormal data included in the original operation data is then marked and filtered based on the original operation data and the original wind power curve image, yielding the operation data to be processed. Finally, a wind power curve image to be processed is generated based on the operation data to be processed, and the third type of abnormal data included in the operation data to be processed is marked and filtered based on that image, yielding the target operation data corresponding to the wind turbine generator. This addresses problems in the prior art when handling stacked abnormal data: large errors, abnormal data being overlooked, normal data being deleted by mistake, and dependence on reference images and prior knowledge. The scheme detects and marks the different types of abnormal data accurately, preserves the smoothness of the resulting wind power curve as far as possible, improves the efficiency and accuracy of abnormal data cleaning, and, without relying on a reference image, reduces the need for prior knowledge while distinguishing the various types of abnormal data.
Example two
Fig. 3 is a schematic structural diagram of an abnormal data cleaning device according to a second embodiment of the present invention. As shown in fig. 3, the device includes: a data acquisition module 210, an abnormal data filtering module 220, and a target operation data determining module 230.
The data acquisition module 210 is configured to acquire original operation data of the wind turbine generator; generating an original wind power curve image based on the original operation data;
the abnormal data filtering module 220 is configured to mark and filter first-class abnormal data and/or second-class abnormal data included in the original operation data based on the original operation data and the original wind power curve image, so as to obtain operation data to be processed;
the target operation data determining module 230 is configured to generate a wind power curve image to be processed based on the operation data to be processed, and mark and filter third type of abnormal data included in the operation data to be processed based on the wind power curve image to be processed, so as to obtain target operation data corresponding to the wind turbine generator.
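The way the three modules hand data to one another can be sketched as a small staged pipeline. This is a structural illustration only: the class `CleaningPipeline`, its field names, and the placeholder filters in the usage example are hypothetical and do not reproduce the marking conditions of the method.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Sample = Tuple[float, float]  # (wind_speed, wind_power)


@dataclass
class CleaningPipeline:
    """Chains the filtering stages in the order the modules are described."""
    filter_first_second: Callable[[List[Sample]], List[Sample]]
    filter_third: Callable[[List[Sample]], List[Sample]]

    def run(self, raw: List[Sample]) -> List[Sample]:
        # Stage 1: remove first/second type anomalies -> data to be processed.
        pending = self.filter_first_second(raw)
        # Stage 2: remove third type (stacked) anomalies -> target data.
        return self.filter_third(pending)


# Placeholder rules, purely hypothetical stand-ins for the real conditions.
pipeline = CleaningPipeline(
    filter_first_second=lambda data: [s for s in data if s[1] >= 0.0],
    filter_third=lambda data: [s for s in data if s != (10.0, 1000.0)],
)
target = pipeline.run([(5.0, 400.0), (6.0, -5.0), (10.0, 1000.0)])
```

Passing the stages in as callables keeps the ordering fixed while leaving each filtering rule replaceable.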
According to the technical scheme of this embodiment, original operation data of the wind turbine generator is acquired, and an original wind power curve image is generated based on the original operation data. The first type and/or second type of abnormal data included in the original operation data is then marked and filtered based on the original operation data and the original wind power curve image, yielding the operation data to be processed. Finally, a wind power curve image to be processed is generated based on the operation data to be processed, and the third type of abnormal data included in the operation data to be processed is marked and filtered based on that image, yielding the target operation data corresponding to the wind turbine generator. This addresses problems in the prior art when handling stacked abnormal data: large errors, abnormal data being overlooked, normal data being deleted by mistake, and dependence on reference images and prior knowledge. The scheme detects and marks the different types of abnormal data accurately, preserves the smoothness of the resulting wind power curve as far as possible, improves the efficiency and accuracy of abnormal data cleaning, and, without relying on a reference image, reduces the need for prior knowledge while distinguishing the various types of abnormal data.
Optionally, the abnormal data filtering module 220 includes: a first type abnormal data filtering unit.
The first type abnormal data filtering unit is used for marking and filtering the first type abnormal data included in the original operation data based on the original operation data and a predetermined first type abnormal data marking condition.
Optionally, the abnormal data filtering module 220 further includes: an original wind power curve image conversion unit and a second type abnormal data filtering unit.
The original wind power curve image conversion unit is used for carrying out color space conversion on the original wind power curve image to obtain an image to be processed;
and the second type abnormal data filtering unit is used for marking and filtering the second type abnormal data included in the original operation data based on the image to be processed.
Optionally, the second type of abnormal data filtering unit includes: a target pixel point determining subunit and a second type abnormal data filtering subunit.
A target pixel point determining subunit, configured to compare a pixel value of each pixel point in the image to be processed with a preset pixel value interval, and determine at least one target pixel point;
And the second type abnormal data filtering subunit is used for mapping each target pixel point into the original wind power curve image and marking, and taking the marked original operation data as the second type abnormal data and filtering.
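The comparison of pixel values against a preset interval can be sketched as follows. The text here does not name the target color space or the interval, so the sketch assumes an RGB-to-HSV conversion via the standard-library `colorsys` module and tests one HSV channel; `target_pixels`, `channel`, `low`, and `high` are hypothetical names and parameters.

```python
import colorsys


def target_pixels(rgb_image, channel, low, high):
    """Return (row, col) of pixels whose chosen HSV channel lies in [low, high].

    Assumptions: 8-bit RGB input, HSV as the target color space, and
    channel index 0 = hue, 1 = saturation, 2 = value (all in [0, 1]).
    """
    targets = []
    for i, row in enumerate(rgb_image):
        for j, (r, g, b) in enumerate(row):
            hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            if low <= hsv[channel] <= high:
                targets.append((i, j))
    return targets


# A hypothetical 2x2 image; only the dark pixel has value <= 0.1.
picture = [[(255, 255, 255), (10, 10, 10)],
           [(0, 0, 255), (200, 200, 200)]]
dark = target_pixels(picture, channel=2, low=0.0, high=0.1)
```

Each returned coordinate would then be mapped back to the original wind power curve image to mark the data points that produced it.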
Optionally, the target operation data determining module 230 includes: a mask image determining unit, a data positioning range determining unit, and a third type abnormal data determining unit.
The mask image determining unit is used for processing the wind power curve image to be processed according to an edge detection algorithm to obtain a wind power curve mask image;
the data positioning range determining unit is used for determining the data positioning range of the third type of abnormal data in the power curve image to be processed according to the wind power curve mask image;
and the third type abnormal data determining unit is used for determining the third type abnormal data based on the data positioning range and marking and filtering.
Optionally, the data positioning range determining unit includes: a wind power curve mask image processing subunit, a power curve boundary determining subunit, and a data positioning range determining subunit.
The wind power curve mask image processing subunit is used for processing the wind power curve mask image based on Hough transformation and determining a horizontal straight line included in the wind power curve mask image;
The power curve boundary determining subunit is used for determining a power curve boundary in the wind power curve mask image according to the wind power curve mask image and a pre-constructed edge polynomial function;
and the data positioning range determining subunit is used for determining the data positioning range of the third type of abnormal data in the to-be-processed power curve image based on the horizontal straight line and the power curve boundary.
Optionally, the device further includes: a wind power prediction module.
The wind power prediction module is used for storing the target operation data so as to predict the wind power of the wind turbine generator within a preset time period after the current moment based on the target operation data.
The abnormal data cleaning device provided by the embodiment of the invention can execute the abnormal data cleaning method provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to executing the method.
Example three
Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, such as the abnormal data cleaning method.
In some embodiments, the abnormal data cleaning method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the abnormal data cleaning method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the abnormal data cleaning method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (10)
1. An abnormal data cleaning method, comprising:
acquiring original operation data of a wind turbine generator, and generating an original wind power curve image based on the original operation data;
marking and filtering first-class abnormal data and/or second-class abnormal data included in the original operation data based on the original operation data and the wind power curve image to obtain operation data to be processed;
generating a wind power curve image to be processed based on the operation data to be processed, and marking and filtering third-class abnormal data included in the operation data to be processed based on the wind power curve image to be processed to obtain target operation data corresponding to the wind turbine generator.
2. The method of claim 1, wherein the marking and filtering the first type of anomaly data included in the raw operational data based on the raw operational data and the raw wind power curve image comprises:
marking and filtering the first type of abnormal data included in the original operation data based on the original operation data and a predetermined first type of abnormal data marking condition.
3. The method of claim 1, wherein the marking and filtering second type of anomaly data included in the raw operational data based on the raw operational data and the raw wind power curve image comprises:
performing color space transformation on the original wind power curve image to obtain an image to be processed;
and marking and filtering second-class abnormal data included in the original operation data based on the image to be processed.
4. The method according to claim 3, wherein said marking and filtering the second type of abnormal data included in the original operation data based on the image to be processed comprises:
comparing the pixel value of each pixel point in the image to be processed with a preset pixel value interval to determine at least one target pixel point;
and mapping each target pixel point into the original wind power curve image, marking, taking marked original operation data as the second type of abnormal data, and filtering.
5. The method according to claim 1, wherein the marking and filtering of the third type of abnormal data included in the operational data to be processed based on the wind power curve image to be processed includes:
processing the wind power curve image to be processed according to an edge detection algorithm to obtain a wind power curve mask image;
determining a data positioning range of the third type of abnormal data in the power curve image to be processed according to the wind power curve mask image;
and determining the third type of abnormal data based on the data positioning range, and marking and filtering.
6. The method according to claim 5, wherein determining a data positioning range of the third type of abnormal data in the power curve image to be processed according to the wind power curve mask image comprises:
processing the wind power curve mask image based on Hough transformation, and determining a horizontal straight line included in the wind power curve mask image;
determining a power curve boundary in the wind power curve mask image according to the wind power curve mask image and a pre-constructed edge polynomial function;
and determining the data positioning range of the third type of abnormal data in the to-be-processed power curve image based on the horizontal straight line and the power curve boundary.
7. The method as recited in claim 1, further comprising:
and storing the target operation data to predict the wind power of the wind turbine generator within a preset time period after the current moment based on the target operation data.
8. An abnormal data cleaning device, comprising:
the data acquisition module is used for acquiring the original operation data of the wind turbine generator; generating an original wind power curve image based on the original operation data;
the data filtering module is used for marking and filtering first-class abnormal data and/or second-class abnormal data included in the original operation data based on the original operation data and the original wind power curve image to obtain operation data to be processed;
the target operation data determining module is used for generating a wind power curve image to be processed based on the operation data to be processed, marking and filtering third-class abnormal data included in the operation data to be processed based on the wind power curve image to be processed, and obtaining target operation data corresponding to the wind turbine generator.
9. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the abnormal data cleaning method of any one of claims 1-7.
10. A computer readable storage medium storing computer instructions for causing a processor to implement the abnormal data cleaning method of any one of claims 1-7 when executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310313337.7A CN116304573A (en) | 2023-03-27 | 2023-03-27 | Abnormal data cleaning method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116304573A (en) | 2023-06-23 |
Family
ID=86814900
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202310313337.7A, published as CN116304573A (en) | Pending | 2023-03-27 | 2023-03-27 | Abnormal data cleaning method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116304573A (en) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112418722B (en) | Non-intrusive load identification method based on V-I (voltage-current) trajectory graph and neural network | |
CN111864896B (en) | Power load monitoring method and system | |
Wang et al. | Recognition and drop-off detection of insulator based on aerial image | |
CN113205063A (en) | Visual identification and positioning method for defects of power transmission conductor | |
CN107679495B (en) | Detection method for movable engineering vehicles around power transmission line | |
CN110763685B (en) | Artificial intelligent detection method and device for DFB semiconductor laser chip surface defects | |
Long et al. | An abnormal wind turbine data cleaning algorithm based on color space conversion and image feature detection | |
CN107179479A (en) | Transmission conductor loose-strand defect detection method based on visible-light images | |
CN116824517B (en) | Substation operation and maintenance safety control system based on visualization | |
CN116304798A (en) | Partial discharge type identification method, device, equipment and medium | |
Lin et al. | License plate recognition based on mathematical morphology and template matching | |
CN112131956A (en) | Voltage sag source classification method based on difference hash algorithm | |
Su et al. | Wind power curve data cleaning algorithm via image thresholding | |
CN115294352A (en) | Intelligent switch cabinet state identification system and method based on image identification | |
CN108734709B (en) | Insulator flange shape parameter identification and damage detection method | |
CN118172334A (en) | Power grid wiring diagram electrical component cascade detection method based on Transformer and residual convolution | |
CN112507956B (en) | Signal lamp identification method and device, electronic equipment, road side equipment and cloud control platform | |
CN116304573A (en) | Abnormal data cleaning method and device, electronic equipment and storage medium | |
CN111145109B (en) | Wind power generation power curve abnormal data identification and cleaning method based on image | |
CN112991308B (en) | Image quality determining method and device, electronic equipment and medium | |
Zhang et al. | The automatic identification method of switch state | |
Wan | Image Recognition Method of HOG Algorithm Based on Grid Cell Memory | |
Hu et al. | Extraction of high voltage transmission lines based on morphological processing | |
CN118570231A (en) | Image segmentation method and device, electronic equipment and storage medium | |
Zhang et al. | Anomaly detection method of substation equipment based on CenterNet |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||