CN117240217A - Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings - Google Patents

Publication number
CN117240217A
CN117240217A CN202311239309.1A CN202311239309A CN117240217A CN 117240217 A CN117240217 A CN 117240217A CN 202311239309 A CN202311239309 A CN 202311239309A CN 117240217 A CN117240217 A CN 117240217A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311239309.1A
Other languages
Chinese (zh)
Inventor
何磊
高超
张家前
周冰钰
方振宇
张锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sunshine Zhiwei Technology Co ltd
Original Assignee
Sunshine Zhiwei Technology Co ltd
Application filed by Sunshine Zhiwei Technology Co ltd
Priority to CN202311239309.1A
Publication of CN117240217A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02E: REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
    • Y02E10/00: Energy generation through renewable energy sources
    • Y02E10/50: Photovoltaic [PV] energy

Landscapes

  • Testing Of Individual Semiconductor Devices (AREA)

Abstract

The invention discloses a fault diagnosis method and device for a photovoltaic string, and a computer readable storage medium. The method comprises the following steps: determining a characteristic value of each group string to be diagnosed according to the current time sequence of that group string; performing multiple cluster analyses on the group strings to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories; determining a pseudo-positive class and a pseudo-negative class after each cluster analysis according to the group string current of the group strings to be diagnosed in each category; and determining a normal group string set according to the pseudo-positive classes and a fault group string set according to the pseudo-negative classes. Diagnosing faults from the current time sequence of each group string to be diagnosed improves the accuracy of the fault diagnosis result of the photovoltaic string.

Description

Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for diagnosing faults of a photovoltaic string, and a computer readable storage medium.
Background
A photovoltaic string is formed by connecting a plurality of photovoltaic modules in series in a given circuit arrangement to meet specific voltage and power requirements. A photovoltaic string combines the outputs of its modules, which increases the output voltage and power of the whole solar photovoltaic system. In actual operation, abnormal power generation of photovoltaic strings occurs frequently because of module quality problems and external environmental factors such as weather and temperature. In the traditional fault diagnosis approach, an unmanned aerial vehicle collects images of the photovoltaic string, and the fault diagnosis is made from the image analysis result. However, this approach can only detect whether the appearance of the photovoltaic string is defective. The detection approach is therefore limited, which lowers the accuracy of the fault diagnosis result of the photovoltaic string.
Disclosure of Invention
By providing a fault diagnosis method and device for a photovoltaic string and a computer readable storage medium, the embodiment of the application aims to diagnose faults of the photovoltaic string from the current time sequence of each group string to be diagnosed and to improve the accuracy of the fault diagnosis result.
The embodiment of the application provides a fault diagnosis method of a photovoltaic string, which comprises the following steps:
determining a characteristic value of each group string to be diagnosed according to the current time sequence of each group string to be diagnosed;
performing multiple cluster analyses on the group strings to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories;
determining a pseudo-positive class and a pseudo-negative class after each cluster analysis according to the group string current of the group strings to be diagnosed in each category after that cluster analysis;
and determining a normal group string set according to the pseudo-positive class after each cluster analysis, and determining a fault group string set according to the pseudo-negative class after each cluster analysis.
Optionally, the step of performing multiple clustering analysis on each group string to be diagnosed based on the feature value to obtain multiple clustering results includes:
in each cluster analysis, randomly selecting two group strings to be diagnosed from all group strings to be diagnosed as cluster centers, and treating the remaining group strings to be diagnosed as other group strings, wherein each cluster center represents one category;
determining distance values between each of the other group strings and each cluster center according to the characteristic values of the cluster centers and the characteristic values of the other group strings;
and determining the clustering result according to the distance values.
Optionally, the step of determining the clustering result according to the distance value includes:
selecting a minimum distance value from all distance values between the other groups of strings and each cluster center;
determining the class of the clustering center corresponding to the minimum distance value;
and classifying the other groups of strings into the class of the clustering center corresponding to the minimum distance value to obtain the clustering result.
Optionally, before the step of determining the pseudo-positive class and the pseudo-negative class after each cluster analysis according to the group string currents of the group strings to be diagnosed in each class after each cluster analysis, the method further includes:
determining the inter-cluster difference degree corresponding to each cluster analysis according to the characteristic values of the group strings to be diagnosed in each category after that cluster analysis;
identifying, among all the inter-cluster difference degrees, each inter-cluster difference degree that is larger than a preset difference degree;
deleting the clustering results whose inter-cluster difference degree is larger than the preset difference degree, and taking the remaining clustering results as the plurality of clustering results.
Optionally, the step of determining the pseudo-positive class and the pseudo-negative class after each cluster analysis according to the group string current of the group string to be diagnosed in each class after each cluster analysis includes:
for each cluster analysis, determining a group string current average value corresponding to each category after the cluster analysis according to the group string current of the group strings to be diagnosed in that category;
selecting the maximum group string current average value and the minimum group string current average value from the group string current average values corresponding to the categories after the cluster analysis;
and taking the category corresponding to the maximum group string current average value as the pseudo-positive class after the cluster analysis, and taking the category corresponding to the minimum group string current average value as the pseudo-negative class after the cluster analysis.
Optionally, the step of determining the normal group string set according to each pseudo-positive class after each cluster analysis and determining the fault group string set according to each pseudo-negative class after each cluster analysis includes:
and determining a normal group string set according to the intersection of each pseudo-positive class after each cluster analysis, and determining a fault group string set according to the intersection of each pseudo-negative class after each cluster analysis.
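The two optional steps above (ranking the two categories by average string current, then intersecting across clustering results) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the dictionary layout, and the choice to average each string's mean current within a category are all assumptions made for the example.

```python
from statistics import mean

def label_pseudo_classes(clusters, string_current):
    """Split one two-category clustering result into (pseudo_positive, pseudo_negative).

    `clusters` is a list of two lists of string ids and `string_current` maps a
    string id to its current samples; both layouts are hypothetical.
    """
    def category_mean(members):
        # Assumption: a category's average is the mean of its strings' mean currents.
        return mean(mean(string_current[sid]) for sid in members)

    ranked = sorted(clusters, key=category_mean)
    # Highest average current -> pseudo-positive, lowest -> pseudo-negative.
    return set(ranked[-1]), set(ranked[0])

def consensus_sets(clustering_results, string_current):
    """Intersect the pseudo classes over all clustering results."""
    labelled = [label_pseudo_classes(c, string_current) for c in clustering_results]
    normal = set.intersection(*(pos for pos, _ in labelled))
    fault = set.intersection(*(neg for _, neg in labelled))
    return normal, fault

# Illustrative data only.
string_current = {
    "s1": [5.0, 5.2], "s2": [5.1, 5.0], "s3": [3.9, 4.1],
    "s4": [1.0, 1.2], "s5": [0.9, 1.1],
}
clustering_results = [
    [["s1", "s2", "s3"], ["s4", "s5"]],
    [["s1", "s2"], ["s3", "s4", "s5"]],
    [["s4", "s5"], ["s1", "s2", "s3"]],
]
normal, fault = consensus_sets(clustering_results, string_current)
# Strings in neither set are left undecided for a later stage.
undetermined = set(string_current) - normal - fault
```

In this toy data, string s3 lands in different categories across runs, so the intersection keeps it out of both sets; only strings that every run agrees on are labelled.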
Optionally, the fault diagnosis method of the photovoltaic string further includes:
combining all to-be-diagnosed group strings except the normal group string set and the fault group string set into a to-be-determined group string set;
determining local subspaces from the normal group string set and the fault group string set according to the characteristic values of each group string to be determined in the to-be-determined group string set;
generating a local subspace distance set according to the distance between the local subspace and each string to be determined in the string set to be determined;
obtaining an evaluation result based on the local subspace distance set, and determining a target to-be-determined group string set according to the evaluation result;
determining a local distance matrix according to the to-be-determined group string set and the target to-be-determined group string set;
and determining the category of each group string to be determined in the group string set to be determined according to the local distance matrix.
Optionally, the step of determining the local subspace from the normal group string set and the fault group string set according to the characteristic value of each group string to be determined in the group string set to be determined includes:
determining the number and the dimension of the feature space;
mapping the characteristic values of each group string to be judged in the group string set to be judged to each characteristic space;
in each feature space, determining the distance between each group string to be determined in the to-be-determined group string set and each photovoltaic group string in the normal group string set and the fault group string set, according to the characteristic value of each group string to be determined and the characteristic value of each photovoltaic group string in the normal group string set and the fault group string set;
selecting, from the normal group string set, the photovoltaic group string closest to each group string to be determined to form a first neighbor group string set, and selecting, from the fault group string set, the photovoltaic group string closest to each group string to be determined to form a second neighbor group string set;
determining a photovoltaic group string with the occurrence frequency larger than a preset frequency in the first neighbor group string set as a positive local subspace, and determining a photovoltaic group string with the occurrence frequency larger than the preset frequency in the second neighbor group string set as a negative local subspace, wherein the preset frequency is half of the number of the feature spaces;
and taking the positive local subspace and the negative local subspace as the local subspace.
Optionally, the step of obtaining an evaluation result based on the local subspace distance set and determining the target to-be-determined group string set according to the evaluation result includes:
determining a lower limit value and an upper limit value based on the local subspace distance set;
when the distances in the local subspace distance set lie between the lower limit value and the upper limit value, determining that the evaluation result is a pass, and combining the photovoltaic group strings in the local subspace into the target to-be-determined group string set;
and when the local subspace distance set contains a distance that is larger than the upper limit value or smaller than the lower limit value, determining that the evaluation result is a fail, screening the photovoltaic group strings in the local subspace, and taking the screened local subspace as the target to-be-determined group string set.
Optionally, the step of screening the photovoltaic string in the local subspace, and taking the screened local subspace as the target string set to be determined includes:
generating a set to be removed according to the photovoltaic group strings with the distance larger than the upper limit value or smaller than the lower limit value in the local subspace, and generating a set to be screened according to the photovoltaic group strings except the set to be removed in the local subspace;
determining the distance between each photovoltaic group string in the set to be screened and each photovoltaic group string in the set to be rejected;
and if the number of photovoltaic group strings whose distance is smaller than a preset distance is larger than a preset number, removing the photovoltaic group strings corresponding to the distances smaller than the preset distance from the set to be screened, and taking the set to be screened after removal as the target to-be-determined group string set.
Optionally, the step of determining the local distance matrix according to the to-be-determined group string set and the target to-be-determined group string set includes:
determining a positive subspace group string set and a negative subspace group string set according to the target to-be-determined group string set;
determining a first distance between each photovoltaic string in the positive subspace string set and each string to be determined in the string set to be determined, and determining a second distance between each photovoltaic string in the negative subspace string set and each string to be determined in the string set to be determined;
generating a positive local distance matrix according to each first distance, and generating a negative local distance matrix according to each second distance;
and determining the positive local distance matrix and the negative local distance matrix as the local distance matrix.
Optionally, the step of determining the category of each string to be determined in the string set to be determined according to the local distance matrix includes:
determining a first maximum distance of each row in the positive local distance matrix, generating a positive pseudo-distance vector of each row according to the first maximum distance of each row, and determining a positive pseudo-distance vector average value according to each positive pseudo-distance vector; and
determining a second maximum distance of each row in the negative local distance matrix, generating a negative pseudo-distance vector of each row according to the second maximum distance of each row, and determining a negative pseudo-distance vector average value according to each negative pseudo-distance vector;
and determining the class corresponding to the minimum mean value in the positive pseudo distance vector mean value and the negative pseudo distance vector mean value as the class of the group string to be determined in the group string set to be determined.
In addition, to achieve the above object, the present application also provides a fault diagnosis device for a photovoltaic string, including: a memory, a processor, and a fault diagnosis program of the photovoltaic string that is stored in the memory and can run on the processor, wherein the fault diagnosis program, when executed by the processor, implements the steps of the fault diagnosis method of the photovoltaic string described above.
In addition, to achieve the above object, the present application also provides a storage medium having stored thereon a fault diagnosis program of a photovoltaic string, which, when executed by a processor, implements the steps of the fault diagnosis method of the photovoltaic string described above.
In the technical scheme of the fault diagnosis method, the device and the computer readable storage medium for the photovoltaic string provided by the embodiment of the application, cluster analysis is performed on the group strings to be diagnosed based on the characteristic values of their current time sequences, and the clustering result of each cluster analysis is obtained, so that whether a photovoltaic string is faulty during actual operation can be diagnosed. To improve the accuracy of the diagnosis result, the pseudo-positive class and the pseudo-negative class are distinguished after each clustering result is obtained, the normal group string set is determined according to the pseudo-positive class after each cluster analysis, and the fault group string set is determined according to the pseudo-negative class after each cluster analysis. This prevents group strings to be diagnosed from being wrongly assigned to the normal group string set or the fault group string set, and achieves accurate identification and diagnosis of normal group strings and fault group strings.
Drawings
FIG. 1 is a schematic flow chart of a first embodiment of a method for diagnosing faults in a photovoltaic string according to the present application;
FIG. 2 is a schematic flow chart of a second embodiment of a method for diagnosing faults in a photovoltaic string according to the present application;
FIG. 3 is a schematic flow chart of a third embodiment of a method for diagnosing faults in a photovoltaic string according to the present application;
FIG. 4 is a flowchart of a fourth embodiment of a method for diagnosing faults in a photovoltaic string according to the present application;
FIG. 5 is a schematic flow chart of a fifth embodiment of a fault diagnosis method for a photovoltaic string according to the present application;
FIG. 6 is a schematic structural diagram of a hardware running environment according to an embodiment of the present application.
The achievement of the objects, the functional features and the advantages of the present application will be further described with reference to the embodiments and the accompanying drawings, which illustrate only some embodiments of the application, not all of them.
Detailed Description
A photovoltaic string is formed by connecting a plurality of photovoltaic modules in series in a given circuit arrangement to meet specific voltage and power requirements. A photovoltaic string combines the outputs of its modules, which increases the output voltage and power of the whole solar photovoltaic system. In actual operation, abnormal power generation of photovoltaic strings occurs frequently because of module quality problems and external environmental factors such as weather and temperature. For example, quality problems can cause surface discoloration of a photovoltaic string, which reduces power generation efficiency. Likewise, in severe weather or on complex terrain, a photovoltaic string is very likely to be damaged by external force, so that its power generation efficiency drops suddenly. Real-time fault diagnosis of photovoltaic strings is therefore necessary. In the traditional fault diagnosis approach, an unmanned aerial vehicle collects images of the photovoltaic string, and the fault diagnosis is made from the image analysis result. However, this approach only detects whether the appearance of the photovoltaic string is defective and judges from that whether the string is faulty. A photovoltaic string may also fail because of device damage or other internal factors that cannot be analyzed from an image. The fault detection approach of the related art is therefore limited, which lowers the accuracy of the fault diagnosis result of the photovoltaic string.
To address the above technical problems, the application provides a novel fault diagnosis method for a photovoltaic string, which comprises: determining a characteristic value of each group string to be diagnosed according to the current time sequence of each group string to be diagnosed; performing multiple cluster analyses on the group strings to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories; determining a pseudo-positive class and a pseudo-negative class after each cluster analysis according to the group string current of the group strings to be diagnosed in each category after that cluster analysis; and determining a normal group string set according to the pseudo-positive class after each cluster analysis, and determining a fault group string set according to the pseudo-negative class after each cluster analysis. Because cluster analysis is performed on the group strings to be diagnosed based on the characteristic values of their current time sequences, and the clustering result of each cluster analysis is obtained, whether a photovoltaic string is faulty during actual operation can be diagnosed. To improve the accuracy of the diagnosis result, the pseudo-positive class and the pseudo-negative class are distinguished after each clustering result is obtained, the normal group string set is determined according to the pseudo-positive class after each cluster analysis, and the fault group string set is determined according to the pseudo-negative class after each cluster analysis. This prevents group strings to be diagnosed from being wrongly assigned to the normal group string set or the fault group string set, and achieves accurate identification and diagnosis of normal group strings and fault group strings.
In addition, compared with the prior art, in which fault diagnosis of photovoltaic strings is carried out by unmanned aerial vehicle inspection, the application only needs the current time sequence of each group string to be diagnosed, which is convenient and fast and reduces the cost of fault diagnosis.
In order to better understand the above technical solution, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in fig. 1, in a first embodiment of the present application, the fault diagnosis method of the photovoltaic string of the present application is applied to a fault diagnosis device of the photovoltaic string, where the fault diagnosis device of the photovoltaic string may be a smart phone, a computer, or other devices with data processing functions. The fault diagnosis method of the photovoltaic string comprises the following steps:
Step S110, determining the characteristic value of each group string to be diagnosed according to the current time sequence of each group string to be diagnosed.
In this embodiment, the group strings to be diagnosed may be selected manually, for example all or some of the photovoltaic strings in a given photovoltaic power station. Alternatively, an unmanned aerial vehicle may collect images of the photovoltaic strings, the images may be analyzed to detect whether the appearance of a photovoltaic string is defective, and the photovoltaic strings with a defective appearance may be selected as group strings to be diagnosed. Photovoltaic strings that operate within a certain period may also be determined as group strings to be diagnosed, for example photovoltaic strings that operate during a period of rain, or photovoltaic strings that operate during a period in which the temperature reaches a preset temperature. The group strings to be diagnosed may also be selected in other ways.
The current time sequence is a current signal that changes over time, and it helps in understanding current behavior, analyzing the performance and faults of a circuit, evaluating power quality, and so on. The current time sequence may be collected and recorded by various means, including sensors, oscilloscopes and data acquisition cards. Generally, the current time sequence is shown as a time-domain waveform, where the horizontal axis represents time and the vertical axis represents the magnitude of the current. Each group string to be diagnosed has a corresponding current time sequence. The current time sequence may be a historical current time sequence of the group string to be diagnosed: during actual operation, each group string to be diagnosed may upload its own group string current to a local database or the cloud in real time, and when fault diagnosis is carried out later, the historical current time sequence is obtained from the local database or the cloud for fault analysis. For example, the hour-level string currents of all branches on overcast and rainy days over the past month are obtained, with the current data taken from 9:00 to 15:00.
The characteristic value is obtained by extracting current features from the current time sequence. The characteristic values of the present application include statistical characteristic values and time-series characteristic values. The statistical characteristic values comprise the maximum value, the minimum value, the average value, the variance and the peak distance of the current time sequence, wherein the variance is the average of the squared differences between each point in the current time sequence and the average value, and the peak distance is the maximum value of the current time sequence minus its minimum value. The time-series characteristic values include a lag correlation, sliding-window values, differential characteristic values, and the like. For example, the lag correlation is the correlation between the last day and the first day of the current time sequence; the sliding window has a length of 3, and the average value, the maximum value, the minimum value and the standard deviation within the window are computed; the differential characteristic values are obtained by computing the first-order difference of the current time sequence and then the average value, the maximum value, the minimum value and the standard deviation of the differenced sequence.
After the current time sequences of the group strings to be diagnosed are obtained, each current time sequence is analyzed statistically to obtain the characteristic values of the corresponding group string to be diagnosed. Extracting the characteristic values of each group string to be diagnosed facilitates the subsequent cluster analysis, by which the group strings to be diagnosed are classified.
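As a rough illustration of the feature extraction described above, the sketch below computes the statistical characteristic values and the first-order differential characteristic values for one current time sequence. The function names and the flat dictionary output are assumptions made for the example; the lag correlation and sliding-window values are left out because the text does not specify how they are aggregated into a fixed-length feature vector.

```python
from statistics import mean, pstdev, pvariance

def summarize(values, prefix):
    """Mean, max, min and (population) standard deviation of a sequence."""
    return {
        f"{prefix}_mean": mean(values),
        f"{prefix}_max": max(values),
        f"{prefix}_min": min(values),
        f"{prefix}_std": pstdev(values),
    }

def extract_features(current):
    """Feature dictionary for one string's current time sequence (hypothetical layout)."""
    features = {
        "max": max(current),
        "min": min(current),
        "mean": mean(current),
        # Variance as described: average squared deviation from the mean.
        "variance": pvariance(current),
        # Peak distance: maximum minus minimum.
        "peak_distance": max(current) - min(current),
    }
    # First-order difference of the sequence, then its summary statistics.
    diff = [b - a for a, b in zip(current, current[1:])]
    features.update(summarize(diff, "diff"))
    return features

features = extract_features([1.0, 3.0, 2.0, 6.0])
```

The population variance (`pvariance`) matches the definition given in the text; using the population standard deviation for the differenced sequence is an assumption, since the text does not say which estimator is meant.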
In other embodiments, after the current time sequence of each group string to be diagnosed is obtained, data preprocessing is performed on it, and the characteristic value of each group string to be diagnosed is determined from the preprocessed current time sequence. The preprocessing mainly handles noise data such as stuck (dead) values, communication interruptions and out-of-limit values, so as to reduce the interference of noise data with the characteristic values obtained in later analysis. The data preprocessing includes data filtering, data filling, data alignment, and the like. For example, the data filtering includes: filtering out strings whose values are all zero, strings that hold a constant value for 6 consecutive hours, strings with out-of-limit values (greater than the rated current value), unconnected strings, and so on. For example, the data filling and data alignment include: filling with the mean when only a few points (no more than 10% of the number of data points) are missing or out of limit; and taking the mode of the number of complete-data days across the processed data as the length of the current time sequence used in later calculation, deleting a string's data if its number of complete-data days is smaller than the mode, and, if its number is larger than the mode, keeping only the most recent days equal in number to the mode.
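The filtering and filling rules above can be sketched for a single string as follows. This is only one possible reading: the text lists "out-of-limit strings" under filtering and "few out-of-limit points" under mean filling, and the sketch assumes the two combine through the 10% threshold. The function names, the use of `None` for missing samples, and hourly sampling are assumptions; the unconnected-string filter and the day-count alignment are not shown.

```python
from statistics import mean

def has_constant_run(series, run_length=6):
    """True if the same non-missing value repeats for `run_length` consecutive samples."""
    run = 1
    for prev, cur in zip(series, series[1:]):
        run = run + 1 if (cur is not None and cur == prev) else 1
        if run >= run_length:
            return True
    return False

def preprocess(series, rated_current, max_invalid_ratio=0.10):
    """Return the cleaned series, or None if the string should be filtered out.

    `series` holds hourly current samples with None for missing points
    (a hypothetical encoding).
    """
    valid = [x for x in series if x is not None and x <= rated_current]
    if not valid or all(x == 0 for x in valid):
        return None  # all-zero (or empty) string
    if has_constant_run(series):
        return None  # stuck value for 6 consecutive hours
    if len(series) - len(valid) > max_invalid_ratio * len(series):
        return None  # too many missing or out-of-limit points to repair
    fill = mean(valid)
    # Replace the few missing or out-of-limit points with the mean of the valid ones.
    return [x if (x is not None and x <= rated_current) else fill for x in series]

cleaned = preprocess([4.0, 4.2, None, 4.1, 3.9, 4.0, 4.3, 4.1, 4.2, 4.0], rated_current=10.0)
over_limit = preprocess([4.0, 4.2, 99.0, 4.1, 3.9, 4.0, 4.3, 4.1, 4.2, 4.0], rated_current=10.0)
all_zero = preprocess([0.0] * 10, rated_current=10.0)
stuck = preprocess([5.0] * 10, rated_current=10.0)
sparse = preprocess([None, None, 4.0, 4.1, 4.2, 4.0, 4.3, 4.1, 4.2, 4.0], rated_current=10.0)
```

Returning `None` for a rejected string keeps the caller's loop simple: strings that come back as `None` are dropped before feature extraction.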
Step S120, performing multiple cluster analyses on the group strings to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories.
In this embodiment, after the characteristic values of the group strings to be diagnosed are obtained, a cluster analysis algorithm is applied to the group strings to be diagnosed based on the characteristic values to obtain a clustering result. Cluster analysis is an unsupervised learning method that divides the group strings to be diagnosed into groups, or clusters, with similar features. Optionally, the cluster analysis algorithm adopted by the application may be K-means clustering, hierarchical clustering, density clustering, or the like, and one of them may be selected according to actual conditions. In the application, the K-means clustering algorithm is taken as an example, the K value is set to 2, and the group strings to be diagnosed are clustered into two categories.
Because a single iteration may not yield an optimal clustering result, the group strings to be diagnosed are iteratively re-optimized during cluster analysis. Each iteration assigns every group string to be diagnosed to the cluster of its nearest cluster center, which is how the group strings to be diagnosed are classified. Each iteration produces a corresponding clustering result, which contains the category of each group string to be diagnosed after that iteration.
Step S130, according to the group string current of the group string to be diagnosed in each category after each cluster analysis, determining the pseudo positive class and the pseudo negative class after each cluster analysis.
In this embodiment, a pseudo-positive class is formed when the clustering algorithm places a group string to be diagnosed that truly belongs to the negative class into the positive cluster, that is, when a group string that does not belong to the positive class is classified as positive. Conversely, a pseudo-negative class is formed when the clustering algorithm fails to cluster the true positive-class group strings together and instead disperses some of them into other clusters, so that they are misclassified as negative; this means that the clustering algorithm fails to capture the correlation among all positive-class group strings to be diagnosed. Therefore, after each cluster analysis, the application also distinguishes and analyzes the pseudo-negative class and the pseudo-positive class, which improves the recognition precision of fault group strings and normal group strings and reduces the influence of the defects of the clustering algorithm on the fault diagnosis result.
Step S140, determining a normal group string set according to each pseudo-positive class after each cluster analysis, and determining a fault group string set according to each pseudo-negative class after each cluster analysis.
Optionally, after obtaining the pseudo-positive class and the pseudo-negative class after each cluster analysis, determining a normal group string set according to the intersection between the pseudo-positive classes after each cluster analysis, and determining a fault group string set according to the intersection between the pseudo-negative classes after each cluster analysis.
The intersection of the pseudo-positive classes and the intersection of the pseudo-negative classes refer to two subsets formed by the group strings to be diagnosed whose predicted result is inconsistent with the real label in a two-class problem. The pseudo-positive class refers to the group strings to be diagnosed that are erroneously judged as positive, i.e., the model predicts them as positive but their real label is negative. The intersection of the pseudo-positive classes refers to the set of group strings to be diagnosed that are judged as positive by all of the different models or algorithms; here, the intersection of the pseudo-positive classes is the normal group string set. The pseudo-negative class refers to the group strings to be diagnosed that are erroneously judged as negative, i.e., the model predicts them as negative but their real label is positive. The intersection of the pseudo-negative classes refers to the set of group strings to be diagnosed that are judged as negative by all of the different models or algorithms; here, the intersection of the pseudo-negative classes is the fault group string set.
The intersection of the pseudo-positive class and the intersection of the pseudo-negative class may provide some information about the performance of the classification model. By analyzing these intersections, it is possible to know which strings of groups to diagnose are challenging for the model, i.e., are prone to misclassification. The strings of the set to be diagnosed in these intersections may have special features, making it difficult for the model to accurately classify them into the correct categories. Analyzing the intersection of the pseudo-positive and pseudo-negative classes may help improve classification models, optimize feature selection, adjust model parameters, or use more complex algorithms to reduce misclassification. In addition, analysis of the intersection sets also facilitates evaluation and comparison of the models, understanding differences in their performance across different categories, and further improving the accuracy and stability of the models. The application determines the normal group string set through the intersection of each pseudo-positive class and determines the fault group string set according to the intersection of each pseudo-negative class, thereby reducing the error influence of the defects of the clustering algorithm on the diagnosis result and improving the accuracy of the fault diagnosis and the fault identification result.
According to the technical scheme, the cluster analysis can be performed on each group string to be diagnosed based on the characteristic value of the current time sequence of each group string to be diagnosed, so that the cluster result of each cluster analysis is obtained, and whether the actual operation process of the photovoltaic group string fails can be diagnosed. In order to improve accuracy of the diagnosis result, after the clustering result is obtained each time, distinguishing pseudo positive classes from pseudo negative classes, determining a normal group string set according to each pseudo positive class after each clustering analysis, and determining a fault group string set according to each pseudo negative class after each clustering analysis, so that the group strings to be diagnosed are prevented from being wrongly classified into the normal group string set and the fault group string set, and accurate identification and diagnosis of the normal group strings and the fault group strings are realized.
Further, referring to fig. 2, in the second embodiment of the present application, the step S120 of the present application specifically includes the steps of:
Step S121, when performing cluster analysis, randomly selecting two group strings to be diagnosed from all the group strings to be diagnosed as cluster centers, and regarding the remaining group strings to be diagnosed as other groups of strings, wherein each cluster center represents a category;
In the embodiment, the clustering algorithm is adopted to perform clustering analysis on each group string to be diagnosed, so as to obtain a clustering result. The application takes a clustering algorithm as a K-means clustering algorithm as an example. For center-based algorithms such as K-means clustering, an initial cluster center needs to be initialized. The initial cluster center may randomly select samples or use other heuristics. In each clustering analysis, the application randomly selects two groups of strings to be diagnosed from all groups of strings to be diagnosed as clustering centers to perform clustering analysis.
Step S122, determining distance values between the other groups of strings and each cluster center according to the feature values of the cluster centers and the feature values of the other groups of strings.
In this embodiment, in the process of performing cluster analysis, a similarity measurement index needs to be determined. First, an appropriate similarity measure is selected to measure the similarity between samples, such as euclidean distance, manhattan distance, correlation coefficient, etc. These metrics may be selected based on the nature of the problem and the type of data. The present application takes Euclidean distance, i.e. distance value, as an example.
Step S123, determining the clustering result according to the distance value.
In this embodiment, the clustering result is optimized iteratively. The algorithm calculates the distance between each of the other groups of strings and each cluster center according to the similarity measurement index, and assigns each of them to the cluster of the nearest cluster center. The positions of the cluster centers are then updated according to the new division, and the process is repeated until a convergence condition is reached. The convergence condition includes, but is not limited to: the within-cluster sum of squares reaching its minimum, or the number of iterations reaching a preset number.
Optionally, selecting a minimum distance value from all distance values between the other group of strings and each cluster center; determining the class of the clustering center corresponding to the minimum distance value; and classifying the other groups of strings into the class of the clustering center corresponding to the minimum distance value to obtain the clustering result.
Illustratively, assume that the number of cluster centers is set to 2. The procedure is as follows:
(1) Randomly select two group strings to be diagnosed as cluster centers, and take their characteristic values as the centroids of the cluster centers.
(2) For every other group string, calculate the Euclidean distances to the two cluster centers, and assign the group string to the category of the cluster center with the minimum distance value.
(3) Update the centroid of each cluster center: the mean of the characteristic values of the group strings currently in the category is taken as the new centroid value.
Steps (2) and (3) are repeated until a set number of iterations is reached, at which point the iteration of the clustering algorithm ends. In this way, the application classifies each group string to be diagnosed through the cluster analysis algorithm.
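Steps (1) to (3) can be sketched in Python as below. This is a minimal sketch of two-cluster K-means with a fixed iteration count; the function names `euclidean` and `two_means`, the default of 20 iterations and the `seed` parameter are illustrative assumptions and are not taken from the application.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def two_means(features, iterations=20, seed=None):
    """Cluster feature vectors into two categories; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Step (1): two randomly chosen strings serve as the initial centroids.
    centroids = [list(f) for f in rng.sample(features, 2)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Step (2): assign every string to the category of the nearest centroid.
        labels = [min((0, 1), key=lambda c: euclidean(f, centroids[c]))
                  for f in features]
        # Step (3): move each centroid to the mean of its current members.
        for c in (0, 1):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # keep the old centroid if a category becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Because the initial centers are drawn at random, repeated calls with different seeds can yield different divisions, which is why the method collects several clustering results rather than relying on a single run.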
Optionally, after the operation of the clustering algorithm is finished, the quality of the clustering result can be evaluated. Common evaluation methods include indicators of similarity in clusters, differences between clusters, and the like. In addition, the clustering result can be visually displayed through a visual means. It is noted that cluster analysis is an unsupervised learning method, without predefined class labels. Therefore, the cluster analysis is mainly used for exploratory data analysis, pattern discovery, feature extraction and other tasks, and helps to understand the structure and characteristics of the data set. When cluster analysis is applied, proper similarity measurement indexes and a clustering algorithm are required to be selected according to specific problems, and proper parameter tuning and result evaluation are required to be performed.
Further, referring to fig. 3, in a third embodiment of the present application, before step S130, the method further includes the steps of:
Step S210, determining the inter-cluster difference degree corresponding to each cluster analysis according to the characteristic values of the group strings to be diagnosed in each category after each cluster analysis.
In this embodiment, the degree of difference between each category after each cluster analysis, that is, the degree of difference between clusters, is calculated according to the feature value of the group string to be diagnosed in each category after each cluster analysis. Specifically, in each cluster analysis process, the euclidean distance between two classes can be calculated according to the characteristic value of the group string to be diagnosed in each class after the cluster analysis, wherein the euclidean distance is used for representing the difference degree between clusters, when the euclidean distance is larger, the difference degree between clusters is larger, and otherwise, when the euclidean distance is smaller, the difference degree between clusters is smaller.
Step S220, identifying, among all the inter-cluster difference degrees, each inter-cluster difference degree that is greater than a preset difference degree;
Step S230, deleting the clustering result corresponding to each inter-cluster difference degree that is greater than the preset difference degree, and taking the remaining clustering results as the plurality of clustering results.
In this embodiment, whether to retain the clustering result of each cluster analysis may be determined according to the inter-cluster difference degree corresponding to the cluster analysis. If the clustering results corresponding to the clusters with larger difference are adopted for subsequent analysis, the accuracy of the diagnosis analysis results is reduced. Therefore, the application sets a preset difference according to the actual situation, and deletes the clustering result corresponding to the difference between clusters when the difference between clusters is larger than the preset difference, thereby improving the accuracy of the diagnosis analysis result. When the difference is smaller than or equal to the preset difference, the difference of the group strings to be diagnosed in the two categories is smaller, and the subsequent operation of generating the local subspace can be performed.
Illustratively, for the two categories, the Euclidean distance between the characteristic values of the two category centroids C_1 and C_2 is calculated and taken as the inter-cluster difference degree, namely:
$$D(C_1, C_2) = \sqrt{\sum_{k=1}^{n} \left(C_{1,k} - C_{2,k}\right)^2}$$
where n is the number of features, and C_{1,k} and C_{2,k} are the kth feature values of C_1 and C_2, respectively.
After the calculation is completed, it is detected whether the inter-cluster difference degree exceeds the preset difference degree. The inter-cluster difference degree of each pair of positive and negative classes is calculated, and the mean and the variance of the N inter-cluster difference degrees are calculated; when an inter-cluster difference degree is greater than the preset difference degree, the corresponding clustering result is removed from the N clustering results. When the inter-cluster difference degree is less than or equal to the preset difference degree, the pseudo-positive class and the pseudo-negative class after each cluster analysis are determined according to the group string current of the group strings to be diagnosed in each category after each cluster analysis, a normal group string set is determined according to each pseudo-positive class after each cluster analysis, and a fault group string set is determined according to each pseudo-negative class after each cluster analysis.
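As a sketch of steps S210 to S230, the following Python computes the inter-cluster difference degree as the Euclidean distance between the two centroids and keeps only the clustering results that do not exceed the preset difference degree. The names `inter_cluster_difference` and `keep_results` and the layout of one clustering result as a `(labels, (c1, c2))` pair are assumptions for illustration; how the preset difference degree itself is chosen is left open here, as in the text.

```python
import math

def inter_cluster_difference(c1, c2):
    """Euclidean distance between the two category centroids."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def keep_results(results, preset_difference):
    """Drop clustering results whose inter-cluster difference exceeds the preset.

    results: list of (labels, (c1, c2)) pairs, one per cluster analysis (assumed layout).
    """
    return [r for r in results
            if inter_cluster_difference(*r[1]) <= preset_difference]
```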
Optionally, the step of determining the pseudo-positive class and the pseudo-negative class after each cluster analysis according to the group string current of the group strings to be diagnosed in each category after each cluster analysis includes: for each cluster analysis, determining the group string current average value corresponding to each category after the cluster analysis according to the group string current of the group strings to be diagnosed in that category; selecting the maximum group string current average value and the minimum group string current average value from the group string current average values corresponding to the categories after the cluster analysis; and taking the category corresponding to the maximum group string current average value as the pseudo-positive class after the cluster analysis, and taking the category corresponding to the minimum group string current average value as the pseudo-negative class after the cluster analysis.
For example, at each cluster analysis, the group string current of the group strings to be diagnosed in each category after the cluster analysis is obtained, and the group string current average value corresponding to each category is calculated from it. The group string current average values of the two categories are compared; the category with the larger average value is determined as the pseudo-positive class after the cluster analysis, and the category with the smaller average value is determined as the pseudo-negative class after the cluster analysis. Applying this operation to each cluster analysis yields the corresponding pseudo-negative class and pseudo-positive class after each cluster analysis. Which of the group strings to be diagnosed are fault group strings and which are normal group strings is then determined according to the pseudo-positive class and the pseudo-negative class after each cluster analysis, thereby achieving the purpose of fault diagnosis.
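The labelling by mean current and the intersections across cluster analyses can be sketched as follows. The function names, the dictionary layouts and the use of a mean of per-string means as the category average are illustrative assumptions; the sketch also assumes that both categories are non-empty in every run. Strings that fall in neither intersection are returned separately as undecided.

```python
from statistics import mean

def label_run(labels, string_currents):
    """Return (pseudo_positive_ids, pseudo_negative_ids) for one cluster analysis.

    labels: {string_id: 0 or 1}; string_currents: {string_id: [currents]}.
    """
    groups = {0: set(), 1: set()}
    for sid, lab in labels.items():
        groups[lab].add(sid)
    # Average current of each category (mean of per-string means, an assumption).
    avg = {lab: mean(mean(string_currents[s]) for s in ids)
           for lab, ids in groups.items()}
    hi = max(avg, key=avg.get)
    return groups[hi], groups[1 - hi]

def diagnose(runs, string_currents):
    """Intersect the pseudo-positive and pseudo-negative classes over all runs."""
    labelled = [label_run(r, string_currents) for r in runs]
    normal = set.intersection(*(p for p, _ in labelled))
    fault = set.intersection(*(n for _, n in labelled))
    undecided = set(string_currents) - normal - fault
    return normal, fault, undecided
```

In a toy case with two runs, a string that is grouped with the high-current category in one run and with the low-current category in the other ends up in neither intersection and is therefore left undecided.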
Further, referring to fig. 4, based on any of the first to third embodiments, in a fourth embodiment of the present application, after step S140, the method further includes the steps of:
Step S310, combining the group strings to be diagnosed, among all the group strings to be diagnosed, that belong to neither the normal group string set nor the fault group string set into a group string set to be determined.
In this embodiment, due to the specificity of some strings to be diagnosed, for example, the strings to be diagnosed with a larger difference between the current change condition and the current change condition of other strings to be diagnosed, the strings to be diagnosed cannot be classified by the cluster analysis algorithm, so that effective diagnosis cannot be performed on the strings to be diagnosed. The application classifies the group strings to be diagnosed into group strings to be judged, and when a plurality of group strings to be judged exist, a group string set to be judged is generated. And obtaining the group strings to be diagnosed except the normal group string set and the fault group string set in all the group strings to be diagnosed, and combining the group strings to be diagnosed except the normal group string set and the fault group string set to form a group string set to be determined. The application generates a corresponding local subspace for each string to be determined in the string set to be determined.
Step S320, determining a local subspace from the normal group string set and the fault group string set according to the feature values of each group string to be determined in the group string set to be determined.
In this embodiment, when the number of dimensions of the feature values of each to-be-determined group string in the to-be-determined group string set is assumed to be d, for each to-be-determined group string in the to-be-determined group string set, a local subspace corresponding to each to-be-determined group string is generated.
Optionally, the determining, according to the characteristic value of each group string to be determined in the group string set to be determined, a local subspace from the normal group string set and the fault group string set includes: determining the distance between each group string to be determined in the group string set to be determined and each photovoltaic group string in the normal group string set and the fault group string set, according to the characteristic value of each group string to be determined and the characteristic value of each photovoltaic group string in the normal group string set and the fault group string set; selecting, from the normal group string set, the photovoltaic group strings closest to each group string to be determined, and determining those among them whose number of occurrences is greater than a preset number of times as a positive local subspace; selecting, from the fault group string set, the photovoltaic group strings closest to each group string to be determined, and determining those among them whose number of occurrences is greater than the preset number of times as a negative local subspace; and taking the positive local subspace and the negative local subspace as the local subspace.
Further, the number and the dimension of the feature spaces are determined; the characteristic values of each group string to be determined in the group string set to be determined are mapped to each feature space; in each feature space, the distance between each group string to be determined and each photovoltaic group string in the normal group string set and the fault group string set is determined according to the characteristic value of each group string to be determined and the characteristic value of each photovoltaic group string in the normal group string set and the fault group string set; the photovoltaic group strings closest to each group string to be determined are selected from the normal group string set to form a first neighbor group string set, and the photovoltaic group strings closest to each group string to be determined are selected from the fault group string set to form a second neighbor group string set; the photovoltaic group strings whose number of occurrences in the first neighbor group string set is greater than a preset number of times are determined as a positive local subspace, and the photovoltaic group strings whose number of occurrences in the second neighbor group string set is greater than the preset number of times are determined as a negative local subspace; and the positive local subspace and the negative local subspace are taken as the local subspace. The preset number of times can be set according to actual conditions, and the photovoltaic group strings in the first neighbor group string set are different from those in the second neighbor group string set.
Illustratively, the specific process of generating a corresponding local subspace for each string of groups to be determined is as follows:
1. Randomly select t groups of feature spaces, each with a dimension between d/2 and d.
2. For each selected feature space, find in the normal group string set and in the fault group string set, respectively, the K neighbor group strings closest (by Euclidean distance) to each group string to be determined in the group string set to be determined.
3. Among all the neighbor group strings found, take the set formed by the group strings whose number of occurrences exceeds t/2 (the preset number of times) as the local subspace of the group string to be determined, wherein the local subspace comprises a positive local subspace and a negative local subspace. The positive local subspace is determined from the photovoltaic group strings selected from the normal group string set, and the negative local subspace is determined from the photovoltaic group strings selected from the fault group string set.
As can be seen from the above examples, the preset number of times is half the number of feature spaces.
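The three steps above can be sketched in Python for a single group string to be determined. This is an illustrative sketch only: the function name `local_subspace`, the dictionary layout of the normal and fault sets, and the way the random feature subsets are drawn (a random size between d/2 and d, then a random choice of that many feature indices) are assumptions.

```python
import math
import random
from collections import Counter

def local_subspace(target, normal, fault, t=10, k=3, seed=None):
    """Return (positive, negative) local subspaces for one undecided string.

    target: feature vector of the string to be determined.
    normal, fault: {string_id: feature vector} for the two reference sets.
    """
    rng = random.Random(seed)
    d = len(target)
    pos_votes, neg_votes = Counter(), Counter()
    for _ in range(t):
        # Step 1: draw a random feature subset with between d/2 and d dimensions.
        dims = rng.sample(range(d), rng.randint(max(1, d // 2), d))

        def dist(vec):
            return math.sqrt(sum((target[i] - vec[i]) ** 2 for i in dims))

        # Step 2: K nearest neighbours in each reference set within this subspace.
        for pool, votes in ((normal, pos_votes), (fault, neg_votes)):
            nearest = sorted(pool, key=lambda sid: dist(pool[sid]))[:k]
            votes.update(nearest)
    # Step 3: keep strings that were neighbours in more than t/2 feature spaces.
    positive = {sid for sid, n in pos_votes.items() if n > t / 2}
    negative = {sid for sid, n in neg_votes.items() if n > t / 2}
    return positive, negative
```

The vote threshold `t / 2` corresponds to the preset number of times named in the example; a different threshold could be substituted without changing the structure.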
Step S330, generating a local subspace distance set according to the distance between the local subspace and each string to be determined in the string set to be determined.
In this embodiment, the local subspace includes a positive local subspace and a negative local subspace, and accordingly, the local subspace distance set may also include a positive local subspace distance set and a negative local subspace distance set. Generating a positive local subspace distance set according to the distance between the positive local subspace and each string to be determined in the string set to be determined; generating a negative local subspace distance set according to the distance between the negative local subspace and each string to be determined in the string set to be determined; the positive and negative local subspace distance sets are considered to be local subspace distance sets. Thus, distance sets of different local subspaces are generated, and subsequent local subspace evaluation is facilitated.
For the positive local subspace, the distance between each photovoltaic group string in the positive local subspace and each group string to be determined in the group string set to be determined is calculated, and a positive local subspace distance set is generated. And respectively calculating the distance between each photovoltaic group string in the negative local subspace and each group string to be determined in the group string set to be determined aiming at the negative local subspace, and generating a negative local subspace distance set.
Step S340, obtaining an evaluation result based on the local subspace distance set, and determining a target to-be-determined group string set according to the evaluation result.
In this embodiment, after the local subspace distance set is obtained, local subspace evaluation is performed based on the local subspace distance set. Specifically, after the local subspace distance set is obtained, the upper limit value and the lower limit value are obtained in the local subspace distance set using a box-line graph method. And comparing the data in the local subspace with the upper limit value and the lower limit value to evaluate the local subspace, thereby obtaining an evaluation result. The evaluation results include pass evaluation and fail evaluation. By the evaluation result, which photovoltaic group strings in the local subspace can pass through the evaluation, and which photovoltaic group strings can not pass through the evaluation can be obtained. When the photovoltaic group strings in the local subspace pass through the evaluation and fail the evaluation, the photovoltaic group strings in the corresponding target group string set to be judged are different. And the target to-be-determined group string set is used for determining a subsequent local distance matrix.
Optionally, an evaluation result is obtained based on the positive local subspace distance set, and a target to-be-determined group string set is determined according to the evaluation result. And obtaining an evaluation result based on the negative local subspace distance set, and determining a target to-be-determined group string set according to the evaluation result.
Optionally, the obtaining an evaluation result based on the local subspace distance set, and determining the target to-be-determined group string set according to the evaluation result includes: determining a lower limit value and an upper limit value based on the local subspace distance set; when the distances in the local subspace distance set lie between the lower limit value and the upper limit value, determining that the evaluation result is a pass, and combining the photovoltaic group strings in the local subspace into the target to-be-determined group string set; and when the local subspace distance set contains a distance greater than the upper limit value or smaller than the lower limit value, determining that the evaluation result is a fail, screening the photovoltaic group strings in the local subspace, and taking the screened local subspace as the target to-be-determined group string set.
For the positive local subspace distance set or the negative local subspace distance set, the upper limit value and the lower limit value can each be determined by the box plot method. The box plot, also called a box-and-whisker plot, is a graph showing the distribution of data. It effectively reflects characteristics of a data set such as its central position, degree of dispersion and outliers. A box plot generally includes the following elements:
Upper edge: the upper boundary of the box, representing the 75% quantile (Q3) of the data, i.e., 75% of the values in the data are less than or equal to this value.
Lower edge: the lower boundary of the box, representing the 25% quantile (Q1) of the data, i.e., 25% of the values in the data are less than or equal to this value.
Median line: the horizontal line inside the box, representing the 50% quantile (Q2) of the data, i.e., 50% of the values in the data are less than or equal to this value.
Box: the rectangular area between the upper and lower edges, whose height represents the interquartile range of the data (IQR = Q3 - Q1).
Whiskers: line segments extending upward and downward from the upper and lower boundaries of the box, respectively, representing the distribution range of the data; typically a position 1.5 times the IQR from the box boundary is selected as the endpoint of a whisker.
Outliers: data points lying above the upper whisker endpoint or below the lower whisker endpoint, which may be marked with circles or other symbols.
Through the box graph method, the central trend, the discrete degree and the abnormal condition of the data can be observed. By comparing box plots between different groups, their differences and similarities can be further appreciated. The box diagram method is commonly used in the fields of statistical analysis, quality control, data visualization and the like, and can help to understand the characteristics and structure of a data set more intuitively.
The upper limit value and the lower limit value are determined by the box plot method: the lower quartile Q1 and the upper quartile Q3 of the local subspace distance set are calculated, then the interquartile range IQR = Q3 - Q1, and then the lower limit value $Q_1 - 1.5 \times \mathrm{IQR}$ and the upper limit value $Q_3 + 1.5 \times \mathrm{IQR}$ of the data. A distance lying between the lower limit value and the upper limit value passes the evaluation; otherwise it does not pass.
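A short Python sketch of this evaluation follows. It assumes the usual whisker factor of 1.5 and uses the "inclusive" quartile method of the standard library; other quartile conventions give slightly different limits, and the function names are illustrative.

```python
from statistics import quantiles

def box_limits(distances, whisker=1.5):
    """Lower and upper limit values of a distance set by the box plot rule."""
    q1, _, q3 = quantiles(distances, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr

def evaluate(distances):
    """Return (passed, out_of_range): passed is True when no distance is out of range."""
    low, high = box_limits(distances)
    out_of_range = [x for x in distances if x < low or x > high]
    return not out_of_range, out_of_range
```

For the distances 1, 2, 3, 4, 5 the quartiles are 2 and 4, so the limits are -1 and 7 and the set passes; adding a distance of 50 makes the set fail, with 50 reported as out of range.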
When the evaluation result is a fail, the photovoltaic group strings in the local subspace need to be screened further. Optionally, the step of screening the photovoltaic group strings in the local subspace, and taking the screened local subspace as the target to-be-determined group string set includes: generating a set to be rejected from the photovoltaic group strings in the local subspace whose distance is greater than the upper limit value or smaller than the lower limit value, and generating a set to be screened from the photovoltaic group strings in the local subspace other than those in the set to be rejected; determining the distance between each photovoltaic group string in the set to be screened and each photovoltaic group string in the set to be rejected; and if the number of photovoltaic group strings with a distance smaller than the preset distance is greater than the preset number, removing the photovoltaic group strings corresponding to the distances smaller than the preset distance from the set to be screened, and taking the set to be screened after the removal as the target to-be-determined group string set.
For the local subspace, the photovoltaic group strings whose distance is greater than the upper limit value or smaller than the lower limit value are first separated out as the set to be rejected, and the rest form the set to be screened. The distance between each photovoltaic group string in the set to be screened and each photovoltaic group string in the set to be rejected is calculated; if the number of distances smaller than the preset distance exceeds the preset number, the corresponding photovoltaic group strings are removed from the set to be screened. The set that finally remains is determined as the target to-be-determined group string set and is used to calculate the subsequent local distance matrix. This reduces interference factors and the amount of data to be processed, and improves both the diagnosis efficiency and the accuracy of the diagnosis result.
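One possible reading of this screening step is sketched below: a string in the set to be screened is removed when more than the preset number of rejected strings lie within the preset distance of it. This per-string counting rule is an interpretation of the text, and the function name, parameter names and data layout are assumptions.

```python
import math

def screen(subspace, distances, low, high, preset_distance, preset_number):
    """Return the ids kept as the target set after screening a failed subspace.

    subspace: {string_id: feature vector} of the local subspace.
    distances: {string_id: distance to the string to be determined}.
    """
    # Strings whose distance is outside the box plot limits are rejected outright.
    rejected = {s for s in subspace if distances[s] < low or distances[s] > high}
    to_screen = set(subspace) - rejected

    def near_rejects(s):
        # Number of rejected strings lying closer than preset_distance to s.
        return sum(math.dist(subspace[s], subspace[r]) < preset_distance
                   for r in rejected)

    return {s for s in to_screen if near_rejects(s) <= preset_number}
```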
Step S350, determining a local distance matrix according to the target to-be-determined group string set and the group string set to be determined.
In this embodiment, the target to-be-determined group string set includes a positive local subspace set and a negative local subspace set, where the positive local subspace set corresponds to a positive local distance matrix, and the negative local subspace set corresponds to a negative local distance matrix.
Optionally, determining a positive subspace group string set and a negative subspace group string set according to the target to-be-determined group string set; determining a first distance between each photovoltaic string in the positive subspace string set and each string to be determined in the string set to be determined, and determining a second distance between each photovoltaic string in the negative subspace string set and each string to be determined in the string set to be determined; generating a positive local distance matrix according to each first distance, and generating a negative local distance matrix according to each second distance; the positive local distance matrix and the negative local distance matrix are determined as the local distance matrix.
Illustratively, after the local subspace evaluation is passed, for each string to be determined, the number of strings in its positive local subspace is denoted $M_1$ (the positive subspace group string set), and the number of strings in its negative local subspace is denoted $M_2$ (the negative subspace group string set). With the length of the time series in days denoted $D_{num}$, the following calculation is performed for each photovoltaic string in the positive local subspace or in the negative local subspace:
(1) Randomly select p groups of time-series data running from day $D_{num\_rand}$ to day $D_{num}$, where $D_{num\_rand}$ is a day chosen arbitrarily from 1 to $D_{num}/2$.
(2) Calculating a first DTW distance between each photovoltaic string in the positive subspace string set and each string to be determined in the string set to be determined, and calculating a second DTW distance between each photovoltaic string in the negative subspace string set and each string to be determined in the string set to be determined.
(3) For the positive subspace group string set, an $M_1 \times p$ positive local distance matrix is obtained; for the negative subspace group string set, an $M_2 \times p$ negative local distance matrix is obtained.
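Steps (1) to (3) can be sketched as follows. This is a hedged illustration: the array layout (one row of samples per day), the function names, and the default `mean_abs_distance` are assumptions. The default is only a stand-in so that the sketch runs on its own; the text uses a DTW distance at this point, which can be passed in through the `dist` parameter.

```python
import numpy as np

def mean_abs_distance(a, b):
    # Stand-in distance for illustration only; the text uses a DTW distance.
    return float(np.mean(np.abs(a - b)))

def local_distance_matrix(subspace_series, target_series, p, rng,
                          dist=mean_abs_distance):
    """Build an M x p local distance matrix for one string to be determined.

    subspace_series: (M, D_num, T) current series of the subspace strings.
    target_series:   (D_num, T) current series of the string to be determined.
    """
    subspace_series = np.asarray(subspace_series, dtype=float)
    target_series = np.asarray(target_series, dtype=float)
    m, d_num, _ = subspace_series.shape
    # p random start days, each drawn from 1 .. D_num/2 (1-based, inclusive).
    starts = rng.integers(1, d_num // 2 + 1, size=p)
    matrix = np.empty((m, p))
    for col, start in enumerate(starts):
        window = slice(start - 1, d_num)          # days start .. D_num
        target = target_series[window].ravel()
        for row in range(m):
            matrix[row, col] = dist(subspace_series[row, window].ravel(), target)
    return matrix
```

Calling the function once with the positive subspace strings and once with the negative subspace strings gives the $M_1 \times p$ and $M_2 \times p$ matrices.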
Optionally, the present application calculates the distance using dynamic time warping (DTW). DTW is a time-series similarity measure based on dynamic programming. By searching for the best matching path between the data points of two time series of arbitrary length, it measures the similarity of the time series effectively and is robust to noise. The calculation steps are as follows:
For two current time series, P of length m and Q of length n, an $m \times n$ distance matrix M is first constructed. Its elements are filled according to the following rule:
[equation not reproduced in the source text]
This yields the local distance matrix DM between the elements of the two current time series.
The DTW path is then calculated on the local distance matrix: the shortest path from position [0,0] to position [m-1, n-1] of the local distance matrix is computed and recorded as the DTW path. The shortest path is found by dynamic programming, as follows:
[equation not reproduced in the source text]
The DTW distance is then obtained from the shortest path distance:
[equation not reproduced in the source text]
where k is the number of points on the shortest path.
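As a hedged illustration of the DTW computation described above, the sketch below uses the textbook formulation. Three choices are assumptions, since the source formulas are not legible in this text: the point distance is the absolute difference, each cell is reached from its upper, left, or upper-left neighbor, and the DTW distance is the accumulated cost divided by k. The names `dtw` and `dtw_distance` are hypothetical.

```python
import numpy as np

def dtw(p, q):
    """Return (accumulated cost, k) for a best warping path between p and q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m, n = len(p), len(q)
    # Local distance matrix; absolute difference is an assumed point distance.
    local = np.abs(p[:, None] - q[None, :])
    acc = np.full((m, n), np.inf)        # accumulated cost up to [i, j]
    count = np.zeros((m, n), dtype=int)  # points on the chosen path up to [i, j]
    acc[0, 0], count[0, 0] = local[0, 0], 1
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            best, best_count = np.inf, 0
            # Allowed predecessors: up, left, and diagonal (assumed step pattern).
            for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1)):
                if pi >= 0 and pj >= 0 and acc[pi, pj] < best:
                    best, best_count = acc[pi, pj], count[pi, pj]
            acc[i, j] = local[i, j] + best
            count[i, j] = best_count + 1
    return float(acc[-1, -1]), int(count[-1, -1])

def dtw_distance(p, q):
    """Accumulated cost divided by k (assumed normalisation)."""
    cost, k = dtw(p, q)
    return cost / k
```

When several paths share the same minimal cost, k is taken from the first one found, so k is the length of one shortest path rather than a unique quantity.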
Because the current time series contain a large amount of data, calculating the distance with DTW directly would take a long time. The current time series are therefore first analyzed with a clustering algorithm, which reduces the amount and dimension of the data, and DTW is applied afterwards. Because the data dimension and the data amount are already reduced at that point, data processing efficiency is effectively improved.
Step S360, determining the category of each string to be determined in the string set to be determined according to the local distance matrix.
In this embodiment, the local distance matrix includes a positive local distance matrix and a negative local distance matrix. After the local distance matrix is obtained, pseudo-distances are calculated: a positive pseudo-distance for the positive local distance matrix and a negative pseudo-distance for the negative local distance matrix.
Optionally, determining a first maximum distance of each row in the positive local distance matrix, generating a positive pseudo-distance vector of each row according to the first maximum distance of each row, and determining a positive pseudo-distance vector average according to each positive pseudo-distance vector; determining a second maximum distance of each row in the negative local distance matrix, generating a negative pseudo-distance vector of each row according to the second maximum distance of each row, and determining a negative pseudo-distance vector average value according to each negative pseudo-distance vector; and determining the class corresponding to the minimum mean value in the positive pseudo distance vector mean value and the negative pseudo distance vector mean value as the class of the group string to be determined in the group string set to be determined.
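The pseudo-distance rule above can be sketched for a single string to be determined as follows. This is a minimal sketch under assumptions: the function name is hypothetical, the positive class is labeled "normal" and the negative class "fault" in line with the normal and fault group string sets, and a tie is resolved in favor of the positive class.

```python
import numpy as np

def classify_by_pseudo_distance(pos_matrix, neg_matrix):
    """Classify one string from its positive and negative local distance matrices."""
    # Row-wise maximum gives one pseudo-distance per subspace string.
    pos_vec = np.max(np.asarray(pos_matrix, dtype=float), axis=1)
    neg_vec = np.max(np.asarray(neg_matrix, dtype=float), axis=1)
    pos_mean = float(pos_vec.mean())
    neg_mean = float(neg_vec.mean())
    # The class with the smaller mean pseudo-distance wins (tie: assumed positive).
    label = "normal" if pos_mean <= neg_mean else "fault"
    return label, pos_mean, neg_mean
```

For example, `classify_by_pseudo_distance([[1, 2], [3, 1]], [[5, 6], [4, 7]])` compares a positive mean of 2.5 with a negative mean of 6.5 and returns the positive class.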
With this technical solution, each string to be determined in the to-be-determined group string set is classified. Photovoltaic strings that the clustering algorithm cannot classify can therefore be classified effectively, and fault diagnosis and fault identification are achieved for photovoltaic strings of different types.
In a fifth embodiment of the present application, referring to fig. 5, the fault diagnosis method of the photovoltaic string of the present application includes:
(1) Data smoothing. Data smoothing is performed with a moving average method: a sliding window of size 4 is set, the mean of the values in the window is taken as the value at the middle of the window, and multiple clustering is executed after the sliding is complete.
(2) Clustering. For the specific process, refer to the first embodiment and the second embodiment.
(3) Calculation of the cluster difference degree. For the specific process, refer to the third embodiment.
(4) Detecting whether the difference degree exceeds a threshold value. If yes, local subspace generation is executed; if not, the result is rejected. For the specific process, refer to the third embodiment.
(5) Local subspace evaluation. For the specific process, refer to the fourth embodiment.
(6) Determining whether the local subspace evaluation is passed. If yes, the local distance matrix is calculated. If not, the local distance matrix is calculated after the subspace data are screened. For the specific process, refer to the fourth embodiment.
(7) Pseudo-distance calculation. For the specific process, refer to the fourth embodiment.
(8) Outputting the result. The result includes the category to which each group string to be diagnosed belongs, i.e., whether each group string to be diagnosed is normal or faulty.
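The data smoothing in step (1) can be illustrated with a short sketch. The window size of 4 comes from the text; the function name and the edge handling (only full windows are kept, so the output is shorter than the input) are assumptions.

```python
import numpy as np

def moving_average(series, window=4):
    """Smooth a current time series with a sliding-window mean."""
    series = np.asarray(series, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(series, kernel, mode="valid")
```

For an input of six samples and a window of 4, the result has three values, one per full window position.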
The embodiments of the present invention provide a method for fault diagnosis of a photovoltaic string. It should be noted that although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order.
As shown in fig. 6, fig. 6 is a schematic structural diagram of the hardware operating environment of a fault diagnosis apparatus for a photovoltaic string according to an embodiment of the present invention. The fault diagnosis apparatus of a photovoltaic string may include: a processor 1001 (such as a CPU), a memory 1005, a user interface 1003, a network interface 1004, and a communication bus 1002. The communication bus 1002 enables connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface. The memory 1005 may be a high-speed RAM memory or a non-volatile memory. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.
It will be appreciated by those skilled in the art that the configuration shown in fig. 6 does not constitute a limitation of the fault diagnosis apparatus for a photovoltaic string, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
As shown in fig. 6, the memory 1005, as a storage medium, may include an operating system, a network communication module, a user interface module, and a fault diagnosis program of the photovoltaic string. The operating system is a program that manages and controls the hardware and software resources of the fault diagnosis apparatus and supports the running of the fault diagnosis program of the photovoltaic string and of other software or programs.
In the fault diagnosis apparatus of the photovoltaic string shown in fig. 6, the user interface 1003 is mainly used for connecting to a terminal and performing data communication with it; the network interface 1004 is mainly used for connecting to a background server and performing data communication with it; and the processor 1001 may be used to invoke the fault diagnosis program of the photovoltaic string stored in the memory 1005.
In the present embodiment, a failure diagnosis apparatus of a photovoltaic string includes: a memory 1005, a processor 1001, and a fault diagnosis program for a string of photovoltaic groups stored on the memory and executable on the processor, wherein:
When the processor 1001 calls a failure diagnosis program of the photovoltaic string stored in the memory 1005, the following operations are performed:
determining characteristic values of the groups of strings to be diagnosed according to the current time sequence of the groups of strings to be diagnosed;
performing multiple clustering analysis on each group string to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories;
according to the group string current of the group string to be diagnosed in each category after each cluster analysis, determining a pseudo positive class and a pseudo negative class after each cluster analysis;
and determining a normal group string set according to each pseudo-positive class after each cluster analysis, and determining a fault group string set according to each pseudo-negative class after each cluster analysis.
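The four operations above can be sketched as follows. This is a hedged illustration, not the applicant's program: the function name, the Euclidean distance between characteristic values, and the use of each string's mean current to rank the two categories are assumptions, and the filtering of clustering results by inter-cluster difference degree is omitted. Each run picks two strings at random as cluster centers, assigns every string to the nearer center, marks the category with the larger mean string current as the pseudo-positive class, and intersects the classes across runs.

```python
import numpy as np

def multi_cluster(features, mean_currents, runs, rng):
    """Return (normal set, fault set) as sets of string indices.

    features:      (N, F) characteristic values, assumed distinct per string.
    mean_currents: (N,) mean string current of each string.
    """
    features = np.asarray(features, dtype=float)
    mean_currents = np.asarray(mean_currents, dtype=float)
    n = len(features)
    normal, fault = set(range(n)), set(range(n))
    for _ in range(runs):
        # Two randomly chosen strings act as the cluster centers of this run.
        centers = rng.choice(n, size=2, replace=False)
        # Distance of every string to both centers, shape (N, 2).
        d = np.linalg.norm(
            features[:, None, :] - features[centers][None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        means = [mean_currents[labels == k].mean() for k in (0, 1)]
        pos = int(np.argmax(means))  # pseudo-positive: larger mean current
        normal &= set(np.flatnonzero(labels == pos).tolist())
        fault &= set(np.flatnonzero(labels != pos).tolist())
    return normal, fault
```

Strings that end up in neither intersection are the ones left for the local subspace stage that follows.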
When the processor 1001 calls the failure diagnosis program of the photovoltaic string stored in the memory 1005, the following operations are also performed:
combining all to-be-diagnosed group strings except the normal group string set and the fault group string set into a to-be-determined group string set;
determining local subspaces from the normal group string set and the fault group string set according to the characteristic values of each group string to be determined in the group string set to be determined;
generating a local subspace distance set according to the distance between the local subspace and each string to be determined in the string set to be determined;
obtaining an evaluation result based on the local subspace distance set, and determining a target to-be-determined group string set according to the evaluation result;
determining a local distance matrix according to the to-be-determined group string set and the target to-be-determined group string set;
and determining the category of each group string to be determined in the group string set to be determined according to the local distance matrix.
Based on the same inventive concept, the embodiments of the present application further provide a computer-readable storage medium storing a fault diagnosis program of a photovoltaic string. When the fault diagnosis program is executed by a processor, each step of the fault diagnosis method of the photovoltaic string described above is implemented and the same technical effects are achieved; to avoid repetition, the details are not described again here.
Because the storage medium provided by the embodiment of the present application is a storage medium used for implementing the method of the embodiment of the present application, based on the method introduced by the embodiment of the present application, a person skilled in the art can understand the specific structure and the modification of the storage medium, and therefore, the description thereof is omitted herein. All storage media adopted by the method of the embodiment of the application belong to the scope of protection of the application.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a television, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing describes only preferred embodiments of the present invention and does not limit the patent scope of the invention. Any equivalent structure or equivalent process derived from the contents of this specification and the drawings, or applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (14)

1. A method of fault diagnosis of a photovoltaic string, the method comprising:
determining characteristic values of the groups of strings to be diagnosed according to the current time sequence of the groups of strings to be diagnosed;
performing multiple clustering analysis on each group string to be diagnosed based on the characteristic values to obtain multiple clustering results, wherein each clustering result comprises two categories;
according to the group string current of the group string to be diagnosed in each category after each cluster analysis, determining a pseudo positive class and a pseudo negative class after each cluster analysis;
and determining a normal group string set according to each pseudo-positive class after each cluster analysis, and determining a fault group string set according to each pseudo-negative class after each cluster analysis.
2. The method for diagnosing faults of the photovoltaic string according to claim 1, wherein the step of performing a plurality of cluster analyses on each of the strings to be diagnosed based on the characteristic values to obtain a plurality of cluster results includes:
in each cluster analysis, randomly selecting two strings to be diagnosed from all the strings to be diagnosed as cluster centers, and taking the strings to be diagnosed other than these two as other group strings, wherein each cluster center represents one category;
determining distance values between the other groups of strings and each cluster center according to the characteristic values of the cluster centers and the characteristic values of the other groups of strings;
and determining the clustering result according to the distance value.
3. The method of diagnosing a failure of a string of photovoltaic modules according to claim 2, wherein the step of determining the clustering result from the distance value includes:
selecting a minimum distance value from all distance values between the other groups of strings and each cluster center;
determining the class of the clustering center corresponding to the minimum distance value;
and classifying the other groups of strings into the class of the clustering center corresponding to the minimum distance value to obtain the clustering result.
4. A method of fault diagnosis of a photovoltaic string according to any one of claims 1 to 3, wherein prior to the step of determining pseudo-positive and pseudo-negative classes after each cluster analysis from the string currents of the strings to be diagnosed in each class after each cluster analysis, the method further comprises:
determining the inter-cluster difference degree corresponding to each cluster analysis according to the characteristic values of the strings to be diagnosed in each category after each cluster analysis;
determining, among all the inter-cluster difference degrees, the inter-cluster difference degrees that are larger than the preset difference degree;
deleting the clustering results corresponding to the inter-cluster difference degrees that are larger than the preset difference degree, and taking the remaining clustering results as the plurality of clustering results.
5. The method of diagnosing faults of the photovoltaic string according to claim 4, wherein the step of determining pseudo-positive and pseudo-negative classes after each cluster analysis based on string currents of the strings to be diagnosed in each class after each cluster analysis includes:
aiming at each clustering analysis, determining a group string current average value corresponding to each category after the clustering analysis according to the group string current of the group string to be diagnosed in each category after the clustering analysis;
selecting a maximum string current average value and a minimum string current average value from the string current average values corresponding to the categories after the cluster analysis;
and taking the category corresponding to the maximum string current average value as the pseudo-positive class after the cluster analysis, and taking the category corresponding to the minimum string current average value as the pseudo-negative class after the cluster analysis.
6. The method of diagnosing faults of the photovoltaic string according to claim 1, wherein the step of determining the normal string set from each pseudo-positive class after each cluster analysis and determining the fault string set from each pseudo-negative class after each cluster analysis includes:
and determining a normal group string set according to the intersection of each pseudo-positive class after each cluster analysis, and determining a fault group string set according to the intersection of each pseudo-negative class after each cluster analysis.
7. The method for diagnosing a failure of a photovoltaic string according to claim 1, further comprising:
combining all to-be-diagnosed group strings except the normal group string set and the fault group string set into a to-be-determined group string set;
determining local subspaces from the normal group string set and the fault group string set according to the characteristic values of each group string to be determined in the group string set to be determined;
generating a local subspace distance set according to the distance between the local subspace and each string to be determined in the string set to be determined;
obtaining an evaluation result based on the local subspace distance set, and determining a target to-be-determined group string set according to the evaluation result;
determining a local distance matrix according to the to-be-determined group string set and the target to-be-determined group string set;
and determining the category of each group string to be determined in the group string set to be determined according to the local distance matrix.
8. The method for diagnosing a fault in a photovoltaic string according to claim 7, wherein the step of determining a local subspace from the normal string set and the fault string set based on the eigenvalues of the respective strings to be determined in the string set to be determined comprises:
determining the number and the dimension of the feature space;
mapping the characteristic values of each group string to be judged in the group string set to be judged to each characteristic space;
in each feature space, determining the distance between each string to be determined in the to-be-determined group string set and each photovoltaic string in the normal group string set and the fault group string set, according to the characteristic value of each string to be determined in the to-be-determined group string set and the characteristic value of each photovoltaic string in the normal group string set and the fault group string set;
selecting a photovoltaic group string closest to each string to be determined in the string set to be determined from the normal string set to form a first neighbor string set, and selecting a photovoltaic group string closest to each string to be determined in the string set to be determined from the fault string set to form a second neighbor string set;
determining the photovoltaic strings whose occurrence frequency in the first neighbor group string set is larger than a preset frequency as a positive local subspace, and determining the photovoltaic strings whose occurrence frequency in the second neighbor group string set is larger than the preset frequency as a negative local subspace, wherein the preset frequency is half of the number of the feature spaces;
and taking the positive local subspace and the negative local subspace as the local subspace.
9. The method for diagnosing a fault in a photovoltaic string according to claim 7, wherein the step of obtaining an evaluation result based on the local subspace distance set and determining a target to-be-determined string set according to the evaluation result comprises:
determining a lower limit value and an upper limit value based on the local subspace distance set;
when the distances in the local subspace distance set lie between the lower limit value and the upper limit value, determining that the evaluation result is a pass, and combining the photovoltaic strings in the local subspace into the target to-be-determined group string set;
and when the distance existing in the local subspace distance set is larger than the upper limit value or smaller than the lower limit value, determining that the evaluation result is not passed, screening the photovoltaic string in the local subspace, and taking the screened local subspace as the target string set to be determined.
10. The method for diagnosing a failure of a photovoltaic string according to claim 9, wherein the step of screening the photovoltaic strings in the local subspace and taking the screened local subspace as the target to-be-determined group string set comprises:
generating a set to be removed according to the photovoltaic group strings with the distance larger than the upper limit value or smaller than the lower limit value in the local subspace, and generating a set to be screened according to the photovoltaic group strings except the set to be removed in the local subspace;
determining the distance between each photovoltaic group string in the set to be screened and each photovoltaic group string in the set to be rejected;
and if the number of the photovoltaic group strings with the distance smaller than the preset distance is larger than the preset number, eliminating the photovoltaic group strings corresponding to the distance smaller than the preset distance from the set to be screened, and taking the eliminated set to be screened as the target set to be judged.
11. The method of diagnosing a fault in a photovoltaic string according to claim 7, wherein the step of determining a local distance matrix according to the to-be-determined group string set and the target to-be-determined group string set includes:
determining a positive subspace group string set and a negative subspace group string set according to the target to-be-determined group string set;
determining a first distance between each photovoltaic string in the positive subspace string set and each string to be determined in the string set to be determined, and determining a second distance between each photovoltaic string in the negative subspace string set and each string to be determined in the string set to be determined;
generating a positive local distance matrix according to each first distance, and generating a negative local distance matrix according to each second distance;
the positive local distance matrix and the negative local distance matrix are determined as the local distance matrix.
12. The method of claim 11, wherein determining the category of each string to be determined in the set of strings to be determined according to the local distance matrix comprises:
determining a first maximum distance of each row in the positive local distance matrix, generating a positive pseudo-distance vector of each row according to the first maximum distance of each row, and determining a positive pseudo-distance vector average value according to each positive pseudo-distance vector; and
determining a second maximum distance of each row in the negative local distance matrix, generating a negative pseudo-distance vector of each row according to the second maximum distance of each row, and determining a negative pseudo-distance vector average value according to each negative pseudo-distance vector;
and determining the class corresponding to the minimum mean value in the positive pseudo distance vector mean value and the negative pseudo distance vector mean value as the class of the group string to be determined in the group string set to be determined.
13. A fault diagnosis apparatus of a photovoltaic string, characterized in that the fault diagnosis apparatus of a photovoltaic string comprises: memory, a processor and a fault diagnosis program for a photovoltaic string stored on the memory and running on the processor, which when executed by the processor, implements the steps of the fault diagnosis method for a photovoltaic string according to any one of claims 1-12.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a failure diagnosis program of a photovoltaic string, which when executed by a processor, implements the steps of the failure diagnosis method of a photovoltaic string according to any one of claims 1 to 12.
CN202311239309.1A 2023-09-22 2023-09-22 Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings Pending CN117240217A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311239309.1A CN117240217A (en) 2023-09-22 2023-09-22 Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311239309.1A CN117240217A (en) 2023-09-22 2023-09-22 Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings

Publications (1)

Publication Number Publication Date
CN117240217A (en) 2023-12-15

Family

ID=89094514

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311239309.1A Pending CN117240217A (en) 2023-09-22 2023-09-22 Method, apparatus and computer readable storage medium for diagnosing faults of photovoltaic strings

Country Status (1)

Country Link
CN (1) CN117240217A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination