CN115205233A

CN115205233A - Photovoltaic surface defect identification method and system based on end-to-end architecture

Info

Publication number: CN115205233A
Application number: CN202210726689.0A
Authority: CN
Inventors: 田强; 王金友; 陈晓东; 田明国; 陈文佼; 胡丽; 马海春; 鞠若彬; 任磊; 姜文; 张帆
Original assignee: State Grid Corp of China SGCC; Weifang Power Supply Co of State Grid Shandong Electric Power Co Ltd
Current assignee: State Grid Corp of China SGCC; Weifang Power Supply Co of State Grid Shandong Electric Power Co Ltd
Priority date: 2022-06-24
Filing date: 2022-06-24
Publication date: 2022-10-18

Abstract

The invention relates to the technical field of photovoltaic surface defect identification, and provides a photovoltaic surface defect identification method and a photovoltaic surface defect identification system based on an end-to-end architecture, wherein the method comprises the following steps: acquiring a photovoltaic surface image; adopting a photovoltaic defect identification model based on an end-to-end architecture to carry out defect identification on the photovoltaic surface image to obtain a defect area of the photovoltaic surface image; the photovoltaic defect identification model based on the end-to-end architecture comprises a feature extraction layer, a model encoder, a model decoder and a linear layer which are sequentially connected; the model encoder processes the image feature map extracted by the feature extraction layer by utilizing a multi-scale deformable attention mechanism to obtain a multi-scale feature map; the model decoder processes the multi-scale feature map by using a self-attention mechanism and a deformable cross-attention mechanism, and inputs the obtained decoding features and reference points into the linear layer. The method not only improves the identification precision of the sparse photovoltaic defects, but also overcomes the defect of slow convergence of the conventional defect identification model.

Description

Photovoltaic surface defect identification method and system based on end-to-end architecture

Technical Field

The invention belongs to the technical field of photovoltaic surface defect identification, and particularly relates to a photovoltaic surface defect identification method and system based on an end-to-end architecture.

Background

The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.

The current photovoltaic surface defect identification modes mainly include manual visual inspection and CNN-based identification algorithms.

Manual visual inspection of photovoltaic surfaces incurs high labor costs, and its false detection rate is high for defects that are hard to recognize, so the approach has many shortcomings.

CNN-based defect identification algorithms overcome some shortcomings of traditional manual visual inspection and bring defect identification into the field of artificial intelligence and deep learning. However, CNN-based algorithms still leave considerable room for improvement in defect feature extraction, training time and recognition accuracy, and in particular they require features to be extracted manually.

Disclosure of Invention

In order to solve the technical problems in the background art, the invention provides a photovoltaic surface defect identification method and system based on an end-to-end architecture, which not only improve the identification precision of sparse photovoltaic defects, but also overcome the defect of slow convergence of the conventional defect identification model.

In order to achieve the purpose, the invention adopts the following technical scheme:

A first aspect of the invention provides a photovoltaic surface defect identification method based on an end-to-end architecture, which comprises the following steps:

acquiring a photovoltaic surface image;

adopting a photovoltaic defect identification model based on an end-to-end architecture to carry out defect identification on the photovoltaic surface image to obtain a defect area of the photovoltaic surface image;

the photovoltaic defect identification model based on the end-to-end architecture comprises a feature extraction layer, a model encoder, a model decoder and a linear layer which are sequentially connected; the model encoder processes the image feature map extracted by the feature extraction layer by utilizing a multi-scale deformable attention mechanism to obtain a multi-scale feature map; and the model decoder processes the multi-scale characteristic map by using a self-attention mechanism and a deformable cross-attention mechanism, and inputs the obtained decoding characteristics and reference points into the linear layer.

Further, the size and resolution of the photovoltaic surface image are processed before defect identification is performed on the photovoltaic surface image.

Furthermore, the model encoder performs dimension compression on the multi-scale feature map, adds the compressed feature map to the position code to obtain serialized data, and then adds the serialized data to the scale-level embedding.

Furthermore, the model encoder adopts a position encoding mode based on a sine function and a cosine function.

Further, the model decoder learns the two-dimensional coordinates of the reference point through the linear layer and the activation function for each target feature pixel point, and then performs coordinate regression operation by applying the deformable cross attention mechanism.

A second aspect of the present invention provides a photovoltaic surface defect identification system based on an end-to-end architecture, comprising:

an image acquisition module configured to: acquiring a photovoltaic surface image;

a defect identification module configured to: adopting a photovoltaic defect identification model based on an end-to-end architecture to carry out defect identification on the photovoltaic surface image to obtain a defect area of the photovoltaic surface image;

the photovoltaic defect identification model based on the end-to-end architecture comprises a feature extraction layer, a model encoder, a model decoder and a linear layer which are sequentially connected; the model encoder utilizes a multi-scale deformable attention mechanism to process the image feature map extracted by the feature extraction layer to obtain a multi-scale feature map; and the model decoder processes the multi-scale characteristic map by using a self-attention mechanism and a deformable cross-attention mechanism, and inputs the obtained decoding characteristics and reference points into the linear layer.

Further, the system includes a pre-processing module configured to: process the size and resolution of the photovoltaic surface image before defect identification is performed on the photovoltaic surface image.

A third aspect of the present invention provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in a photovoltaic surface defect identification method based on an end-to-end architecture as described above.

A fourth aspect of the present invention provides a computer device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the method for identifying defects on a photovoltaic surface based on an end-to-end architecture.

Compared with the prior art, the invention has the beneficial effects that:

The invention provides a photovoltaic surface defect identification method based on an end-to-end architecture that treats overall photovoltaic defect identification as a set prediction problem. During end-to-end identification, ConvNext extracts the basic features, which are sent to an encoder and a decoder for relationship modeling. All features in a defect image are treated as a set, and the end-to-end identifier predicts that set. Instead of predicting each kind of defect independently in the traditional manner, the method takes all features of the defect image as prediction targets from a global perspective, removes hand-crafted components such as spatial anchors and non-maximum suppression, and predicts the final identification set in parallel by reasoning about the relationship between the identification targets and the global image context. Introducing set prediction into the end-to-end identification process improves the identification precision for sparse photovoltaic defects.

The invention provides a photovoltaic surface defect identification method based on an end-to-end architecture that draws on the main idea of DCN (Deformable Convolutional Networks) to construct a deformable attention mechanism, which reduces the number of interaction computations between feature pixels and other, unnecessary feature pixels and overcomes the slow convergence of conventional defect identification models.

The invention provides a photovoltaic surface defect identification method based on an end-to-end architecture that addresses the large computation amount and high space complexity of existing defect identification systems: scale-level embedding is used to distinguish the different feature layers, so that feature points of the same feature layer share the same scale-level embedding; the scale-level embedding is randomly initialized and trained with the network, which allows the model to attend to multi-scale features.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention and not to limit it.

Fig. 1 is a flowchart of a photovoltaic surface defect identification method based on an end-to-end architecture according to an embodiment of the present invention;

Fig. 2 is an architecture diagram of the photovoltaic defect identification model based on an end-to-end architecture according to an embodiment of the present invention.

Detailed Description

The invention is further described with reference to the following figures and examples.

It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.

Example one

The embodiment provides a photovoltaic surface defect identification method based on an end-to-end architecture, as shown in fig. 1, specifically including the following steps:

Step 1, data acquisition: acquire a photovoltaic surface image.

Real-time video of the photovoltaic surface (the surface of the photovoltaic panel) is captured by a high-resolution industrial camera, and an image is extracted from each video frame.

The camera captures video while held parallel to the photovoltaic surface; the transmitted video is then preprocessed and a data set is built.

Step 2, data preprocessing: perform format conversion on the photovoltaic surface image.

If the photovoltaic surface image is an image to be identified, a data-enhancement format conversion is required (the size, dimensions, resolution and similar properties of the image are processed so that the image meets the requirements of model training).

If the photovoltaic surface image is used to build the training set, format conversion operations such as data enhancement, feature marking, label evaluation and label setting are performed on the captured photovoltaic surface video frames (images) according to the model requirements.

Step 3, train the photovoltaic defect identification model based on an end-to-end architecture with the training set; then use the trained model to perform defect identification on the preprocessed photovoltaic surface image to be identified, obtaining the defect region (a prediction box and a defect feature score) of the photovoltaic surface image.

As shown in fig. 2, the photovoltaic defect identification model based on the end-to-end architecture includes a feature extraction layer, a model encoder, a model decoder, a linear layer and a normalization layer, which are connected in sequence.

The model encoder comprises 6 encoder layers connected in sequence; from bottom to top, each encoder layer consists of a multi-scale deformable attention block, a residual connection and normalization layer, a feedforward layer, and a second residual connection and normalization layer. The 6 encoder layers are stacked so that the encoded features output by one layer serve as the input of the next layer; within each layer, this input combined with the position embedding is used as the query vector, and the input after a linear transformation is used as the value vector.

The model decoder likewise consists of 6 decoder layers connected in sequence; from bottom to top, each decoder layer consists of a multi-head self-attention block, a first residual connection and normalization layer, a multi-scale deformable cross-attention block, a second residual connection and normalization layer, a feedforward layer, and a third residual connection and normalization layer. The decoding performed by each layer can be understood as self-attention block + cross-attention block + feedforward layer, with the decoding features output by one layer serving as the target embedding input to the next layer.
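The layer ordering described above can be sketched structurally as follows. This is only an illustration of the data flow, not the patented implementation: the function names (`decoder_layer`, `run_decoder`) and the toy sub-blocks are hypothetical placeholders, and real attention and normalization modules would replace the lambdas.

```python
def add(a, b):
    """Element-wise sum used for the residual connection."""
    return [x + y for x, y in zip(a, b)]


def decoder_layer(query, memory, self_attn, cross_attn, feed_forward, norm):
    """One decoder layer: every sub-block is followed by residual add + normalization."""
    query = norm(add(query, self_attn(query)))           # multi-head self-attention
    query = norm(add(query, cross_attn(query, memory)))  # deformable cross-attention
    query = norm(add(query, feed_forward(query)))        # feedforward layer
    return query


def run_decoder(query, memory, layers):
    """The output of each layer becomes the target embedding of the next layer."""
    for layer in layers:
        query = decoder_layer(query, memory, **layer)
    return query


# Toy sub-blocks (placeholders, not trained modules).
toy_layer = {
    "self_attn": lambda q: [0.5 * v for v in q],
    "cross_attn": lambda q, mem: list(mem),
    "feed_forward": lambda q: [0.0 for _ in q],
    "norm": lambda v: v,  # identity stand-in for the normalization layer
}
decoded = run_decoder([1.0, 2.0], [0.1, 0.2], [toy_layer] * 6)
```

The encoder stack follows the same pattern with the self-attention and cross-attention pair replaced by a single multi-scale deformable attention block.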

The model encoder processes the image feature map extracted by the feature extraction layer with a multi-scale deformable attention mechanism to obtain a multi-scale feature map; the model decoder processes the multi-scale feature map with a self-attention mechanism and a multi-scale deformable cross-attention mechanism, and feeds the resulting decoding features and reference points into the linear layer and the normalization layer to obtain the final prediction result.

Step 301, the feature extraction layer obtains an image feature map using the network layers of the 4 ConvNext blocks in the ConvNext network.

The photovoltaic surface defect data set is firstly subjected to feature extraction through a ConvNext network, and an image feature map of C × H × W is extracted from an original image of 3 × H × W. Where C is the number of channels of the image, and H and W represent the height and width of the image, respectively.
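The text reuses H and W for both the input image and the feature map. In commonly published ConvNeXt configurations the spatial size shrinks at each of the 4 stages, so a small shape calculator may help when reading the later multi-scale steps. The strides (4, 8, 16, 32) and channel widths (96, 192, 384, 768) below are those of the public ConvNeXt-T variant and are an assumption here; the patent does not state which variant is used.

```python
def convnext_stage_shapes(height, width, channels=(96, 192, 384, 768), stem_stride=4):
    """Return the (C, H, W) shape produced by each of the 4 stages.

    Assumes a stride-4 stem followed by a 2x downsampling before each later stage.
    """
    shapes = []
    stride = stem_stride
    for stage_channels in channels:
        shapes.append((stage_channels, height // stride, width // stride))
        stride *= 2
    return shapes


stage_shapes = convnext_stage_shapes(224, 224)
```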

Step 302, model coding.

The model encoder encodes the pixels and sampling points of the image feature map output by ConvNext (that is, it position-encodes the pixels of the input image feature map). For each target feature pixel, 4 reference points are found through network learning, and scale-level embedding introduces the scale information of pixels at different levels; the scale-level embedding is randomly initialized and learned together with the network. The deformable attention mechanism then computes interactions only between each feature pixel and a subset of sampling points in its sampling region (that is, a deformable attention block attends to each target feature, fusing and exchanging information across scales and avoiding interaction with all global feature pixels, so that local, sparse attention weights are obtained in the linear layer), which yields the multi-scale feature map.

Given an input defect feature map (image feature map) $x \in \mathbb{R}^{C \times H \times W}$, let $q$ index a query element with photovoltaic surface defect content feature $z_q$ and two-dimensional reference point $p_q$. The deformable attention feature is computed as:

$$\mathrm{DeformAttn}(z_q, p_q, x) = \sum_{m=1}^{M} W_m \left[ \sum_{k=1}^{K} A_{mqk} \cdot W'_m \, x(p_q + \Delta p_{mqk}) \right]$$

where $M$ is the number of attention heads, $k$ indexes the sampled defect feature pixels, $K$ is the total number of sampled defect feature pixels, and $\Delta p_{mqk}$ and $A_{mqk}$ are the sampling offset and the attention weight of the $k$-th sampling point in the $m$-th attention head, respectively. $W_m \in \mathbb{R}^{C \times C_v}$ and $W'_m \in \mathbb{R}^{C_v \times C}$ are learnable weights, with $C_v = C/M$ by default; $C$ and $C_v$ are the feature map dimension and the dimension of the value matrix obtained by sampling interpolation, respectively.
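The deformable attention computation for a single query can be sketched in plain Python as below. This is a minimal illustration under stated assumptions: the offsets and attention weights are passed in directly (in a trained model they would be predicted from the query feature by linear projections), sampling uses bilinear interpolation with zeros outside the map, and the helper names `bilinear_sample`, `matvec` and `deform_attn` are hypothetical.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Sample a C x H x W nested-list feature map at pixel coordinates (x, y)."""
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    x0, y0 = math.floor(x), math.floor(y)
    out = [0.0] * channels
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= xi < width and 0 <= yi < height:  # zero padding outside the map
                w = (1 - abs(x - xi)) * (1 - abs(y - yi))
                for c in range(channels):
                    out[c] += w * feature_map[c][yi][xi]
    return out


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def deform_attn(ref_point, feature_map, offsets, weights, w_value, w_out):
    """Deformable attention for one query.

    offsets[m][k] = (dx, dy) sampling offset; weights[m][k] = attention weight
    (expected to sum to 1 over k); w_value[m] is the C_v x C value projection
    and w_out[m] the C x C_v output projection of head m.
    """
    px, py = ref_point
    output = [0.0] * len(feature_map)
    for m in range(len(offsets)):
        head = [0.0] * len(w_value[m])
        for (dx, dy), a in zip(offsets[m], weights[m]):
            value = matvec(w_value[m], bilinear_sample(feature_map, px + dx, py + dy))
            head = [h + a * v for h, v in zip(head, value)]
        output = [o + p for o, p in zip(output, matvec(w_out[m], head))]
    return output


# C = 2 channels, M = 2 heads, so C_v = C / M = 1.
fmap = [[[1.0, 2.0], [3.0, 4.0]], [[10.0, 20.0], [30.0, 40.0]]]
result = deform_attn(
    ref_point=(0, 0),
    feature_map=fmap,
    offsets=[[(0, 0)], [(1, 1)]],
    weights=[[1.0], [1.0]],
    w_value=[[[1.0, 0.0]], [[0.0, 1.0]]],
    w_out=[[[1.0], [0.0]], [[0.0], [1.0]]],
)
```

Each query touches only M x K sampled positions instead of every pixel of the feature map, which is the source of the reduced interaction cost described in the text.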

And processing the image feature map extracted from the ConvNext network by using a deformable attention mechanism to obtain a multi-scale feature map.

In this embodiment, a multi-scale deformable attention mechanism is adopted to process the image feature map to obtain feature maps of multiple scales.

First, the multi-scale deformable attention block obtains the multi-scale photovoltaic defect feature maps $\{x^l\}_{l=1}^{L}$ from the image feature map by sampling interpolation, where $L$ is the number of feature levels in the multi-scale deformable attention block and $x^l$ is the feature of the $l$-th level; that is, corresponding points are sampled in every feature level during sampling. With $\hat{p}_q \in [0, 1]^2$ as the normalized coordinates of the reference point of each defect query element $q$, the multi-scale deformable attention mechanism applied in the model encoder is:

$$\mathrm{MSDeformAttn}(z_q, \hat{p}_q, \{x^l\}_{l=1}^{L}) = \sum_{m=1}^{M} W_m \left[ \sum_{l=1}^{L} \sum_{k=1}^{K} A_{mlqk} \cdot W'_m \, x^l\big(\phi_l(\hat{p}_q) + \Delta p_{mlqk}\big) \right]$$

where $z_q$, i.e. Q in Fig. 2, is generated from $x$ by a linear transformation; each defect query element $q$ samples $K$ points in each feature level; $q$ indexes the query elements and $k$ indexes the sampling points; $W'_m$ transforms $x_k$ into V, and $W_m$ linearly transforms the attention-weighted V to give the output of each attention head; $m$ indexes the attention heads and $l$ the input defect feature levels; $\Delta p_{mlqk}$ and $A_{mlqk}$ are the sampling offset and the attention weight of the $k$-th sampling point in the $l$-th feature level and the $m$-th attention head, respectively; and $\phi_l(\hat{p}_q)$ rescales the normalized coordinates $\hat{p}_q$ to the feature map of the $l$-th level.

Thus, by using the multi-scale deformable attention mechanism, a plurality of defect feature sampling points can be focused from a plurality of input defect features, the defect feature pixels can be focused more flexibly, the attention to other irrelevant pixels is reduced, and the defect feature pixels are endowed with higher feature weight.
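How one normalized reference point maps to sampling positions on several feature levels can be illustrated as follows. The rescaling convention used here (normalized 0 maps to the first pixel and 1 to the last pixel of a level) is an assumption chosen for simplicity; implementations differ on this detail, and the function names are hypothetical.

```python
def rescale_to_level(normalized_point, level_shape):
    """Map normalized (x, y) in [0, 1]^2 to pixel coordinates of one feature level.

    level_shape is (height, width). Assumed convention: 0 -> first pixel, 1 -> last pixel.
    """
    x_hat, y_hat = normalized_point
    height, width = level_shape
    return (x_hat * (width - 1), y_hat * (height - 1))


def sampling_locations(normalized_point, level_shapes, offsets):
    """Return, per level, the K positions: rescaled reference point plus each offset.

    offsets[l][k] = (dx, dy), expressed in pixels of level l.
    """
    locations = []
    for shape, level_offsets in zip(level_shapes, offsets):
        px, py = rescale_to_level(normalized_point, shape)
        locations.append([(px + dx, py + dy) for dx, dy in level_offsets])
    return locations


locs = sampling_locations(
    (0.5, 0.5),
    level_shapes=[(5, 9), (3, 5)],
    offsets=[[(0, 0), (1, -1)], [(0.5, 0)]],
)
```

The same normalized point lands at the centre of every level, so a single reference point can gather features from all resolutions.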

For the multi-scale feature map extracted by using the multi-scale deformable attention mechanism, an encoder firstly performs dimension compression on the multi-scale feature map by using 1 × 1 convolution, and reduces the dimension of the C × H × W feature map to d × H × W, namely, compresses a C-dimensional channel into a d-dimensional channel to obtain a new d-dimensional feature map.
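A 1 × 1 convolution is a per-pixel linear map across channels, which the following sketch makes explicit. It assumes nested-list tensors and omits the bias term; `conv1x1` and the example weights are illustrative only.

```python
def conv1x1(feature_map, weight):
    """Compress a C x H x W feature map to d x H x W with a d x C weight matrix."""
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(weight[o][c] * feature_map[c][y][x] for c in range(channels))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for o in range(len(weight))
    ]


# C = 2 channels compressed to d = 1 channel; the spatial size is unchanged.
compressed = conv1x1([[[1.0, 2.0]], [[3.0, 4.0]]], [[1.0, 1.0]])
```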

The multi-scale deformable attention block converts the dimension-compressed feature map information into serialized data: the feature map information is added to the position information (position code) of the pixels, and the resulting serialized data is then added to the scale-level embedding to form the output of the multi-scale deformable attention block. The feature map information is position-encoded with sine and cosine functions:

$$PE_{(pos,\,2i)} = \sin\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

$$PE_{(pos,\,2i+1)} = \cos\!\left(\frac{pos}{10000^{2i/d_{model}}}\right)$$

Here PE denotes the position embedding, a two-dimensional matrix that stores the relative or absolute position of each feature pixel in the sequence. The dimension-compressed defect feature map is a d-dimensional embedded representation, $d_{model}$ is the dimension of the position embedding, and the position code is a position-embedding matrix with the same shape as the feature embedding.

The output is X + PE, where X is the embedding of the feature information; the two are added to give the representation vector of the feature information, which serves as the output of the encoder. pos is the row index, i.e. the position of the feature pixel (image pixel) in the sequence, and i indexes the columns of that row; together they address the entries of PE. $PE_{(pos,2i)}$ and $PE_{(pos,2i+1)}$ are the elements in columns 2i and 2i+1 of row pos of the matrix. The rows represent the positions of the defect elements in the sequence, and the columns represent the different dimensions of the position code. The scale-level embedding is randomly initialized and trained along with the network.
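The sine and cosine position code and the subsequent addition of the scale-level embedding can be sketched as below. The base 10000 is the value used in the standard Transformer encoding and is assumed here; the `serialize` helper and the example level embedding are hypothetical, and in the described model the level embedding would be a learned vector.

```python
import math


def sinusoidal_position_encoding(num_positions, d_model):
    """Build the PE matrix: even columns use sine, odd columns use cosine."""
    pe = [[0.0] * d_model for _ in range(num_positions)]
    for pos in range(num_positions):
        for col in range(0, d_model, 2):  # col plays the role of 2i
            angle = pos / (10000 ** (col / d_model))
            pe[pos][col] = math.sin(angle)
            if col + 1 < d_model:
                pe[pos][col + 1] = math.cos(angle)
    return pe


def serialize(features, level_embedding):
    """Add the position code and one scale-level embedding to flattened features.

    features is a list of per-pixel d-dimensional vectors from a single level.
    """
    pe = sinusoidal_position_encoding(len(features), len(features[0]))
    return [
        [f + p + e for f, p, e in zip(row, pe_row, level_embedding)]
        for row, pe_row in zip(features, pe)
    ]


tokens = serialize([[0.0] * 4, [0.0] * 4], level_embedding=[1.0] * 4)
```

All pixels of one feature level share the same level embedding, while the position code differs per pixel.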

In this way, the serialized data reflects the position information of the defect features. Because the multi-scale deformable attention block exchanges information among feature maps of different resolutions, feature maps of different resolutions can be processed with the multi-scale deformable attention mechanism, and the input and output of the encoder are multi-scale feature maps with the same resolutions. In the encoder, both the target pixels and the actual pixels come from the feature maps; for each target pixel, the reference point is generally selected based on the sampling position of the target pixel, or the target pixel itself is taken as the reference point.

Step 303, model decoding.

The model decoder processes the input multi-scale feature map with its self-attention mechanism and deformable cross-attention mechanism: attention weights are computed between each query and the keys at the sampled positions, values are obtained by sampling and interpolating at those positions, and the attention weights are applied to the corresponding values. For each target feature pixel, the two-dimensional coordinates of the corresponding reference point are learned through a linear layer and an activation function, and the deformable cross-attention mechanism then regresses coordinates relative to that reference point, which accelerates model convergence.

The attention of the model decoder still adopts a self-attention module of a Transformer, and for each target query, the self-attention module of the decoder predicts a two-dimensional normalized reference point for target features and further optimizes and corrects the target features. And finally, outputting the decoding characteristics and the reference point to obtain a prediction result.

After the model encoder processes the input sequence, the output of the top encoder layer is transformed into a set of attention vectors K and V, which focus the decoding on the feature pixels relevant to photovoltaic surface defects. The model decoder contains both a multi-scale deformable cross-attention mechanism and a self-attention mechanism. The feature maps output by the encoder are input to the multi-scale deformable cross-attention mechanism as key elements, and the target queries extract features from these feature maps. In the self-attention mechanism, the target queries interact with one another, the key elements being the target queries themselves; for each target query, the two-dimensional normalized coordinates of its reference point are generally predicted from the target query embedding by a learnable linear projection and an activation function. An iterative bounding box refinement mechanism is also introduced, in which each decoder layer refines the bounding box based on the prediction of the previous layer.
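Iterative bounding box refinement can be sketched as follows. The text does not give the update rule, so this example assumes one common formulation in which each layer predicts a correction in logit space and the corrected box is squashed back into normalized coordinates with a sigmoid; the function names and the delta values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def inverse_sigmoid(p, eps=1e-5):
    """Logit of p, clamped away from 0 and 1 for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def refine_box(previous_box, deltas):
    """Apply one layer's predicted corrections to a normalized (cx, cy, w, h) box."""
    return tuple(sigmoid(inverse_sigmoid(b) + d) for b, d in zip(previous_box, deltas))


def iterative_refinement(initial_box, per_layer_deltas):
    """Each decoder layer refines the box predicted by the previous layer."""
    box = initial_box
    for deltas in per_layer_deltas:
        box = refine_box(box, deltas)
    return box


refined = iterative_refinement(
    (0.5, 0.5, 0.2, 0.2),
    [(0.1, -0.1, 0.0, 0.05)] * 6,  # hypothetical corrections from 6 decoder layers
)
```

Working in logit space keeps every refined coordinate inside (0, 1) regardless of the size of the predicted correction.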

The self-attention mechanism of the multi-headed self-attention block in the model decoder can be expressed as:

$$\mathrm{head}_i = \mathrm{Attention}(QW_i^Q, KW_i^K, VW_i^V), \quad i = 1, 2, \dots, h$$

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \mathrm{head}_2, \dots, \mathrm{head}_h)\,W^O$$

Here h is the number of attention heads in the model decoder; Q, K and V are three different feature inputs that are linearly projected into different feature subspaces; and MultiHead is the final output obtained when the multi-head self-attention block concatenates the multiple groups of information related to the defect features. Concat denotes the concatenation of the attention values computed by Attention. $W^O$, $W_i^Q$, $W_i^K$ and $W_i^V$ are the projection parameter matrices for the defect information. As shown in Fig. 1, Q, K and V in the model decoder are obtained by the same operation as in the encoder position encoding, i.e. the image information is divided into three different feature input matrices; here, however, the image information refers to the real pixels of the defect features, whereas Q, K and V in the encoder are derived from the image feature map produced by ConvNext.
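The two formulas above can be traced with a small pure-Python sketch. The text does not define the Attention function, so scaled dot-product attention from the standard Transformer is assumed; the helper names and the tiny weight matrices are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention (assumed form of Attention)."""
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def multi_head(q, k, v, w_q, w_k, w_v, w_o):
    """Project per head, attend, concatenate the heads, then apply W^O."""
    heads = [
        attention(matmul(q, wq), matmul(k, wk), matmul(v, wv))
        for wq, wk, wv in zip(w_q, w_k, w_v)
    ]
    concat = [sum((head[n] for head in heads), []) for n in range(len(q))]
    return matmul(concat, w_o)


# h = 2 heads, model dimension 2, so each head works in a 1-dimensional subspace.
pick_first, pick_second = [[1.0], [0.0]], [[0.0], [1.0]]
projections = [pick_first, pick_second]
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 2.0]]
out = multi_head(x, x, x, projections, projections, projections, identity)
```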

The output then passes through the residual connection layer and the normalization layer. The residual connection mainly eases the training of multi-layer networks, letting the network focus on what has changed, i.e. the feature information. The normalization layer accelerates convergence by normalizing its input.
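A residual connection followed by normalization can be sketched as below. The text does not name the normalization, so layer normalization as used in the standard Transformer is assumed, and the learnable gain and bias are omitted for brevity; the function names are hypothetical.

```python
import math


def layer_norm(vector, eps=1e-5):
    """Normalize one feature vector to zero mean and (approximately) unit variance."""
    mean = sum(vector) / len(vector)
    variance = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(variance + eps) for v in vector]


def add_and_norm(x, sublayer_output):
    """Residual connection (x + sublayer output) followed by normalization."""
    return layer_norm([a + b for a, b in zip(x, sublayer_output)])


normalized = add_and_norm([1.0, 2.0, 3.0], [0.5, 0.0, -0.5])
```

Because the sub-block output is added to its own input, the sub-block only has to learn the change relative to that input.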

After the normalization layer, the output enters the multi-scale cross-attention block, where the output information of the encoder interacts with the target query information; the target query information is the output of the multi-head self-attention block after the normalization layer, and the value vector is obtained by a linear transformation of the features encoded by the encoder.

Finally, as in the encoder, the output passes in turn through a residual connection and normalization layer, a feedforward layer, and another residual connection and normalization layer, and a certain number of decoder layers are stacked. The decoding performed by each layer can be understood as self-attention block + cross-attention block + feedforward layer, and the target embedding input to the next layer is the decoding feature output by the previous layer.

Step 304, photovoltaic surface defect identification.

The photovoltaic defect identification model based on an end-to-end architecture identifies the prediction box from the output of the model decoder.

After the decoder outputs the decoding features and reference points, they pass through the linear layer and the normalization layer, and the defect features are finally predicted.

$$y = xA^{T} + b$$

where A is the attention weight of the defect features and b is the sampling offset; this yields the final prediction result.
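The linear layer $y = xA^{T} + b$ followed by a normalization step can be sketched as below. Treating the normalization layer as a softmax that turns the outputs into scores summing to 1 is an assumption of this example, as are the names `linear` and `softmax` and the toy parameter values.

```python
import math


def linear(x, a, b):
    """Compute y = x A^T + b for a vector x, an m x n matrix a and a length-m vector b."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(a, b)]


def softmax(logits):
    """Turn raw outputs into scores that sum to 1 (assumed normalization step)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


decoded_feature = [1.0, 2.0]
logits = linear(decoded_feature, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 1.0])
scores = softmax(logits)
```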

It should be noted that the multi-scale deformable attention block mainly extracts image features around a reference point. The invention therefore applies the bounding box predicted by the detection head as a sampling offset, so that the learned attention is closely related to the predicted bounding box; the attention weights are then computed and aggregated with the sampling offsets to obtain an aggregated sampled value, and finally the features are decoded.

And finally, predicting the photovoltaic surface defects by a defect prediction module according to the output of the decoder to obtain a prediction frame and a defect characteristic score.

The invention mainly addresses problems such as sparse photovoltaic surface defects and limited feature resolution, and provides an improved photovoltaic surface defect identification model based on an end-to-end architecture. Overall photovoltaic defect identification is treated as a set prediction problem: during end-to-end identification, ConvNext extracts the basic features, which are sent to an encoder and a decoder for relationship modeling; all features in a defect image are treated as a set, and the end-to-end identifier predicts that set. Instead of predicting each kind of defect individually in the traditional manner, the method takes all features of the defect image as prediction targets from a global perspective, removes hand-crafted components such as spatial anchors and non-maximum suppression, and predicts the final identification set in parallel by reasoning about the relationship between the identification targets and the global image context. Introducing set prediction into the end-to-end identification process improves the identification accuracy for sparse photovoltaic defects. Second, the invention draws on the main idea of DCN (Deformable Convolutional Networks) to construct a deformable attention mechanism, which reduces the number of interaction computations between feature pixels and other, unnecessary feature pixels and overcomes the slow convergence of conventional defect identification models.
Finally, to address the large computation amount and high space complexity of existing defect identification models, the invention uses scale-level embedding to distinguish the different feature layers, so that feature points of the same feature layer share the same scale-level embedding; the scale-level embedding is randomly initialized and trained with the network, which allows the model to attend to multi-scale features.

The photovoltaic defect identification model based on an end-to-end architecture applies the idea of self-supervised learning and is inspired by two-stage detectors: in the first stage, a deformable end-to-end Transformer variant generates candidate regions; in the second stage, the candidate regions generated in the first stage are sent to the decoder as query features for further refinement and correction, yielding the prediction result. The output of the model encoder serves as the input of the model decoder, and the multi-scale feature maps obtained from the defect image are encoded and decoded so that their feature pixels are attended to and further refined, yielding the prediction result. In addition, the deformable attention block used by the method no longer attends to the relationship between a feature pixel and all global pixels; it computes interactions only between the feature pixel and the pixels of the relevant sampling region, which effectively avoids repeated interaction computations among many unrelated pixels. The resulting weights for the multi-scale deformable attention block are given by:

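$$\mathrm{MultiHeadAttn}(z_q, x) = \sum_{m=1}^{M} W_m \left[ \sum_{k} A_{mqk} \cdot W'_m \, x_k \right]$$

where $m$ indexes the attention heads, $W_m$ and $W'_m$ are learnable weights, $A_{mqk}$ is the attention weight, and $z_q$ and $x_k$ are elements of the feature map.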

In addition, in conventional photovoltaic defect identification, some neural networks assign the same attention weight to every feature pixel at initialization, which readily leads to sparse features. Meanwhile, during encoding, the number of attention weights grows quadratically with the number of feature pixels, so high-resolution features are difficult to process and the recognition of small targets leaves room for improvement. The deformable attention mechanism used by the invention samples positions for each attention head and then computes the interaction between the query feature and the actual feature information at the sampled positions, so that attention truly focuses on the position or region of the defect, effectively improving the attention paid to defects.
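The sampling-based interaction described above can be illustrated with a minimal NumPy sketch of single-scale deformable attention for one query. It is not the patented implementation: the function names, the weight matrices `W_off`, `W_attn`, `W_val`, `W_out`, and the use of a softmax over the sampling points are all assumptions made for illustration.

```python
import numpy as np

def bilinear_sample(feat, x, y):
    """Bilinearly interpolate feat (H, W, C) at continuous pixel coords (x, y)."""
    H, W, _ = feat.shape
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom

def softmax(v):
    e = np.exp(v - v.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def deformable_attention(query, ref_xy, feat, W_off, W_attn, W_val, W_out,
                         n_heads, n_points):
    """Single-scale deformable attention for one query (illustrative only).

    query:  (C,) query feature
    ref_xy: (2,) reference point in pixel coordinates (x, y)
    feat:   (H, W, C) feature map
    W_off:  (n_heads * n_points * 2, C) predicts sampling offsets
    W_attn: (n_heads * n_points, C)     predicts attention logits
    W_val:  (n_heads, C // n_heads, C)  per-head value projection
    W_out:  (C, C)                      output projection
    """
    d = query.shape[0] // n_heads
    # Offsets and attention weights are predicted from the query alone.
    offsets = (W_off @ query).reshape(n_heads, n_points, 2)
    weights = softmax((W_attn @ query).reshape(n_heads, n_points))
    head_outputs = []
    for m in range(n_heads):
        acc = np.zeros(d)
        for k in range(n_points):
            x, y = ref_xy + offsets[m, k]
            sampled = bilinear_sample(feat, x, y)      # (C,)
            acc += weights[m, k] * (W_val[m] @ sampled)
        head_outputs.append(acc)
    return W_out @ np.concatenate(head_outputs)
```

In this sketch each query touches only `n_heads * n_points` sampled locations instead of all `H * W` pixels, which is the source of the reduced interaction cost that the text attributes to the deformable mechanism.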

Compared with the original end-to-end architecture model, the overall model structure is substantially improved in convergence speed and memory consumption, which removes unnecessary computational cost from model training; the network is optimized in both accuracy and efficiency, and the advantages of the end-to-end architecture are fully exploited.

Photovoltaic defect identification based on the end-to-end architecture pushes defect detection and recognition to the pixel level. Once put into use, it mainly identifies sparse photovoltaic defects to help workers find defects on the photovoltaic surface in time. Meanwhile, for the position encoding of the input data, the scale-level embeddings are randomly initialized and learned together with the network, so the model can keep learning and keep improving its recognition performance.

Compared with traditional methods such as manual visual inspection and CNNs, the photovoltaic defect identification system based on the end-to-end architecture takes the collected defect images or processed video as input, uses a network for preliminary feature extraction, and converts the result into a multi-scale feature map. It identifies photovoltaic surface defects at the pixel level while reducing the cost of manual visual inspection, and further improves the accuracy and speed of the end-to-end architecture on resolution-limited and small-object targets. In addition, compared with existing defect identification models, the model is more economical in memory and handles spatial resolution better, mainly because existing models use an attention mechanism that assigns the same weight to all pixels, which increases the computational complexity. The invention borrows the deformable idea: it applies interpolation-based sampling to the feature pixels and replaces interaction computation over all feature pixels with interaction computation over a sampling region. This reduces the computational complexity, lets the model attend to the multi-scale feature map and fuse multi-scale feature information automatically, and, together with a simple and effective iterative bounding-box refinement mechanism for box regression, improves model accuracy. The final model can classify photovoltaic surface defects and locate them exactly at the pixel level without an external sensor.
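The text does not spell out the update rule of the iterative bounding-box refinement. A common formulation, assumed here purely for illustration, has each decoder layer predict a correction in logit space that is added to the previous layer's box before a sigmoid maps it back to normalized coordinates. The names `refine_boxes` and `inverse_sigmoid` and the (cx, cy, w, h) box layout are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def inverse_sigmoid(p, eps=1e-5):
    """Logit of p, clipped away from 0 and 1 for numerical stability."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def refine_boxes(init_box, deltas):
    """Refine a normalized (cx, cy, w, h) box with one delta per decoder layer.

    Assumed rule: box_d = sigmoid(inverse_sigmoid(box_{d-1}) + delta_d).
    Returns the list of boxes, starting with the initial one.
    """
    box = np.asarray(init_box, dtype=float)
    history = [box]
    for delta in deltas:
        box = sigmoid(inverse_sigmoid(box) + np.asarray(delta, dtype=float))
        history.append(box)
    return history
```

Because every update passes through the sigmoid, the refined box stays inside the unit square, and a zero correction leaves the previous estimate unchanged.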

Example two

The embodiment provides a photovoltaic surface defect identification system based on an end-to-end architecture, which specifically comprises the following modules:

A pre-processing module configured to: process the size and resolution of the photovoltaic surface image before defect identification is performed on the photovoltaic surface image.

The model encoder performs dimension compression on the multi-scale feature map, adds the result to the position encoding to obtain serialized data, and then adds the scale-level embedding to the serialized data.
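The encoder input preparation described above (dimension compression, position encoding, scale-level embedding) can be sketched in NumPy as follows. The function names, the channel-last layout, the temperature of 10000, and the particular sine/cosine layout are assumptions for illustration. In the described model the scale-level embeddings are randomly initialized and then learned with the network, whereas here they are simply passed in as an array.

```python
import numpy as np

def sine_position_encoding(H, W, C, temperature=10000.0):
    """2D sine/cosine position encoding of shape (H, W, C); C divisible by 4."""
    half = C // 2
    # Consecutive (sin, cos) channel pairs share one frequency.
    freqs = temperature ** (2 * (np.arange(half) // 2) / half)
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    def encode(coord):
        angles = coord[..., None] / freqs          # (H, W, half)
        out = np.empty_like(angles)
        out[..., 0::2] = np.sin(angles[..., 0::2])
        out[..., 1::2] = np.cos(angles[..., 1::2])
        return out

    return np.concatenate([encode(ys), encode(xs)], axis=-1)

def build_encoder_input(feature_maps, level_embed):
    """Flatten each level, add position encoding, then add the level embedding.

    feature_maps: list of arrays of shape (H_l, W_l, C)
    level_embed:  (num_levels, C), one vector per feature level
    Returns an array of shape (sum_l H_l * W_l, C).
    """
    tokens = []
    for level, fmap in enumerate(feature_maps):
        H, W, C = fmap.shape
        flat = fmap.reshape(H * W, C)                        # dimension compression
        pos = sine_position_encoding(H, W, C).reshape(H * W, C)
        seq = flat + pos                                     # serialized data
        tokens.append(seq + level_embed[level])              # shared per-level embedding
    return np.concatenate(tokens, axis=0)
```

Every feature point on the same level receives the same embedding vector, so two points at the same normalized position on different levels remain distinguishable after flattening.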

It should be noted that the modules in this embodiment correspond one-to-one to the steps in the first embodiment, and the specific implementation process is the same, so it is not described again here.

Example three

The present embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps in a photovoltaic surface defect identification method based on an end-to-end architecture as described in the first embodiment above.

Example four

The embodiment provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor executes the program to implement the steps in the photovoltaic surface defect identification method based on the end-to-end architecture as described in the first embodiment.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A photovoltaic surface defect identification method based on an end-to-end architecture, characterized by comprising the following steps:

acquiring a photovoltaic surface image;

the photovoltaic defect identification model based on the end-to-end architecture comprises a feature extraction layer, a model encoder, a model decoder and a linear layer which are sequentially connected; the model encoder utilizes a multi-scale deformable attention mechanism to process the image feature map extracted by the feature extraction layer to obtain a multi-scale feature map; the model decoder processes the multi-scale feature map by using a self-attention mechanism and a deformable cross-attention mechanism, and inputs the obtained decoding features and reference points into the linear layer.

2. The method for identifying defects on photovoltaic surfaces based on end-to-end architecture as claimed in claim 1, wherein the size and resolution of the photovoltaic surface image are processed before the defect identification is performed on the photovoltaic surface image.

3. The method for identifying the photovoltaic surface defects based on the end-to-end architecture as claimed in claim 1, wherein the model encoder performs dimension compression on the multi-scale feature map, adds the result to the position encoding to obtain serialized data, and then adds the scale-level embedding to the serialized data.

4. The photovoltaic surface defect identification method based on the end-to-end architecture as claimed in claim 1, wherein the model encoder adopts a position encoding mode based on a sine function and a cosine function.

5. The end-to-end architecture-based photovoltaic surface defect identification method of claim 1, wherein the model decoder learns two-dimensional coordinates of a reference point from a linear layer and an activation function for each target feature pixel point, and then performs a coordinate regression operation by applying the deformable cross attention mechanism.

6. A photovoltaic surface defect identification system based on an end-to-end architecture, comprising:

7. The end-to-end architecture-based photovoltaic surface defect identification system of claim 6, further comprising a pre-processing module configured to: process the size and resolution of the photovoltaic surface image before defect identification is performed on the photovoltaic surface image.

8. The end-to-end architecture-based photovoltaic surface defect identification system of claim 6, wherein the model encoder performs dimension compression on the multi-scale feature map, adds the result to the position encoding to obtain serialized data, and then adds the scale-level embedding to the serialized data.

9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of a method for identifying defects of a photovoltaic surface based on an end-to-end architecture as claimed in any one of claims 1 to 5.

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of a method for identifying defects on a photovoltaic surface based on an end-to-end architecture as claimed in any one of claims 1 to 5 when executing the program.