CN117314892A - Incremental learning-based continuous optimization method for solar cell defect detection - Google Patents
Incremental learning-based continuous optimization method for solar cell defect detection
- Publication number
- CN117314892A CN117314892A CN202311584711.3A CN202311584711A CN117314892A CN 117314892 A CN117314892 A CN 117314892A CN 202311584711 A CN202311584711 A CN 202311584711A CN 117314892 A CN117314892 A CN 117314892A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0004—Industrial image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E10/00—Energy generation through renewable energy sources
- Y02E10/50—Photovoltaic [PV] energy
Abstract
The invention relates to the technical field of computer vision, and in particular to an incremental learning-based continuous optimization method for solar cell defect detection, which constructs a defect detection model and continuously optimizes and updates it. The method specifically comprises: continuously inputting defect data; updating the feature extractor, the auxiliary classifier, the feature fusion device and the classifier; performing incremental training by a gradient descent method; constructing an unknown defect class exemplar set and adjusting the known defect class exemplar set; and pruning the feature extractor based on its geometric center, thereby realizing defect detection that greatly reduces catastrophic forgetting and continuously optimizes the detection of both new and old tasks. The defect detection process can be started with only a small number of class data sets, and the defect detection model is kept continuously and optimally updated while only a small-scale exemplar set of the old data set and the input training set need to be stored at each stage, so that the storage and calculation load is reduced.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a solar cell defect detection continuous optimization method based on incremental learning.
Background
Under the promotion of new energy and energy saving policies, solar cells are being applied and produced on a large scale. However, when the manufacturing process fails, the produced cells exhibit defects. Detecting and analyzing the defects that occur in the production process and adjusting the corresponding failed process in time can reduce losses; it is therefore very necessary to accurately detect the defect types of the cell pieces in real time and to adjust the corresponding failed process accordingly.
In the process of mass production of solar cells, the collected defect data follow a long-tailed distribution: a small number of categories (the head defect categories) each contain a large number of samples, while the tail defect categories are the opposite. It is therefore difficult to obtain at one time a balanced data set containing all defect categories; only a small number of defect class data sets can be obtained, and the remaining defect categories cannot be learned all at once for the time being. Moreover, the manifestation of a defect may change under the influence of actual production factors, so that a known defect category already learned by the model exhibits new feature values, and adjustments to the manufacturing process may also lead to new defect categories. Therefore, a small number of defect class data sets can be used as the base-class training set to train a basic neural network model, and learning of continuously arriving unknown defect class data and of known defect classes containing new feature values can then be carried out continuously and stage by stage on the basis of retaining old knowledge, so as to continuously optimize the detection capability.
Current intelligent defect detection of solar cells still adopts the traditional offline learning mode, which necessarily relies on a complete data set in the training stage; only a single model can be trained from scratch, and the training is time-consuming. The offline learning mode cannot adapt to data changes in time when learning continuously arriving new data and continuously optimizing the model, so when it is applied to the current large-scale, high-dimensional and continuously changing data set learning scenario, it faces problems such as update lag and high consumption of computational resources.
In summary, in order to solve the problems of large consumption of computational resources and update lag in model continuous optimization performed by the current offline learning method, a continuous optimization method for detecting defects of a solar cell based on incremental learning is urgently needed.
Disclosure of Invention
The invention aims to provide a continuous optimization method for detecting defects of a solar cell based on incremental learning, which comprises the following steps of:
a solar cell defect detection continuous optimization method based on incremental learning comprises the following steps:
step S1, defect data is continuously input:
inputting an example set and defect data, wherein the example set is a subset of an old data set, the defect data comprises unknown class data and known class data containing new characteristic values, and a union set of the example set and the defect data is a training set;
step S2, expanding a feature extractor:
constructing a new feature extractor, and initializing the new feature extractor by using parameters of the feature extractor in the old model to obtain a new feature extractor;
step S3, constructing an auxiliary classifier:
constructing an auxiliary classifier, treating all known class data as a class, and calculating auxiliary loss of related new features;
step S4, constructing a feature fusion device:
extracting features of the solar cell image at multiple scales through the newly added feature extractor, connecting all the extracted features to obtain an intermediate feature map, and refining the intermediate feature map based on the channel and spatial attention mechanisms: the feature fusion device takes the intermediate feature map as input, sequentially infers a one-dimensional channel attention map and a two-dimensional spatial attention map, and outputs the refined feature map;
step S5, updating a classifier:
copying parameters for old features from old model classifiers to construct a current stage classifier, adding corresponding full-connection layer nodes according to the number of unknown defect categories in the current stage, randomly initializing new parameters, sending a refined feature map to the current stage classifier to obtain prediction probability, and calculating cross entropy loss;
step S6, performing incremental training by using a random gradient descent method:
performing incremental training on the old model based on a training set, learning an unknown class sample and a known class sample containing a new characteristic value by using a random gradient descent method according to the global loss containing classification loss and auxiliary loss, and converting the unknown class sample into the known class sample to obtain an updated new model;
step S7, the unknown defect class example set construction and the known defect class example set adjustment:
training unknown defect type data and known defect type data based on a new model to obtain an unknown type example set and a known type example set with new characteristics, wherein the unknown type example set, the known type example set and the example set in the step S1 form a new example set for incremental training of the next stage;
step S8, pruning by the feature extractor:
calculating the geometric center of each layer of the newly added feature extractor in the new model, pruning the convolution kernel of the layer based on the geometric center;
step S9, realizing continuous optimization updating of the defect detection model:
continuously inputting unknown class defect data and known class defect data, repeating the steps S1 to S8, continuously updating the old model to obtain a defect detection network which is globally unified for the unknown class and the known class, and realizing continuous optimization of defect detection.
Preferably, in step S1, staged optimized detection is performed on the continuously input defect data $D_t$, $t=1,2,\ldots$; when $t=1$, the model to be updated is the basic neural network model $M_0$, which is obtained by training on the base-class training set $D_0$.
Preferably, in step S2, the feature extractor is built based on a multi-scale residual network.
Preferably, in step S3, the auxiliary classifier performs prediction classification using the probability

$$p_a(y\mid x)=\operatorname{softmax}\bigl(\phi_a\bigl(u(x)\bigr)\bigr),$$

wherein $\operatorname{softmax}(\cdot)$ represents the softmax function; $\phi_a$ represents the auxiliary classifier, whose label space is $\{0\}\cup Y_u$, $Y_u$ representing the set of labels of the unknown categories (all known categories are treated as the single label $0$); and $u(x)$ represents the new feature;

the auxiliary loss on the new features $\mathcal{L}_a$ is expressed as follows:

$$\mathcal{L}_a=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p_a(y\mid x),$$

wherein $\tilde{D}_t$ represents the training set; $x$ represents an image; $y$ represents the corresponding category label (mapped into the auxiliary label space).
Preferably, in step S4, the channel and spatial attention mechanisms are specifically a channel attention sub-network and a spatial attention sub-network, and the specific process of obtaining the refined feature map through the feature fusion device is as follows:

the channel attention sub-network compresses the height $H$ and the width $W$ of the intermediate feature map $U$ of dimension $C\times H\times W$ through $\operatorname{GAP}$ and $\operatorname{GMP}$, generating two feature vectors $z_{avg}$ and $z_{max}$, both of dimension $C\times1\times1$; the two feature vectors are connected to obtain a $2C\times1\times1$ vector, which is processed by two full convolution layers as follows:

$$w_c=\sigma\bigl(f_2\bigl(f_1\bigl([z_{avg},z_{max}]\bigr)\bigr)\bigr),$$

$$U_c=w_c\otimes U,$$

wherein $w_c$ represents the weight of each channel; $\sigma$ represents the activation function; the first full convolution layer $f_1$ has $C/r$ output channels, with $r$ representing the reduction ratio; the second full convolution layer $f_2$ restores the number of output channels to $C$; and $U_c$ represents the channel features;

the spatial attention sub-network takes the channel features as input and applies the average and maximum operations along the channel axis to generate two feature maps, expressed as follows:

$$A_{avg}=\operatorname{Avg}(U_c),$$

$$A_{max}=\operatorname{Max}(U_c),$$

wherein $\operatorname{Avg}(\cdot)$ and $\operatorname{Max}(\cdot)$ represent the element average and maximum operations along the channel axis, respectively;

based on $A_{avg}$ and $A_{max}$, a spatial attention map $A_s$ of dimension $1\times H\times W$ is obtained, expressed as follows:

$$A_s=\sigma\bigl(f_s\bigl([A_{avg},A_{max}]\bigr)\bigr),$$

wherein $f_s$ represents a convolution layer;

the spatial attention map is multiplied by the channel features to obtain the refined feature map, expressed as follows:

$$U_s=A_s\otimes U_c,$$

wherein $U_s$ represents the spatially refined features, i.e. the refined feature map.
Preferably, in step S5, the prediction probability $p(y\mid x)$ is expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

wherein $\phi$ represents the classifier of the current stage;

the cross entropy loss $\mathcal{L}_{ce}$ is expressed as follows:

$$\mathcal{L}_{ce}=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p(y\mid x),$$

wherein $\tilde{D}_t$ represents the training set; $x$ represents an image; $y$ represents the corresponding category label.
Preferably, in step S6, the global loss $\mathcal{L}$ is expressed as follows:

$$\mathcal{L}=\mathcal{L}_{ce}+\lambda\,\mathcal{L}_a,$$

wherein $\lambda$ represents the hyper-parameter controlling the effect of the auxiliary classifier; for the initial model, i.e. when $t=1$, $\lambda=0$.
Preferably, in step S7, the exemplar sets are updated as follows:

construction of the unknown-class exemplar set $P_u$: the unknown-class data set $D_u$ is randomly sampled to obtain the initial unknown-class exemplar set $P_u^{0}$; the model $M_t$ is used to initialize an intermediate model $M'$, which is trained on $P_u^{0}$ for several iterations by stochastic gradient descent; the iterative process is as follows:

$$M'\leftarrow M'-\alpha\,\nabla_{M'}\mathcal{L}_{ce}\bigl(P_u^{0};M'\bigr),$$

wherein $\alpha$ represents the learning rate of the fine tuning of $M'$, $\mathcal{L}_{ce}$ represents the cross entropy loss, and $\nabla_{M'}$ represents the gradient with respect to $M'$;

the validation loss on $D_u$ is then calculated and back-propagated to optimize $P_u$, obtaining the final unknown-class exemplar set $P_u$; the propagation process is as follows:

$$P_u\leftarrow P_u-\beta\,\nabla_{P_u}\mathcal{L}_{ce}\bigl(D_u;M'\bigr),$$

wherein $\beta$ represents the learning rate of the unknown-class exemplar set propagation process, $\mathcal{L}_{ce}$ represents the cross entropy loss, and $\nabla_{P_u}$ represents the gradient with respect to $P_u$;

adjustment of the known-class exemplar set $P_k$: the known-class data set $D_k$ is randomly sampled to obtain $P_k^{0}$; next, $M_t$ is used to initialize an intermediate model $M''$, which is trained on $P_k^{0}$ for several iterations by stochastic gradient descent; the iterative process is as follows:

$$M''\leftarrow M''-\alpha\,\nabla_{M''}\mathcal{L}_{ce}\bigl(P_k^{0};M''\bigr),$$

the validation loss on $D_k$ is then calculated and back-propagated to optimize $P_k$, obtaining the final known-class exemplar set with new feature values $P_k$; the propagation process is as follows:

$$P_k\leftarrow P_k-\beta'\,\nabla_{P_k}\mathcal{L}_{ce}\bigl(D_k;M''\bigr),$$

wherein $\beta'$ represents the learning rate of the known-class exemplar set propagation process.
Preferably, in step S8, the geometric center $c^{(i)}$ of the $i$-th convolution layer is represented as follows:

$$c^{(i)}=\frac{1}{n_{i+1}}\sum_{j=1}^{n_{i+1}}W_j^{(i)},$$

$$W_j^{(i)}\in\mathbb{R}^{n_i\times k\times k},$$

wherein $W_j^{(i)}$ represents the $j$-th convolution kernel of the $i$-th convolution layer of the newly added feature extractor, of dimension $n_i\times k\times k$, and $n_i$ and $n_{i+1}$ respectively represent the number of input channels and the number of output channels of the $i$-th convolution layer;

when the geometric center $c^{(i)}$ of the $i$-th convolution layer is located at one of the convolution kernels of that layer, the following formula is satisfied:

$$\bigl\|W_{j^{*}}^{(i)}-c^{(i)}\bigr\|_{2}=0,$$

wherein $W_{j^{*}}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j^{*}}^{(i)}$ and $c^{(i)}$;

the convolution kernels requiring pruning in the $i$-th convolution layer are calculated by the following formula, thereby realizing pruning of the current model:

$$W_{j'}^{(i)}=\underset{j\in[1,\,n_{i+1}]}{\arg\min}\,\bigl\|W_{j}^{(i)}-c^{(i)}\bigr\|_{2},$$

wherein $W_{j'}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as or close to the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j}^{(i)}$ and $c^{(i)}$.
Preferably, in step S9, unknown-class and known-class defect data are continuously input, and, based on the old model, the steps of feature extractor expansion, auxiliary classifier construction, feature fusion device construction, classifier updating, incremental training using the stochastic gradient descent method, unknown defect class exemplar set construction and known defect class exemplar set adjustment, and feature extractor pruning are performed in sequence to obtain a defect detection network $N_t$ that is globally unified for the current unknown and known classes, realizing the continuous optimization of defect detection;

the specific detection process is as follows:

the defect detection network $N_t$ performs defect detection on every received solar cell image to be detected $x$; first, the features extracted by the feature extractors $F_1,\ldots,F_t$ are connected to obtain the intermediate feature map $U(x)$:

$$U(x)=\bigl[F_1(x),F_2(x),\ldots,F_t(x)\bigr];$$

then, according to step S4, the intermediate feature map is transformed by the feature fusion device into the refined feature map $U_s(x)$:

$$U_c(x)=w_c\otimes U(x),$$

$$U_s(x)=A_s\otimes U_c(x);$$

the refined feature map $U_s(x)$ is sent to the classifier $\phi$, the defect category label of the solar cell is predicted as $\hat{y}$, and the defect category is obtained accordingly, expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

$$\hat{y}=\underset{y}{\arg\max}\;p(y\mid x).$$
the technical scheme of the invention has the following beneficial effects:
according to the incremental learning-based solar cell defect detection continuous optimization method disclosed by the invention, the network learning process can be started by adopting the initial data set, and the storage and calculation loads are effectively reduced by only storing the old data set minimum-scale example set and the input training set at each stage. In the network learning process, unknown class samples are continuously learned, and accurate detection of all defect classes is realized. When learning unknown defect categories, the network effectively reduces the catastrophic forgetting of the known defect categories. And the updates required for the detection of known defect categories are implemented for new manifestations in which known defect categories may occur.
According to the invention, for the occurrence of an unknown class, in order to balance the stability and plasticity of a model, a new feature extractor is created when the unknown class occurs, and the parameters of the old feature extractor are used for initialization, so that the knowledge of the known class is reserved, and the characteristics of the unknown class can be effectively learned. The auxiliary classifier construction is then used to train the newly constructed feature extractor. The feature fusion device effectively solves the problem of confusion possibly generated between the known category and the unknown category in the incremental learning.
In addition, the present invention not only constructs an exemplar set when unknown classes appear, but also considers known classes with new feature values, dynamically constructing the exemplar set of the unknown classes and adjusting the exemplar set of the known classes when unknown classes appear and known classes change. The invention also provides feature extractor pruning, which achieves a large reduction in parameters while maintaining the performance of the model as far as possible, reducing the storage and calculation requirements of the model and improving its practicability.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail with reference to the drawings.
Drawings
For a clearer description of embodiments of the invention or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained from them without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of steps of a method for continuous optimization of defect detection of a solar cell in a preferred embodiment of the invention;
FIG. 2 is a flow chart of updating of an unknown class example set and a known class example set in a preferred embodiment of the present invention.
Detailed Description
The invention provides a continuous optimization method for solar cell defect detection based on incremental learning. In the prior art, intelligent defect detection of solar cells still adopts the traditional offline learning mode, which necessarily relies on a complete data set in the training stage; only a single model can be trained from scratch, and the training is time-consuming. The offline learning mode therefore cannot adapt to data changes in time when learning continuously arriving new data and continuously optimizing the model, so when it is applied to the current large-scale, high-dimensional and continuously changing data set processing scenario, it faces problems such as update lag and high consumption of computational resources.
According to the incremental learning-based continuous optimization method for solar cell defect detection disclosed by the invention, the network learning process can be started with an initial data set, and the storage and calculation loads are effectively reduced because only a small-scale exemplar set of the old data set and the input training set need to be stored at each stage. In the network learning process, unknown class samples are continuously learned, and accurate detection of all defect classes is realized. When learning unknown defect categories, the network effectively reduces catastrophic forgetting of the known defect categories, and the updates required for detecting known defect categories are carried out for the new manifestations that known defect categories may exhibit.
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Examples:
referring to fig. 1, fig. 1 is a flowchart illustrating steps of a method for continuously optimizing defect detection of a solar cell according to a preferred embodiment of the present invention, wherein steps S1 to S8 are updating models based on features in old models and known class data, and step S9 is continuously updating the old models, so as to implement a model continuously optimizing updated defect detection network of the solar cell.
Referring to fig. 1, in an embodiment of the present invention, a method for continuously optimizing defect detection of a solar cell includes the steps of:
step S1, defect data is continuously input:
inputting an exemplar set of solar cell data and defect data, wherein the exemplar set is a subset of the old data set (the old data set is the defect data used for training the old model), the defect data comprise unknown-class data and known-class data containing new feature values, and the union of the exemplar set and the defect data is the training set; staged optimized detection is performed on the continuously input defect data $D_t$, $t=1,2,\ldots$; when $t=1$, the model to be updated is the basic neural network model $M_0$, which is obtained by training on the base-class training set $D_0$.
Step S2, expanding a feature extractor:
constructing a new feature extractor and initializing it with the parameters of the existing feature extractor to obtain the newly added feature extractor. The feature extractor is built on a multi-scale residual network (Inception-ResNet), which is state of the art and is not described here in detail. In this embodiment, to avoid catastrophic forgetting of old knowledge and to preserve the knowledge that has already been learned, the parameters of all current feature extractors $F_1,\ldots,F_{t-1}$ are frozen. Then, in order to adapt more effectively to the feature distribution of the unknown classes and thereby detect the unknown defect classes more accurately, a new feature extractor $F_t$ is constructed, and the parameters of the existing feature extractor are copied to initialize $F_t$.
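As an illustration of this step, a minimal sketch of the feature-extractor expansion is given below, assuming PyTorch; the container type and the function name are illustrative assumptions and not taken from the patent.

```python
# Sketch of step S2 (feature-extractor expansion), assuming PyTorch.
import copy
import torch.nn as nn

def expand_feature_extractors(old_extractors: nn.ModuleList) -> nn.Module:
    # Freeze all existing extractors to preserve the knowledge of known classes.
    for extractor in old_extractors:
        for p in extractor.parameters():
            p.requires_grad = False
    # The new extractor starts as a copy of the most recent one, so it is
    # initialized from the old parameters and can adapt to the unknown classes.
    new_extractor = copy.deepcopy(old_extractors[-1])
    for p in new_extractor.parameters():
        p.requires_grad = True
    return new_extractor
```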
Step S3, constructing an auxiliary classifier:
constructing an auxiliary classifier, treating all known class data as one class, and calculating the auxiliary loss on the new features; the auxiliary classifier performs prediction classification using the probability

$$p_a(y\mid x)=\operatorname{softmax}\bigl(\phi_a\bigl(u(x)\bigr)\bigr),$$

wherein $\operatorname{softmax}(\cdot)$ represents the softmax function; $\phi_a$ represents the auxiliary classifier, whose label space is $\{0\}\cup Y_u$, $Y_u$ representing the set of labels of the unknown categories (all known categories are treated as the single label $0$); and $u(x)$ represents the new feature extracted by the newly added feature extractor;

the auxiliary loss on the new features $\mathcal{L}_a$ is expressed as follows:

$$\mathcal{L}_a=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p_a(y\mid x),$$

wherein $\tilde{D}_t$ represents the training set; $x$ represents an image; $y$ represents the corresponding category label (mapped into the auxiliary label space).
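The following sketch illustrates how such an auxiliary classifier and its loss could be implemented, assuming PyTorch; the label mapping (known classes to index 0, unknown classes to 1..n) follows the description above, while the names and the `unknown_label_offset` parameter are assumptions.

```python
# Minimal sketch of the auxiliary classifier of step S3, assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxiliaryClassifier(nn.Module):
    def __init__(self, feat_dim: int, num_unknown: int):
        super().__init__()
        # One output for the collapsed "known" class plus one per unknown class.
        self.fc = nn.Linear(feat_dim, 1 + num_unknown)

    def forward(self, new_features: torch.Tensor) -> torch.Tensor:
        return self.fc(new_features)  # logits; softmax is applied inside the loss

def auxiliary_loss(logits: torch.Tensor, labels: torch.Tensor,
                   unknown_label_offset: int) -> torch.Tensor:
    # Map original labels to the auxiliary label space: known classes -> 0,
    # unknown classes -> 1..num_unknown. `unknown_label_offset` is the first
    # unknown-class index in the original label space (an assumption here).
    aux_labels = torch.where(labels >= unknown_label_offset,
                             labels - unknown_label_offset + 1,
                             torch.zeros_like(labels))
    return F.cross_entropy(logits, aux_labels)
```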
Step S4, constructing a feature fusion device:
extracting features of the solar cell image at multiple scales through the newly added feature extractor, and connecting all the extracted features to obtain an intermediate feature map; based on the channel and spatial attention mechanisms, the feature fusion device takes the intermediate feature map as input, sequentially infers a one-dimensional channel attention map and a two-dimensional spatial attention map, and outputs the refined feature map;
in this embodiment, the channel and spatial attention mechanisms are specifically a channel attention sub-network and a spatial attention sub-network, and the specific process of obtaining the refined feature map through the feature fusion device is as follows:

the channel attention sub-network compresses the height $H$ and the width $W$ of the intermediate feature map $U$ of dimension $C\times H\times W$ through $\operatorname{GAP}$ (global average pooling) and $\operatorname{GMP}$ (global max pooling), where $C$ represents the number of channels of the feature map, $H$ its height and $W$ its width, generating two feature vectors $z_{avg}$ and $z_{max}$, both of dimension $C\times1\times1$; the two feature vectors are connected to obtain a $2C\times1\times1$ vector, which is processed by two full convolution layers as follows:

$$w_c=\sigma\bigl(f_2\bigl(f_1\bigl([z_{avg},z_{max}]\bigr)\bigr)\bigr),$$

$$U_c=w_c\otimes U,$$

wherein $w_c$ represents the weight of each channel; $\sigma$ represents the activation function; the first full convolution layer $f_1$ has $C/r$ output channels, with $r$ representing the reduction ratio; the second full convolution layer $f_2$ restores the number of output channels to $C$; and $U_c$ represents the channel features;

the spatial attention sub-network takes the channel features as input and applies the average and maximum operations along the channel axis to generate two feature maps, expressed as follows:

$$A_{avg}=\operatorname{Avg}(U_c),$$

$$A_{max}=\operatorname{Max}(U_c),$$

wherein $\operatorname{Avg}(\cdot)$ and $\operatorname{Max}(\cdot)$ represent the element average and maximum operations along the channel axis, respectively;

based on $A_{avg}$ and $A_{max}$, a spatial attention map $A_s$ of dimension $1\times H\times W$ is obtained, expressed as follows:

$$A_s=\sigma\bigl(f_s\bigl([A_{avg},A_{max}]\bigr)\bigr),$$

wherein $f_s$ represents a convolution layer;

the spatial attention map is multiplied by the channel features to obtain the refined feature map, expressed as follows:

$$U_s=A_s\otimes U_c,$$

wherein $U_s$ represents the spatially refined features, i.e. the refined feature map.
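A possible implementation of this channel-then-spatial attention fusion, in the spirit of CBAM, is sketched below under PyTorch; the reduction ratio, the 7x7 spatial convolution and the module name are assumptions rather than values taken from the patent.

```python
# Sketch of the feature-fusion device of step S4 (channel attention followed
# by spatial attention), assuming PyTorch.
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: GAP and GMP vectors are concatenated (2C x 1 x 1)
        # and passed through two 1x1 convolutions with a sigmoid at the end.
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid())
        # Spatial attention: channel-wise mean and max maps are concatenated
        # (2 x H x W) and convolved down to a single-channel attention map.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid())

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        gap = torch.mean(u, dim=(2, 3), keepdim=True)           # C x 1 x 1
        gmp = torch.amax(u, dim=(2, 3), keepdim=True)           # C x 1 x 1
        channel_weights = self.channel_mlp(torch.cat([gap, gmp], dim=1))
        u_c = channel_weights * u                                # channel-refined features
        avg_map = torch.mean(u_c, dim=1, keepdim=True)           # 1 x H x W
        max_map, _ = torch.max(u_c, dim=1, keepdim=True)         # 1 x H x W
        spatial_map = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return spatial_map * u_c                                 # refined feature map
```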
Step S5, updating a classifier:
copying the parameters for the old features from the classifier of the old model to construct the classifier of the current stage, adding corresponding fully-connected layer nodes according to the number of unknown defect categories in the current stage, randomly initializing the new parameters, sending the refined feature map to the current-stage classifier to obtain the prediction probability, and calculating the cross entropy loss;

specifically, the prediction probability $p(y\mid x)$ is expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

wherein $\phi$ represents the classifier;

the cross entropy loss $\mathcal{L}_{ce}$ is expressed as follows:

$$\mathcal{L}_{ce}=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p(y\mid x).$$
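A minimal sketch of the classifier update, assuming PyTorch: the weights learned for the known classes are copied into an enlarged fully-connected layer, while the rows for the new classes keep their random initialization. The function and variable names are illustrative.

```python
# Sketch of the classifier expansion of step S5, assuming PyTorch.
import torch
import torch.nn as nn

def expand_classifier(old_fc: nn.Linear, num_new_classes: int) -> nn.Linear:
    in_dim, num_old = old_fc.in_features, old_fc.out_features
    new_fc = nn.Linear(in_dim, num_old + num_new_classes)
    with torch.no_grad():
        # Copy the weights and biases learned for the known classes; the rows
        # appended for the unknown classes keep their random initialization.
        new_fc.weight[:num_old].copy_(old_fc.weight)
        new_fc.bias[:num_old].copy_(old_fc.bias)
    return new_fc
```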
step S6, performing incremental training by using a random gradient descent method:
performing incremental training on the old model based on a training set, learning an unknown class sample and a known class sample containing a new characteristic value by using a random gradient descent method according to the global loss containing classification loss and auxiliary loss, and converting the unknown class sample into the known class sample to obtain an updated new model;
specifically, in step S6, the global loss $\mathcal{L}$ is expressed as follows:

$$\mathcal{L}=\mathcal{L}_{ce}+\lambda\,\mathcal{L}_a,$$

wherein $\lambda$ represents the hyper-parameter controlling the effect of the auxiliary classifier; for the initial model, i.e. when $t=1$, $\lambda=0$.
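The incremental training step could then look roughly as follows (PyTorch assumed). The model interface returning both classifier logits and the new features, the data loader and the `auxiliary_loss` helper from the earlier sketch are assumptions for illustration.

```python
# Sketch of one incremental-training epoch of step S6 using stochastic gradient
# descent on the global loss L = L_ce + lambda * L_aux.
import torch
import torch.nn.functional as F

def train_one_epoch(model, aux_head, loader, optimizer, lam: float,
                    unknown_label_offset: int):
    model.train()
    for images, labels in loader:
        logits, new_features = model(images)      # classifier logits + new features
        ce_loss = F.cross_entropy(logits, labels)
        aux_logits = aux_head(new_features)
        aux_loss = auxiliary_loss(aux_logits, labels, unknown_label_offset)
        loss = ce_loss + lam * aux_loss            # lam = 0 for the initial model
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```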
step S7, the unknown defect class example set construction and the known defect class example set adjustment:
training unknown defect type data and known defect type data based on the new model to obtain an unknown type example set and a known type example set with new characteristics, wherein the unknown type example set, the known type example set and the example set in the step S1 form a new example set for incremental training of the new model;
specifically, as shown in FIG. 2, the exemplar sets are updated as follows:

construction of the unknown-class exemplar set $P_u$: the unknown-class data set $D_u$ is randomly sampled to obtain the initial unknown-class exemplar set $P_u^{0}$; the model $M_t$ is used to initialize an intermediate model $M'$, which is trained on $P_u^{0}$ for several iterations by stochastic gradient descent; the iterative process is as follows:

$$M'\leftarrow M'-\alpha\,\nabla_{M'}\mathcal{L}_{ce}\bigl(P_u^{0};M'\bigr),$$

wherein $\alpha$ represents the learning rate of the fine tuning of $M'$;

the validation loss on $D_u$ is then calculated and back-propagated to optimize $P_u$, obtaining the final unknown-class exemplar set $P_u$; the propagation process is as follows:

$$P_u\leftarrow P_u-\beta\,\nabla_{P_u}\mathcal{L}_{ce}\bigl(D_u;M'\bigr),$$

wherein $\beta$ represents the learning rate of the unknown-class exemplar set propagation process;

adjustment of the known-class exemplar set $P_k$: the known-class data set $D_k$ of the new model is randomly sampled to obtain $P_k^{0}$; next, $M_t$ is used to initialize an intermediate model $M''$, which is trained on $P_k^{0}$ for several iterations by stochastic gradient descent; the iterative process is as follows:

$$M''\leftarrow M''-\alpha\,\nabla_{M''}\mathcal{L}_{ce}\bigl(P_k^{0};M''\bigr),$$

the validation loss on $D_k$ is then calculated and back-propagated to optimize $P_k$, obtaining the final known-class exemplar set with new feature values $P_k$; the propagation process is as follows:

$$P_k\leftarrow P_k-\beta'\,\nabla_{P_k}\mathcal{L}_{ce}\bigl(D_k;M''\bigr),$$

wherein $\beta'$ represents the learning rate of the known-class exemplar set propagation process.
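A heavily simplified, single-step sketch of this bi-level exemplar update is given below, assuming PyTorch: one differentiable fine-tuning step of a linear classifier on the sampled exemplars, followed by back-propagation of the validation loss into the exemplar images. The patent describes several SGD iterations; the single step, the learning rates and all names here are assumptions.

```python
# Simplified sketch of the exemplar-set construction of step S7, assuming PyTorch.
import random
import torch
import torch.nn.functional as F

def build_exemplar_set(feature_fn, classifier_weight, images, labels,
                       exemplar_size: int, inner_lr=0.01, outer_lr=0.1):
    idx = random.sample(range(len(images)), exemplar_size)
    ex_images = images[idx].clone().requires_grad_(True)   # exemplars to optimize
    ex_labels = labels[idx]

    # Inner step: one differentiable update of an intermediate classifier copy.
    w = classifier_weight.detach().clone().requires_grad_(True)
    inner_loss = F.cross_entropy(feature_fn(ex_images) @ w.t(), ex_labels)
    (grad_w,) = torch.autograd.grad(inner_loss, w, create_graph=True)
    w_tuned = w - inner_lr * grad_w

    # Outer step: validation loss on the full class data, back-propagated to
    # the exemplar images through the inner update.
    val_loss = F.cross_entropy(feature_fn(images) @ w_tuned.t(), labels)
    (grad_ex,) = torch.autograd.grad(val_loss, ex_images)
    with torch.no_grad():
        ex_images -= outer_lr * grad_ex
    return ex_images.detach(), ex_labels
```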
Step S8, pruning by the feature extractor:
calculating the geometric center of each layer of the newly added feature extractor in the new model, pruning the convolution kernel of the layer based on the geometric center;
in particular, the geometric center $c^{(i)}$ of the $i$-th convolution layer is represented as follows:

$$c^{(i)}=\frac{1}{n_{i+1}}\sum_{j=1}^{n_{i+1}}W_j^{(i)},$$

$$W_j^{(i)}\in\mathbb{R}^{n_i\times k\times k},$$

wherein $W_j^{(i)}$ represents the $j$-th convolution kernel of the $i$-th convolution layer of the newly added feature extractor, of dimension $n_i\times k\times k$, and $n_i$ and $n_{i+1}$ respectively represent the number of input channels and the number of output channels of the $i$-th convolution layer;

when the geometric center $c^{(i)}$ of the $i$-th convolution layer is located at one of the convolution kernels of that layer, the following formula is satisfied:

$$\bigl\|W_{j^{*}}^{(i)}-c^{(i)}\bigr\|_{2}=0,$$

wherein $W_{j^{*}}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j^{*}}^{(i)}$ and $c^{(i)}$;

the convolution kernels requiring pruning in the $i$-th convolution layer are calculated by the following formula, thereby realizing pruning of the model:

$$W_{j'}^{(i)}=\underset{j\in[1,\,n_{i+1}]}{\arg\min}\,\bigl\|W_{j}^{(i)}-c^{(i)}\bigr\|_{2},$$

wherein $W_{j'}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as or close to the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j}^{(i)}$ and $c^{(i)}$.
It should be noted that, if the values of certain convolution kernels in a layer are close or equal to the geometric center of that layer, these convolution kernels can be represented by the other convolution kernels of the layer, so pruning them has little effect on the performance of the network. By pruning these convolution kernels, the storage requirements of the model are reduced and training and prediction are accelerated while the performance of the network is maintained.
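A sketch of this pruning criterion, assuming PyTorch: for each convolution layer the mean of the kernels (the geometric center) is computed, and the kernels closest to it are marked for pruning. The pruning ratio is an assumption for illustration.

```python
# Sketch of the geometric-center pruning of step S8, assuming PyTorch.
import torch
import torch.nn as nn

def kernels_to_prune(conv: nn.Conv2d, prune_ratio: float = 0.1):
    # Kernels as a matrix of shape (out_channels, in_channels * k * k).
    kernels = conv.weight.detach().flatten(1)
    center = kernels.mean(dim=0, keepdim=True)      # geometric center of the layer
    dists = torch.norm(kernels - center, dim=1)      # L2 distance to the center
    num_prune = max(1, int(prune_ratio * kernels.size(0)))
    return torch.argsort(dists)[:num_prune]          # indices of kernels to prune
```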
Step S9, realizing continuous optimization updating of the defect detection model:
continuously inputting unknown class defect data and known class defect data, repeating the steps S1 to S8, continuously updating the network to obtain a defect detection network which is globally unified for the unknown class and the known class, and realizing continuous optimization of defect detection.
Specifically, unknown-class and known-class defect data are continuously input, and, based on the old model, the steps of feature extractor expansion, auxiliary classifier construction, feature fusion device construction, classifier updating, incremental training using the stochastic gradient descent method, unknown defect class exemplar set construction and known defect class exemplar set adjustment, and feature extractor pruning are performed in sequence to obtain a defect detection network $N_t$ that is globally unified for the current unknown and known classes, thereby realizing continuous optimization of defect detection;

the specific detection process is as follows:

the defect detection network $N_t$ performs defect detection on every received solar cell image to be detected $x$. First, the features extracted by the feature extractors $F_1,\ldots,F_t$ are connected to obtain the intermediate feature map $U(x)$:

$$U(x)=\bigl[F_1(x),F_2(x),\ldots,F_t(x)\bigr].$$

Then, according to step S4, the intermediate feature map is transformed by the feature fusion device into the refined feature map $U_s(x)$:

$$U_c(x)=w_c\otimes U(x),$$

$$U_s(x)=A_s\otimes U_c(x).$$

The refined feature map $U_s(x)$ is sent to the classifier $\phi$, the defect category label of the solar cell is predicted as $\hat{y}$, and the defect category is obtained accordingly, expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

$$\hat{y}=\underset{y}{\arg\max}\;p(y\mid x).$$
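For illustration, the detection pass could be sketched as follows under PyTorch; the global average pooling before the classifier and the module names follow the earlier sketches and are assumptions.

```python
# Sketch of the detection process of step S9, assuming PyTorch.
import torch

@torch.no_grad()
def detect(image, extractors, fusion, classifier):
    # image: a batch of solar-cell images to be detected, shape (N, C, H, W).
    features = torch.cat([f(image) for f in extractors], dim=1)  # intermediate feature map
    refined = fusion(features)                                    # refined feature map
    logits = classifier(refined.mean(dim=(2, 3)))                 # pool, then fully connected
    probs = torch.softmax(logits, dim=1)
    return probs.argmax(dim=1)                                    # predicted defect category
```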
the above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. The continuous optimization method for detecting the defects of the solar cell based on incremental learning is characterized by comprising the following steps of:
step S1, defect data is continuously input:
inputting an example set and defect data, wherein the example set is a subset of an old data set, the defect data comprises unknown class data and known class data containing new characteristic values, and a union set of the example set and the defect data is a training set;
step S2, expanding a feature extractor:
constructing a new feature extractor, and initializing the new feature extractor by using parameters of the feature extractor in the old model to obtain a new feature extractor;
step S3, constructing an auxiliary classifier:
constructing an auxiliary classifier, treating all known class data as a class, and calculating auxiliary loss of related new features;
step S4, constructing a feature fusion device:
extracting features of the solar cell image at multiple scales through the newly added feature extractor, connecting all the extracted features to obtain an intermediate feature map, and refining the intermediate feature map based on the channel and spatial attention mechanisms: the feature fusion device takes the intermediate feature map as input, sequentially infers a one-dimensional channel attention map and a two-dimensional spatial attention map, and outputs the refined feature map;
step S5, updating a classifier:
copying parameters for old features from old model classifiers to construct a current stage classifier, adding corresponding full-connection layer nodes according to the number of unknown defect categories in the current stage, randomly initializing new parameters, sending a refined feature map to the current stage classifier to obtain prediction probability, and calculating cross entropy loss;
step S6, performing incremental training by using a random gradient descent method:
performing incremental training on the old model based on a training set, learning an unknown class sample and a known class sample containing a new characteristic value by using a random gradient descent method according to the global loss containing classification loss and auxiliary loss, and converting the unknown class sample into the known class sample to obtain an updated new model;
step S7, the unknown defect class example set construction and the known defect class example set adjustment:
training unknown defect type data and known defect type data based on a new model to obtain an unknown type example set and a known type example set with new characteristics, wherein the unknown type example set, the known type example set and the example set in the step S1 form a new example set for incremental training of the next stage;
step S8, pruning by the feature extractor:
calculating the geometric center of each layer of the newly added feature extractor in the new model, pruning the convolution kernel of the layer based on the geometric center;
step S9, realizing continuous optimization updating of the defect detection model:
continuously inputting unknown class defect data and known class defect data, repeating the steps S1 to S8, continuously updating the old model to obtain a defect detection network which is globally unified for the unknown class and the known class, and realizing continuous optimization of defect detection.
2. The continuous optimization method for detecting defects of a solar cell according to claim 1, wherein in step S1, staged optimized detection is performed on the continuously input defect data $D_t$, $t=1,2,\ldots$; when $t=1$, the model to be updated is the basic neural network model $M_0$, which is obtained by training on the base-class training set $D_0$.
3. The continuous optimization method for solar cell defect detection according to claim 2, wherein in step S2, the feature extractor is built based on a multi-scale residual network.
4. The continuous optimization method for solar cell defect detection according to claim 3, wherein in step S3, the auxiliary classifier performs prediction classification using the probability

$$p_a(y\mid x)=\operatorname{softmax}\bigl(\phi_a\bigl(u(x)\bigr)\bigr),$$

wherein $\operatorname{softmax}(\cdot)$ represents the softmax function; $\phi_a$ represents the auxiliary classifier, whose label space is $\{0\}\cup Y_u$, $Y_u$ representing the set of labels of the unknown categories (all known categories are treated as the single label $0$); and $u(x)$ represents the new feature;

the auxiliary loss on the new features $\mathcal{L}_a$ is expressed as follows:

$$\mathcal{L}_a=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p_a(y\mid x),$$

wherein $\tilde{D}_t$ represents the training set; $x$ represents an image; $y$ represents the corresponding category label (mapped into the auxiliary label space).
5. The continuous optimization method for solar cell defect detection according to claim 4, wherein in step S4, the channel and spatial attention mechanisms are specifically a channel attention sub-network and a spatial attention sub-network, and the specific process of obtaining the refined feature map through the feature fusion device is as follows:

the channel attention sub-network compresses the height $H$ and the width $W$ of the intermediate feature map $U$ of dimension $C\times H\times W$ through $\operatorname{GAP}$ and $\operatorname{GMP}$, generating two feature vectors $z_{avg}$ and $z_{max}$, both of dimension $C\times1\times1$; the two feature vectors are connected to obtain a $2C\times1\times1$ vector, which is processed by two full convolution layers as follows:

$$w_c=\sigma\bigl(f_2\bigl(f_1\bigl([z_{avg},z_{max}]\bigr)\bigr)\bigr),$$

$$U_c=w_c\otimes U,$$

wherein $w_c$ represents the weight of each channel; $\sigma$ represents the activation function; the first full convolution layer $f_1$ has $C/r$ output channels, with $r$ representing the reduction ratio; the second full convolution layer $f_2$ restores the number of output channels to $C$; and $U_c$ represents the channel features;

the spatial attention sub-network takes the channel features as input and applies the average and maximum operations along the channel axis to generate two feature maps, expressed as follows:

$$A_{avg}=\operatorname{Avg}(U_c),$$

$$A_{max}=\operatorname{Max}(U_c),$$

wherein $\operatorname{Avg}(\cdot)$ and $\operatorname{Max}(\cdot)$ represent the element average and maximum operations along the channel axis, respectively;

based on $A_{avg}$ and $A_{max}$, a spatial attention map $A_s$ of dimension $1\times H\times W$ is obtained, expressed as follows:

$$A_s=\sigma\bigl(f_s\bigl([A_{avg},A_{max}]\bigr)\bigr),$$

wherein $f_s$ represents a convolution layer;

the spatial attention map is multiplied by the channel features to obtain the refined feature map, expressed as follows:

$$U_s=A_s\otimes U_c,$$

wherein $U_s$ represents the spatially refined features, i.e. the refined feature map.
6. The continuous optimization method for solar cell defect detection according to claim 5, wherein in step S5, the prediction probability $p(y\mid x)$ is expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

wherein $\phi$ represents the classifier of the current stage;

the cross entropy loss $\mathcal{L}_{ce}$ is expressed as follows:

$$\mathcal{L}_{ce}=-\frac{1}{|\tilde{D}_t|}\sum_{(x,y)\in\tilde{D}_t}\log p(y\mid x),$$

wherein $\tilde{D}_t$ represents the training set; $x$ represents an image; $y$ represents the corresponding category label.
7. The continuous optimization method for solar cell defect detection according to claim 6, wherein in step S6, the global loss $\mathcal{L}$ is expressed as follows:

$$\mathcal{L}=\mathcal{L}_{ce}+\lambda\,\mathcal{L}_a,$$

wherein $\lambda$ represents the hyper-parameter controlling the effect of the auxiliary classifier; for the initial model, i.e. when $t=1$, $\lambda=0$.
8. the method according to claim 7, wherein in step S7, the exemplary set update method is as follows:
building an unknown class exemplar setFor unknown class data set->Random sampling to obtain an initial unknown class exemplar set +.>Use of the model->Initializing intermediate model->And at->Upper training->Performing several iterations by random gradient descent to obtain +.>The iterative process is as follows:
;
wherein,representing fine tuning->Is->Representing cross entropy loss, < >>Representation->Is a gradient of (2);
calculation ofUpper->And back-propagating the validation loss to optimize +.>Obtaining a final sample set of unknown classes +.>The propagation process is as follows:
;
wherein,learning rate representing unknown class exemplar set propagation process, < >>Representing cross entropy loss, < >>Representation->Is a gradient of (2);
adjusting a set of known class exemplarsFor the known class dataset +.>Randomly sampling to obtain->The method comprises the steps of carrying out a first treatment on the surface of the Second, use +.>Initializing intermediate model->And at->Upper training->Performing several iterations by random gradient descent to obtain +.>The iterative process is as follows:
;
calculation ofUpper->And back-propagating the validation loss to optimize +.>Obtaining the final set of known class exemplars with new feature values +.>The propagation process is as follows:
;
wherein,representing the learning rate of the known sample set propagation process.
9. The continuous optimization method for solar cell defect detection according to claim 8, wherein in step S8, the geometric center $c^{(i)}$ of the $i$-th convolution layer is represented as follows:

$$c^{(i)}=\frac{1}{n_{i+1}}\sum_{j=1}^{n_{i+1}}W_j^{(i)},$$

$$W_j^{(i)}\in\mathbb{R}^{n_i\times k\times k},$$

wherein $W_j^{(i)}$ represents the $j$-th convolution kernel of the $i$-th convolution layer of the newly added feature extractor, of dimension $n_i\times k\times k$, and $n_i$ and $n_{i+1}$ respectively represent the number of input channels and the number of output channels of the $i$-th convolution layer;

when the geometric center $c^{(i)}$ of the $i$-th convolution layer is located at one of the convolution kernels of that layer, the following formula is satisfied:

$$\bigl\|W_{j^{*}}^{(i)}-c^{(i)}\bigr\|_{2}=0,$$

wherein $W_{j^{*}}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j^{*}}^{(i)}$ and $c^{(i)}$;

the convolution kernels requiring pruning in the $i$-th convolution layer are calculated by the following formula, thereby realizing model pruning:

$$W_{j'}^{(i)}=\underset{j\in[1,\,n_{i+1}]}{\arg\min}\,\bigl\|W_{j}^{(i)}-c^{(i)}\bigr\|_{2},$$

wherein $W_{j'}^{(i)}$ represents a convolution kernel of the $i$-th convolution layer whose value is the same as or close to the geometric center of that layer, i.e. a kernel requiring pruning, and $\|\cdot\|_2$ represents the second norm of the difference between $W_{j}^{(i)}$ and $c^{(i)}$.
10. The continuous optimization method for solar cell defect detection according to claim 9, wherein in step S9, unknown-class and known-class defect data are continuously input, and, based on the old model, the steps of feature extractor expansion, auxiliary classifier construction, feature fusion device construction, classifier updating, incremental training using the stochastic gradient descent method, unknown defect class exemplar set construction and known defect class exemplar set adjustment, and feature extractor pruning are performed in sequence to obtain a defect detection network $N_t$ that is globally unified for the unknown and known classes, realizing the continuous optimization of defect detection;

the specific detection process is as follows:

the defect detection network $N_t$ performs defect detection on every received solar cell image to be detected $x$; first, the features extracted by the feature extractors $F_1,\ldots,F_t$ are connected to obtain the intermediate feature map $U(x)$:

$$U(x)=\bigl[F_1(x),F_2(x),\ldots,F_t(x)\bigr];$$

then, according to step S4, the intermediate feature map is transformed by the feature fusion device into the refined feature map $U_s(x)$:

$$U_c(x)=w_c\otimes U(x),$$

$$U_s(x)=A_s\otimes U_c(x);$$

the refined feature map $U_s(x)$ is sent to the classifier $\phi$, the defect category label of the solar cell is predicted as $\hat{y}$, and the defect category is obtained accordingly, expressed as follows:

$$p(y\mid x)=\operatorname{softmax}\bigl(\phi\bigl(U_s(x)\bigr)\bigr),$$

$$\hat{y}=\underset{y}{\arg\max}\;p(y\mid x).$$
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311584711.3A CN117314892B (en) | 2023-11-27 | 2023-11-27 | Incremental learning-based continuous optimization method for solar cell defect detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311584711.3A CN117314892B (en) | 2023-11-27 | 2023-11-27 | Incremental learning-based continuous optimization method for solar cell defect detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117314892A true CN117314892A (en) | 2023-12-29 |
CN117314892B CN117314892B (en) | 2024-02-13 |
Family
ID=89285065
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311584711.3A Active CN117314892B (en) | 2023-11-27 | 2023-11-27 | Incremental learning-based continuous optimization method for solar cell defect detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117314892B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114612721A (en) * | 2022-03-15 | 2022-06-10 | 南京大学 | Image classification method based on multilevel adaptive feature fusion type increment learning |
US20220222534A1 (en) * | 2019-05-23 | 2022-07-14 | The Trustees Of Princeton University | System and method for incremental learning using a grow-and-prune paradigm with neural networks |
CN114821852A (en) * | 2022-06-07 | 2022-07-29 | 国网安徽省电力有限公司宣城供电公司 | Power grid defect depth identification inspection robot control system based on characteristic pyramid |
CN115908340A (en) * | 2022-11-29 | 2023-04-04 | 中国科学技术大学 | Printed circuit board defect detection method and equipment based on incremental learning |
CN116468948A (en) * | 2023-04-23 | 2023-07-21 | 武汉大学 | Incremental learning detection method and system for supporting detection of unknown urban garbage |
CN116543357A (en) * | 2023-07-05 | 2023-08-04 | 南通玖方新材料科技有限公司 | Surface defect identification and early warning method and system for solar photovoltaic carrier |
CN116630285A (en) * | 2023-05-31 | 2023-08-22 | 河北工业大学 | Photovoltaic cell type incremental defect detection method based on significance characteristic hierarchical distillation |
CN116630277A (en) * | 2023-05-29 | 2023-08-22 | 清华大学 | PCB defect detection method and device based on continuous learning |
CN116797928A (en) * | 2023-06-21 | 2023-09-22 | 西安电子科技大学 | SAR target increment classification method based on stability and plasticity of balance model |
CN116843620A (en) * | 2023-05-29 | 2023-10-03 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Printed circuit board defect detection method based on non-example type increment learning |
-
2023
- 2023-11-27 CN CN202311584711.3A patent/CN117314892B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220222534A1 (en) * | 2019-05-23 | 2022-07-14 | The Trustees Of Princeton University | System and method for incremental learning using a grow-and-prune paradigm with neural networks |
CN114612721A (en) * | 2022-03-15 | 2022-06-10 | 南京大学 | Image classification method based on multilevel adaptive feature fusion type increment learning |
CN114821852A (en) * | 2022-06-07 | 2022-07-29 | 国网安徽省电力有限公司宣城供电公司 | Power grid defect depth identification inspection robot control system based on characteristic pyramid |
CN115908340A (en) * | 2022-11-29 | 2023-04-04 | 中国科学技术大学 | Printed circuit board defect detection method and equipment based on incremental learning |
CN116468948A (en) * | 2023-04-23 | 2023-07-21 | 武汉大学 | Incremental learning detection method and system for supporting detection of unknown urban garbage |
CN116630277A (en) * | 2023-05-29 | 2023-08-22 | 清华大学 | PCB defect detection method and device based on continuous learning |
CN116843620A (en) * | 2023-05-29 | 2023-10-03 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Printed circuit board defect detection method based on non-example type increment learning |
CN116630285A (en) * | 2023-05-31 | 2023-08-22 | 河北工业大学 | Photovoltaic cell type incremental defect detection method based on significance characteristic hierarchical distillation |
CN116797928A (en) * | 2023-06-21 | 2023-09-22 | 西安电子科技大学 | SAR target increment classification method based on stability and plasticity of balance model |
CN116543357A (en) * | 2023-07-05 | 2023-08-04 | 南通玖方新材料科技有限公司 | Surface defect identification and early warning method and system for solar photovoltaic carrier |
Non-Patent Citations (3)
Title |
---|
BINYI SU et al.: "Deep Learning-Based Solar-Cell Manufacturing Defect Detection With Complementary Attention Network", IEEE *
MU YE et al.: "Surface defects inspection of cylindrical metal workpieces based on weakly supervised learning", RESEARCH SQUARE *
XIE Zhao et al.: "Incremental learning method for scene features in independent subspaces", Journal of Computer Research and Development, no. 11 *
Also Published As
Publication number | Publication date |
---|---|
CN117314892B (en) | 2024-02-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110472817B (en) | XGboost integrated credit evaluation system and method combined with deep neural network | |
CN108596053B (en) | Vehicle detection method and system based on SSD and vehicle posture classification | |
CN110288030B (en) | Image identification method, device and equipment based on lightweight network model | |
CN107194336B (en) | Polarized SAR image classification method based on semi-supervised depth distance measurement network | |
CN111274903B (en) | Cervical cell image classification method based on graph convolution neural network | |
CN112101430B (en) | Anchor frame generation method for image target detection processing and lightweight target detection method | |
CN107330355B (en) | Deep pedestrian re-identification method based on positive sample balance constraint | |
CN108090472B (en) | Pedestrian re-identification method and system based on multi-channel consistency characteristics | |
CN112699953B (en) | Feature pyramid neural network architecture searching method based on multi-information path aggregation | |
CN109376787B (en) | Manifold learning network and computer vision image set classification method based on manifold learning network | |
CN104732545A (en) | Texture image segmentation method combined with sparse neighbor propagation and rapid spectral clustering | |
CN111598167B (en) | Small sample image identification method and system based on graph learning | |
CN113298009B (en) | Entropy regularization-based self-adaptive adjacent face image clustering method | |
CN108536844B (en) | Text-enhanced network representation learning method | |
CN111695011B (en) | Tensor expression-based dynamic hypergraph structure learning classification method and system | |
CN111192240B (en) | Remote sensing image target detection method based on random access memory | |
CN114330516A (en) | Small sample logo image classification based on multi-graph guided neural network model | |
CN114299362A (en) | Small sample image classification method based on k-means clustering | |
CN114373092A (en) | Progressive training fine-grained vision classification method based on jigsaw arrangement learning | |
CN104573728B (en) | A kind of texture classifying method based on ExtremeLearningMachine | |
CN117993282A (en) | Domain adaptability information bottleneck federal learning method for intelligent manufacturing fault diagnosis | |
CN117314892B (en) | Incremental learning-based continuous optimization method for solar cell defect detection | |
CN114677547B (en) | Image classification method based on self-holding characterization expansion type incremental learning | |
CN116758349A (en) | Hyperspectral image classification method based on multi-scale super-pixel node aggregate graph convolution residual network | |
CN116433980A (en) | Image classification method, device, equipment and medium of impulse neural network structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||