CN114494254A - Product appearance defect classification method based on fusion of GLCM and CNN-Transformer and storage medium

Info

Publication number
CN114494254A
Authority
CN
China
Prior art keywords
glcm
fusion
cnn
gray level
transformer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210388135.4A
Other languages
Chinese (zh)
Other versions
CN114494254B (en)
Inventor
岳晨
黄鑫
裴孝怀
钟智敏
刘伟
王筱圃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hkust Intelligent Internet Of Things Technology Co ltd
Original Assignee
Hkust Intelligent Internet Of Things Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hkust Intelligent Internet Of Things Technology Co ltd
Priority to CN202210388135.4A
Publication of CN114494254A
Application granted
Publication of CN114494254B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0004 - Industrial image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/40 - Analysis of texture
    • G06T7/41 - Analysis of texture based on statistical description of texture
    • G06T7/45 - Analysis of texture based on statistical description of texture using co-occurrence matrix computation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Abstract

The invention discloses a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, and a storage medium. The method performs gray level dimension reduction on a preprocessed product sample image using an effective pixel probability distribution area equal division method, and obtains a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging. Three 1 × 1 convolution kernels are introduced as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM; features are then extracted with a multilayer residual CNN and fused with a Swin Transformer Block structure, and the statistic matrix undergoes secondary feature fusion with the fusion result of the multilayer residual CNN feature extraction, thereby constructing a product defect classification model that fuses GLCM with CNN-Transformer. The method combines the outstanding performance of the GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, and can better classify product appearances with obvious defect features.

Description

Product appearance defect classification method based on fusion of GLCM and CNN-Transformer and storage medium
Technical Field
The invention relates to the technical field of industrial inspection for textile chemical fibers, and in particular to a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, and a storage medium.
Background
At present, the detection of product appearance defects mainly relies on manual quality inspection, in which an inspector illuminates the product from multiple angles with a handheld flashlight. This detection mode is labor-intensive, easily influenced by subjective human factors, and inefficient; a more effective way of detecting product appearance defects is therefore needed.
Disclosure of Invention
The invention provides a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, which can solve the above technical problems.
To achieve this purpose, the invention adopts the following technical scheme:
A product appearance defect classification method based on the fusion of GLCM and CNN-Transformer comprises the following steps:
a sample preprocessing step, which comprises obtaining a product sample image, extracting an effective area and a maximum contour region mask image, and applying an AND operation to obtain a product sample preprocessing effect image that retains only the mask region;
a calculation step, which comprises performing gray level dimension reduction using an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging;
a fusion step, which comprises introducing three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performing feature extraction with a multilayer residual CNN, and fusing the feature extraction result with a Swin Transformer Block structure in a Patch Merging manner to obtain the fusion result of the gray level co-occurrence matrix GLCM; meanwhile, performing secondary feature fusion of the statistic matrix with the fusion result of the gray level co-occurrence matrix GLCM through concat connection, thereby constructing a product defect classification model fusing the gray level co-occurrence matrix GLCM with CNN-Transformer, and then classifying the product appearance image.
Further, the sample preprocessing step specifically includes:
S1.1, obtaining original image sample data of the product appearance;
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
S1.3, mapping the circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the effective region original image and the maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
Further, the calculation step specifically includes:
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix GLCM;
S2.1.3, inputting the gray level co-occurrence matrix of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
Further, the fusion step specifically includes:
S2.2.1, inputting the gray level co-occurrence matrix GLCM calculated in step S2.1.2 into the conversion module, which performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels;
S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module, and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module;
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module into the multilayer perceptron MLP module.
Further, the calculation method of the multi-channel regional gray level co-occurrence matrix in step S2.1.2 specifically includes:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices GLCM of all block images direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
Further, the effective pixel probability distribution area equal division method in S2.1.2.1 specifically includes:
filtering out the pixels with value (0,0,0) in the preprocessed sample image, keeping the remaining effective pixels, counting the probability distribution of the effective pixels over the 256 gray levels to obtain a probability distribution graph, dividing the probability distribution graph into 16 regions of equal area along the horizontal direction, and calculating each segmentation gray threshold. The calculation is as follows: let n_i denote the number of effective pixels at gray level i (i = 0, 1, …, 255) and T_j denote the j-th segmentation threshold, where j = 1, 2, …, 15 and T_j is an integer; then T_j is the smallest value satisfying the inequality

Σ_{i=0}^{T_j} n_i ≥ (j / 16) · Σ_{i=0}^{255} n_i,

and traversing j from 1 to 15 in turn yields the 15 segmentation gray thresholds.
Further, the calculation step of the multi-channel regional GLCM statistic calculation module in step S2.1.3 specifically includes:
according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, calculating 14 statistics of the GLCM of each block image in each RGB channel, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
Further, step S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module, specifically includes:
performing feature extraction with the multilayer residual CNN module, fusing the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and performing secondary feature fusion of the statistic matrix obtained in step S2.1.3 with the fusion result of the multilayer residual CNN feature extraction through concat connection;
the CNN fusion module comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels; the Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
Further, the product defect classification model can be reduced to the following two lightweight classification models, specifically:
a. only the fusion branch of the gray level co-occurrence matrix GLCM is kept, and the concat connection between the statistic matrix and the backbone of the gray level co-occurrence matrix fusion structure is removed; the fusion step is then to process the input sequentially through the gray level co-occurrence matrix module, the conversion module, the CNN fusion module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module;
b. only the fusion branch of the statistic matrix with the Transformer fusion module is kept, and the structural backbone in which the gray level co-occurrence matrix is fused by the CNN fusion module is removed; the fusion step is then to process the input sequentially through the statistic matrix module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
In another aspect, the present invention also discloses a computer readable storage medium storing a computer program, which when executed by a processor causes the processor to perform the steps of the method as described above.
According to the above technical scheme, the product appearance defect classification method based on the fusion of GLCM and CNN-Transformer performs gray level dimension reduction on the preprocessed product appearance sample image using the effective pixel probability distribution area equal division method, obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging, introduces three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performs feature extraction with a multilayer residual CNN, fuses it with a Swin Transformer Block structure, and performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer. The method combines the outstanding performance of the gray level co-occurrence matrix GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, and can better classify product appearances with obvious defect features.
Drawings
FIG. 1 is an overall block diagram of the method of the present invention;
FIG. 2a is an original image of a poorly formed spinning cake sample according to an embodiment of the present invention;
FIG. 2b is an original image of a typically formed spinning cake sample according to an embodiment of the present invention;
FIG. 3a is a preprocessing effect image of the poorly formed spinning cake sample according to an embodiment of the present invention;
FIG. 3b is a preprocessing effect image of the typically formed spinning cake sample according to an embodiment of the present invention;
FIG. 4 is a flow diagram of the calculation module of an embodiment of the invention;
FIG. 5 shows the sample of FIG. 3b divided into 14 × 14 block images according to an embodiment of the present invention;
FIG. 6 is a structural diagram of the Swin Transformer Block according to an embodiment of the present invention;
FIG. 7 is a flow diagram of the fusion module according to an embodiment of the present invention;
FIG. 8a is a flow diagram of a lightweight model derived from the fusion module of an embodiment of the present invention;
FIG. 8b is a flow diagram of another lightweight model derived from the fusion module of an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments; obviously, the described embodiments are some, but not all, of the embodiments of the present invention.
Poor forming is the most common type of spinning cake appearance defect. Its most prominent characteristic is an obvious abnormality of the running traces on the cake surface, which seriously affects cake quality and must be detected. Therefore, the product appearance defect classification method based on the fusion of GLCM and CNN-Transformer in this embodiment is embodied as a spinning cake appearance forming classification method based on the fusion of GLCM and CNN-Transformer, which replaces manual quality inspection with an automatic method for classifying spinning cake forming defects, and specifically includes the following steps:
as shown in fig. 1, the method for classifying appearance of spinning cake based on fusion of GLCM and CNN-Transformer in this embodiment includes,
a sample preprocessing step, wherein the sample preprocessing step comprises the steps of obtaining a spinning cake sample picture, extracting mask pictures of an effective area and a maximum outline area, and carrying out AND operation processing to obtain a spinning cake sample preprocessing effect picture only keeping the mask areas;
calculating, wherein the calculating step comprises the steps of carrying out gray level dimensionality reduction by adopting an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix by adopting a multi-channel regional calculation and combination method;
the fusion step comprises the steps of introducing 3 1 × 1 convolution kernels as conversion modules, performing dimensionality reduction on the gray level co-occurrence matrix GLCM, performing feature extraction by using multilayer residual errors CNN, and fusing a feature extraction result and a Swin transform Block structure in a Block Merging Patch Merging mode to obtain a fusion result of the gray level co-occurrence matrix GLCM; meanwhile, a concat connection mode is adopted, secondary feature fusion is carried out on a fusion result of the statistic matrix and the gray level co-occurrence matrix GLCM, so that a product defect classification model fused with the gray level co-occurrence matrix GLCM and the CNN-Transformer is constructed, and then the appearance pattern of the spinning cake is classified.
The following are specifically described:
s1, a sample preprocessing module: and (4) preprocessing the spinning cake sample image to obtain a sample preprocessing effect image only retaining the mask area.
S1.1, obtaining original image sample data of the right upper end face of a spinning cake;
as shown in fig. 2, a raw plot of a typical cake-formed sample versus an unformed sample is listed.
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
For this step, the following method can be employed: convert the image to grayscale, solve for a segmentation threshold using the maximum inter-class variance (Otsu) method and convert to a binary image, apply an opening operation to the binary image to remove isolated small noise points, extract the image contours, solve for the circumscribed rectangle of the maximum contour, and draw the mask image of the maximum contour region.
S1.3, mapping the solved circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the obtained effective region original image and maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
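The following is a minimal sketch of steps S1.2 to S1.4 using OpenCV, assuming a BGR input image; the 3 × 3 opening kernel and the absence of error handling for empty contour lists are simplifications, not part of the patent's specification.

    import cv2
    import numpy as np

    def preprocess_sample(img_bgr):
        """Steps S1.2-S1.4: mask the maximum contour region and keep only its pixels."""
        gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Maximum inter-class variance (Otsu) segmentation threshold -> binary image
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Opening removes isolated small noise points (kernel size is illustrative)
        binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
        # Extract contours, take the maximum one, and get its circumscribed rectangle
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)
        # Draw the maximum contour region mask image
        mask = np.zeros(gray.shape, np.uint8)
        cv2.drawContours(mask, [largest], -1, 255, thickness=cv2.FILLED)
        # Crop the effective region and AND it with the mask:
        # pixels outside the mask region become (0,0,0)
        roi = img_bgr[y:y + h, x:x + w]
        return cv2.bitwise_and(roi, roi, mask=mask[y:y + h, x:x + w])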
FIGS. 3a and 3b show the sample preprocessing effects finally obtained after processing the original images of FIGS. 2a and 2b, respectively, with the method adopted by the sample preprocessing module of the present invention.
S2, classification module: the classification module comprises a calculation module and a fusion module.
S2.1, calculation module: as shown in FIG. 4, the specific flow of the calculation module is to perform gray level dimension reduction using the effective pixel probability distribution area equal division method, and to obtain the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging.
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
As shown in FIGS. 3a and 3b, the preprocessed sample images are obtained by the sample preprocessing module.
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix;
The calculation method of the multi-channel regional gray level co-occurrence matrix GLCM specifically includes:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
the method for equally dividing the probability distribution area of the effective pixel specifically comprises the following steps: the method comprises the steps of filtering pixel points with pixel values of (0,0,0) in a preprocessed sample graph, reserving the rest effective pixel points, counting probability distribution of the effective pixel points on 256 gray levels, obtaining a probability distribution graph, dividing the probability distribution graph into 16 areas with equal areas according to the horizontal direction, and calculating each segmentation gray threshold; the calculation method is as follows, the number of effective pixel points on each gray level is assumed to be
Figure 615375DEST_PATH_IMAGE010
Each division threshold value is
Figure 678009DEST_PATH_IMAGE011
Wherein
Figure 240840DEST_PATH_IMAGE007
Then the following inequality is established
Figure 883174DEST_PATH_IMAGE012
J is an integer, and the inequality is satisfied in each cycle traversal stageIs/are as follows
Figure 221751DEST_PATH_IMAGE013
To find 15 division gray level thresholds.
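A compact sketch of this threshold search, assuming the channel image and a boolean validity mask (excluding the (0,0,0) background pixels) are available from the preprocessing step; equal_area_thresholds and quantize are illustrative helper names.

    import numpy as np

    def equal_area_thresholds(channel, valid_mask, levels=16):
        """Split the valid-pixel gray histogram into `levels` regions of equal
        cumulative count; returns the levels-1 segmentation thresholds T_j."""
        hist = np.bincount(channel[valid_mask].ravel(), minlength=256)  # n_i, i = 0..255
        cum = np.cumsum(hist)
        total = cum[-1]
        # smallest T_j with sum_{i<=T_j} n_i >= (j/levels) * total, for j = 1..15
        return [int(np.searchsorted(cum, j * total / levels)) for j in range(1, levels)]

    def quantize(channel, thresholds):
        """Map 256 gray levels down to 16 using the computed thresholds."""
        return np.digitize(channel, thresholds).astype(np.uint8)  # values 0..15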
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices of all block images direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
As shown in FIG. 5, the preprocessing effect image of the sample shown in FIG. 3b is divided into 14 × 14 block images; the gray level co-occurrence matrices GLCM of each block image are calculated in the 4 directions 0°, 45°, 90° and 135° and merged to obtain 12 gray level co-occurrence matrices GLCM of size 224 × 224.
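A sketch of the per-block GLCM calculation and merging, using scikit-image's graycomatrix (named greycomatrix in versions before 0.19); the tiling layout that places each block's 16 × 16 GLCM at its block position to form the 224 × 224 matrix (14 blocks × 16 levels = 224) is an inference from the stated sizes, not spelled out in the text.

    import numpy as np
    from skimage.feature import graycomatrix

    ANGLES = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]  # 0, 45, 90, 135 degrees

    def blockwise_glcm(channel_q, blocks=14, levels=16):
        """channel_q: one quantized channel (values 0..15). Returns 4 merged GLCMs,
        one per direction, each of size (14*16) x (14*16) = 224 x 224."""
        h, w = channel_q.shape
        bh, bw = h // blocks, w // blocks
        merged = np.zeros((len(ANGLES), blocks * levels, blocks * levels))
        for r in range(blocks):
            for c in range(blocks):
                block = channel_q[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
                # normalized 16x16 GLCMs at distance 1 in the 4 directions
                g = graycomatrix(block, distances=[1], angles=ANGLES,
                                 levels=levels, normed=True)  # shape (16,16,1,4)
                for d in range(len(ANGLES)):
                    merged[d, r * levels:(r + 1) * levels,
                              c * levels:(c + 1) * levels] = g[:, :, 0, d]
        return merged  # stacking the 3 RGB channels gives the 12 GLCMs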
S2.1.3, inputting the gray level co-occurrence matrix of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
The multi-channel regional GLCM statistic calculation module specifically works as follows: according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, 14 statistics of the GLCM of each block image in each RGB channel are calculated, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
As shown in fig. 4, a total of 42 statistical matrices of size 14 × 14 are finally obtained.
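As a sketch of the statistic calculation, the following computes a subset of the 14 Haralick-style statistics for one normalized GLCM; evaluating a statistic for every block of every channel yields one 14 × 14 statistic matrix per statistic and channel, i.e. the 42 matrices above. The function name is illustrative.

    import numpy as np

    def glcm_stats(p, eps=1e-12):
        """A subset of the 14 statistics for one normalized GLCM p; the remaining
        ones (sum/difference averages, variances and entropies, information
        measures of correlation, maximal correlation coefficient) follow the
        standard Haralick definitions."""
        i, j = np.indices(p.shape)
        energy = np.sum(p ** 2)
        entropy = -np.sum(p * np.log(p + eps))
        contrast = np.sum((i - j) ** 2 * p)
        homogeneity = np.sum(p / (1.0 + (i - j) ** 2))
        mu_i, mu_j = np.sum(i * p), np.sum(j * p)
        sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
        sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
        correlation = np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j + eps)
        return energy, entropy, contrast, homogeneity, correlation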
S2.2, fusion module: as shown in FIG. 7, the specific flow of the fusion module is to introduce three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, then perform feature extraction with the multilayer residual CNN, fuse the feature extraction result with the Swin Transformer Block structure (shown in FIG. 6) in a Patch Merging manner, and meanwhile perform secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction through concat connection, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer.
S2.2.1, inputting the calculated gray level co-occurrence matrix GLCM into the conversion module.
The conversion module performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels, reducing the amount of computation in the CNN fusion module.
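A sketch of the conversion module in PyTorch, reading "three 1 × 1 convolution kernels" as a 1 × 1 convolution with 3 output channels applied to the 12 stacked GLCMs; this channel interpretation is an assumption.

    import torch
    import torch.nn as nn

    # Conversion module: a 1x1 convolution whose three kernels reduce the
    # 12 stacked 224x224 GLCMs to 3 channels before the CNN fusion module.
    conversion = nn.Conv2d(in_channels=12, out_channels=3, kernel_size=1)

    glcms = torch.randn(1, 12, 224, 224)  # a batch with the 12 stacked GLCMs
    features = conversion(glcms)          # -> shape (1, 3, 224, 224)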
S2.2.2, inputting the obtained statistic matrix into the Transformer fusion module, and inputting the gray level co-occurrence matrix GLCM, after the conversion module, into the CNN fusion module; the CNN-Transformer fusion module comprises the CNN fusion module and the Transformer fusion module.
The CNN-Transformer fusion module performs feature extraction with the multilayer residual CNN, fuses the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and meanwhile performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction through concat connection. As shown in FIG. 7, the CNN fusion module mainly comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels. The Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
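A sketch of one such residual layer and the CNN fusion trunk in PyTorch; padding, normalization, activation placement and the 1 × 1 shortcut projection are assumptions, and the Patch Merging layer and Swin Transformer Blocks (and the concat with the statistic matrix branch) are omitted.

    import torch.nn as nn

    class ResidualLayer(nn.Module):
        """Residual layer built from 3x3 + 1x3 + 3x1 convolutions as described
        for the CNN fusion module (normalization/activation placement assumed)."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, (1, 3), padding=(0, 1)), nn.BatchNorm2d(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, (3, 1), padding=(1, 0)), nn.BatchNorm2d(out_ch),
            )
            # 1x1 projection so the shortcut matches the output channel count
            self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.body(x) + self.skip(x))

    # CNN fusion trunk: max-pooling, two 96-channel and one 192-channel residual
    # layers, following the layer list above.
    cnn_trunk = nn.Sequential(
        nn.MaxPool2d(2),
        ResidualLayer(3, 96), ResidualLayer(96, 96), ResidualLayer(96, 192),
    )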
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module into the multilayer perceptron MLP module; the multilayer perceptron MLP module consists of fully connected FC layers, relu layers, dropout layers and a softmax layer, mainly comprising 2 modules composed of FC + relu + dropout and 1 softmax layer.
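A sketch of this MLP head in PyTorch, following the stated layout of two FC + relu + dropout modules plus one softmax layer; all dimensions are illustrative placeholders.

    import torch.nn as nn

    def mlp_head(in_dim, hidden_dim, num_classes, p_drop=0.5):
        """Two FC + relu + dropout modules followed by one softmax layer."""
        return nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden_dim, num_classes), nn.ReLU(), nn.Dropout(p_drop),
            nn.Softmax(dim=1),
        )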
In other embodiments, the CNN-Transformer fused product defect classification model can derive two other lightweight classification models, as shown in FIGS. 8a and 8b, specifically:
As shown in FIG. 8a, only the fusion branch of the gray level co-occurrence matrix GLCM is kept, and the concat connection between the statistic matrix and the backbone of the gray level co-occurrence matrix fusion structure is removed; the fusion step is then to process the input sequentially through the gray level co-occurrence matrix module, the conversion module, the CNN fusion module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
As shown in FIG. 8b, only the fusion branch of the statistic matrix with the Transformer fusion module is kept, and the structural backbone in which the gray level co-occurrence matrix is fused by the CNN fusion module is removed; the fusion step is then to process the input sequentially through the statistic matrix module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
In summary, in the embodiment of the present invention, the above product appearance defect classification method based on the fusion of GLCM and CNN-Transformer calculates the segmentation thresholds for gray level dimension reduction with the effective pixel probability distribution area equal division method; obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging; and then constructs a product defect classification model based on the strategy of fusing the gray level co-occurrence matrix GLCM and the statistic matrix with the CNN and Transformer network structures, while effectively preprocessing the spinning cake sample image.
Specifically, the method performs gray level dimension reduction using the effective pixel probability distribution area equal division method, obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging, introduces three 1 × 1 convolution kernels to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performs feature extraction with the multilayer residual CNN, fuses it with the Swin Transformer Block structure, and performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer. By combining the outstanding performance of the gray level co-occurrence matrix GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, the classification of product appearances with obvious defect features can be better realized.
In yet another aspect, the present invention also discloses a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of any of the methods described above.
In yet another aspect, the present invention also discloses a computer device comprising a memory and a processor, the memory storing a computer program, the computer program, when executed by the processor, causing the processor to perform the steps of any of the methods as described above.
In a further embodiment provided by the present application, there is also provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of any of the methods of the above embodiments.
It is understood that the system provided by the embodiments of the present invention corresponds to the method provided by the embodiments; for the explanation, examples and beneficial effects of the related contents, reference can be made to the corresponding parts of the method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily; for brevity, not all possible combinations of the technical features in the above embodiments are described, but as long as there is no contradiction between the combinations of these technical features, they should be considered within the scope of this specification.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, characterized by comprising the following steps:
a sample preprocessing step, which comprises obtaining a product sample image, extracting an effective area and a maximum contour region mask image, and applying an AND operation to obtain a product sample preprocessing effect image that retains only the mask region;
a calculation step, which comprises performing gray level dimension reduction using an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging;
a fusion step, which comprises introducing three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performing feature extraction with a multilayer residual CNN, and fusing the feature extraction result with a Swin Transformer Block structure in a Patch Merging manner to obtain the fusion result of the gray level co-occurrence matrix GLCM; meanwhile, performing secondary feature fusion of the statistic matrix with the fusion result of the gray level co-occurrence matrix GLCM through concat connection, thereby constructing a product defect classification model fusing the gray level co-occurrence matrix GLCM with CNN-Transformer, and then classifying the product appearance image.
2. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the sample preprocessing step specifically comprises:
S1.1, obtaining original image sample data of the product appearance;
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
S1.3, mapping the circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the effective region original image and the maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
3. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the calculation step specifically comprises:
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix GLCM;
S2.1.3, inputting the gray level co-occurrence matrix GLCM of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
4. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the fusion step specifically comprises:
S2.2.1, inputting the gray level co-occurrence matrix GLCM calculated in step S2.1.2 into the conversion module, which performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels;
S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module, and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module;
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module in step S2.2.2 into the multilayer perceptron MLP module.
5. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 3, wherein the calculation method of the multi-channel regional gray level co-occurrence matrix calculation module in step S2.1.2 specifically comprises:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices GLCM of each block image direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
6. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 5, wherein the effective pixel probability distribution area equal division method in S2.1.2.1 specifically comprises:
filtering out the pixels with value (0,0,0) in the preprocessed sample image, keeping the remaining effective pixels, counting the probability distribution of the effective pixels over the 256 gray levels to obtain a probability distribution graph, dividing the probability distribution graph into 16 regions of equal area along the horizontal direction, and calculating each segmentation gray threshold; the calculation is as follows: let n_i denote the number of effective pixels at gray level i (i = 0, 1, …, 255) and T_j denote the j-th segmentation threshold, where j = 1, 2, …, 15 and T_j is an integer; then T_j is the smallest value satisfying the inequality Σ_{i=0}^{T_j} n_i ≥ (j / 16) · Σ_{i=0}^{255} n_i, and traversing j from 1 to 15 in turn yields the 15 segmentation gray thresholds.
7. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 3, wherein the calculation step of the multi-channel regional GLCM statistic calculation module in step S2.1.3 specifically comprises:
according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, calculating 14 statistics of the GLCM of each block image in each RGB channel, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
8. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 4, wherein step S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module, specifically comprises:
performing feature extraction with the multilayer residual CNN module, fusing the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and performing secondary feature fusion of the statistic matrix obtained in step S2.1.3 with the fusion result of the multilayer residual CNN feature extraction through concat connection;
the CNN fusion module comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels; the Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
9. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 8.
CN202210388135.4A 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium Active CN114494254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210388135.4A CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210388135.4A CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Publications (2)

Publication Number Publication Date
CN114494254A (en) 2022-05-13
CN114494254B CN114494254B (en) 2022-07-05

Family

ID=81487990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210388135.4A Active CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Country Status (1)

Country Link
CN (1) CN114494254B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239712A (en) * 2022-09-21 2022-10-25 季华实验室 Circuit board surface defect detection method and device, electronic equipment and storage medium
CN115795683A (en) * 2022-12-08 2023-03-14 四川大学 Wing profile optimization method fusing CNN and Swin transform network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3582142A1 (en) * 2018-06-15 2019-12-18 Université de Liège Image classification using neural networks
CN112001909A (en) * 2020-08-26 2020-11-27 北京科技大学 Powder bed defect visual detection method based on image feature fusion
CN112686086A (en) * 2019-10-19 2021-04-20 中国科学院空天信息创新研究院 Crop classification method based on optical-SAR (synthetic aperture radar) cooperative response
US20210183484A1 (en) * 2019-12-06 2021-06-17 Surgical Safety Technologies Inc. Hierarchical cnn-transformer based machine learning
CN113887487A (en) * 2021-10-20 2022-01-04 河海大学 Facial expression recognition method and device based on CNN-Transformer
CN113951834A (en) * 2021-11-30 2022-01-21 湖南应超智能计算研究院有限责任公司 Alzheimer disease classification prediction method based on visual Transformer algorithm
CN113989228A (en) * 2021-10-27 2022-01-28 西安工程大学 Method for detecting defect area of color texture fabric based on self-attention
CN114066902A (en) * 2021-11-22 2022-02-18 安徽大学 Medical image segmentation method, system and device based on convolution and transformer fusion
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-transducer and NAS-FPN
CN114299065A (en) * 2022-03-03 2022-04-08 科大智能物联技术股份有限公司 Method for detecting and grading defective appearance forming defects of silk ingots, storage medium and equipment

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3582142A1 (en) * 2018-06-15 2019-12-18 Université de Liège Image classification using neural networks
CN112686086A (en) * 2019-10-19 2021-04-20 中国科学院空天信息创新研究院 Crop classification method based on optical-SAR (synthetic aperture radar) cooperative response
US20210183484A1 (en) * 2019-12-06 2021-06-17 Surgical Safety Technologies Inc. Hierarchical cnn-transformer based machine learning
CN112001909A (en) * 2020-08-26 2020-11-27 北京科技大学 Powder bed defect visual detection method based on image feature fusion
CN113887487A (en) * 2021-10-20 2022-01-04 河海大学 Facial expression recognition method and device based on CNN-Transformer
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-transducer and NAS-FPN
CN113989228A (en) * 2021-10-27 2022-01-28 西安工程大学 Method for detecting defect area of color texture fabric based on self-attention
CN114066902A (en) * 2021-11-22 2022-02-18 安徽大学 Medical image segmentation method, system and device based on convolution and transformer fusion
CN113951834A (en) * 2021-11-30 2022-01-21 湖南应超智能计算研究院有限责任公司 Alzheimer disease classification prediction method based on visual Transformer algorithm
CN114299065A (en) * 2022-03-03 2022-04-08 科大智能物联技术股份有限公司 Method for detecting and grading defective appearance forming defects of silk ingots, storage medium and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZE LIU et al.: "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows", arXiv:2103.14030v2 [cs.CV] *
GAO Sheng et al.: "Nondestructive detection of sugar content of red globe grapes based on hyperspectral image information fusion", Chinese Journal of Luminescence *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239712A (en) * 2022-09-21 2022-10-25 季华实验室 Circuit board surface defect detection method and device, electronic equipment and storage medium
CN115795683A (en) * 2022-12-08 2023-03-14 四川大学 Wing profile optimization method fusing CNN and Swin transform network
CN115795683B (en) * 2022-12-08 2023-07-21 四川大学 Airfoil optimization method integrating CNN and Swin converter network

Also Published As

Publication number Publication date
CN114494254B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
CN114494254B (en) GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium
CN109543627B (en) Method and device for judging driving behavior category and computer equipment
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
EP3872650A1 (en) Method for footprint image retrieval
CN109377445B (en) Model training method, method and device for replacing image background and electronic system
CN110414538B (en) Defect classification method, defect classification training method and device thereof
CN111723860A (en) Target detection method and device
CN111445459B (en) Image defect detection method and system based on depth twin network
WO2022236876A1 (en) Cellophane defect recognition method, system and apparatus, and storage medium
CN111368758A (en) Face ambiguity detection method and device, computer equipment and storage medium
CN114897816A (en) Mask R-CNN mineral particle identification and particle size detection method based on improved Mask
CN115239946B (en) Small sample transfer learning training and target detection method, device, equipment and medium
CN115829995A (en) Cloth flaw detection method and system based on pixel-level multi-scale feature fusion
CN115239672A (en) Defect detection method and device, equipment and storage medium
CN117474863A (en) Chip surface defect detection method for compressed multi-head self-attention neural network
CN111666949A (en) Image semantic segmentation method based on iterative segmentation
CN116542962A (en) Improved Yolov5m model-based photovoltaic cell defect detection method
CN110992301A (en) Gas contour identification method
CN116977239A (en) Defect detection method, device, computer equipment and storage medium
CN109949245B (en) Cross laser detection positioning method and device, storage medium and computer equipment
CN113392916A (en) Method and system for detecting nutritional ingredients of bamboo shoots based on hyperspectral image and storage medium
CN114120053A (en) Image processing method, network model training method and device and electronic equipment
CN112816408B (en) Flaw detection method for optical lens
CN117437221B (en) Method and system for detecting bright decorative strip based on image detection
CN116503888B (en) Method, system and storage medium for extracting form from image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant