CN114494254A - Product appearance defect classification method based on fusion of GLCM and CNN-Transformer and storage medium

Info

Publication number
CN114494254A
Authority
CN
China
Prior art keywords
glcm
fusion
cnn
gray level
transformer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210388135.4A
Other languages
Chinese (zh)
Other versions
CN114494254B (en)
Inventor
岳晨
黄鑫
裴孝怀
钟智敏
刘伟
王筱圃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hkust Intelligent Internet Of Things Technology Co ltd
Original Assignee
Hkust Intelligent Internet Of Things Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hkust Intelligent Internet Of Things Technology Co ltd
Priority to CN202210388135.4A
Publication of CN114494254A
Application granted
Publication of CN114494254B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0004 - Industrial image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/40 - Analysis of texture
    • G06T7/41 - Analysis of texture based on statistical description of texture
    • G06T7/45 - Analysis of texture based on statistical description of texture using co-occurrence matrix computation
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 - Computing systems specially adapted for manufacturing

Abstract

The invention discloses a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, and a storage medium. The method performs gray level dimension reduction on a preprocessed product sample image using an effective pixel probability distribution area equal division method, and obtains a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging. Three 1 × 1 convolution kernels are introduced as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM; features are then extracted with a multilayer residual CNN and fused with a Swin Transformer Block structure, and the statistic matrix undergoes secondary feature fusion with the fusion result of the multilayer residual CNN feature extraction, thereby constructing a product defect classification model that fuses GLCM with CNN-Transformer. The method combines the outstanding performance of the GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, and can better classify product appearances with obvious defect features.

Description

Product appearance defect classification method based on fusion of GLCM and CNN-Transformer and storage medium
Technical Field
The invention relates to the technical field of industrial inspection for textile chemical fibers, and in particular to a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, and a storage medium.
Background
At present, the detection of product appearance defects mainly relies on manual quality inspection, in which an inspector illuminates the product from multiple angles with a handheld flashlight. This detection mode is labor-intensive, easily influenced by subjective human factors, and inefficient; a more effective way of detecting product appearance defects is therefore needed.
Disclosure of Invention
The invention provides a product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, which can solve the above technical problems.
To achieve this purpose, the invention adopts the following technical scheme:
A product appearance defect classification method based on the fusion of GLCM and CNN-Transformer comprises the following steps:
a sample preprocessing step, which comprises obtaining a product sample image, extracting an effective area and a maximum contour region mask image, and applying an AND operation to obtain a product sample preprocessing effect image that retains only the mask region;
a calculation step, which comprises performing gray level dimension reduction using an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging;
a fusion step, which comprises introducing three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performing feature extraction with a multilayer residual CNN, and fusing the feature extraction result with a Swin Transformer Block structure in a Patch Merging manner to obtain the fusion result of the gray level co-occurrence matrix GLCM; meanwhile, performing secondary feature fusion of the statistic matrix with the fusion result of the gray level co-occurrence matrix GLCM through concat connection, thereby constructing a product defect classification model fusing the gray level co-occurrence matrix GLCM with CNN-Transformer, and then classifying the product appearance image.
Further, the sample preprocessing step specifically includes:
S1.1, obtaining original image sample data of the product appearance;
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
S1.3, mapping the circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the effective region original image and the maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
Further, the calculation step specifically includes:
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix GLCM;
S2.1.3, inputting the gray level co-occurrence matrix of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
Further, the fusion step specifically includes:
S2.2.1, inputting the gray level co-occurrence matrix GLCM calculated in step S2.1.2 into the conversion module, which performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels;
S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module, and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module;
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module into the multilayer perceptron MLP module.
Further, the calculation method of the multi-channel regional gray level co-occurrence matrix in step S2.1.2 specifically includes:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices GLCM of all block images direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
Further, the effective pixel probability distribution area equal division method in S2.1.2.1 specifically includes:
filtering out the pixels with value (0,0,0) in the preprocessed sample image, keeping the remaining effective pixels, counting the probability distribution of the effective pixels over the 256 gray levels to obtain a probability distribution graph, dividing the probability distribution graph into 16 regions of equal area along the horizontal direction, and calculating each segmentation gray threshold. The calculation is as follows: let n_i denote the number of effective pixels at gray level i (i = 0, 1, …, 255) and T_j denote the j-th segmentation threshold, where j = 1, 2, …, 15 and T_j is an integer; then T_j is the smallest value satisfying the inequality

Σ_{i=0}^{T_j} n_i ≥ (j / 16) · Σ_{i=0}^{255} n_i,

and traversing j from 1 to 15 in turn yields the 15 segmentation gray thresholds.
Further, the calculation step of the multi-channel regional GLCM statistic calculation module in step S2.1.3 specifically includes:
according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, calculating 14 statistics of the GLCM of each block image in each RGB channel, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
Further, step S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module, specifically includes:
performing feature extraction with the multilayer residual CNN module, fusing the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and performing secondary feature fusion of the statistic matrix obtained in step S2.1.3 with the fusion result of the multilayer residual CNN feature extraction through concat connection;
the CNN fusion module comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels; the Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
Further, the product defect classification model can be reduced to the following two lightweight classification models, specifically:
a. only the fusion branch of the gray level co-occurrence matrix GLCM is kept, and the concat connection between the statistic matrix and the backbone of the gray level co-occurrence matrix fusion structure is removed; the fusion step is then to process the input sequentially through the gray level co-occurrence matrix module, the conversion module, the CNN fusion module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module;
b. only the fusion branch of the statistic matrix with the Transformer fusion module is kept, and the structural backbone in which the gray level co-occurrence matrix is fused by the CNN fusion module is removed; the fusion step is then to process the input sequentially through the statistic matrix module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
In another aspect, the present invention also discloses a computer readable storage medium storing a computer program, which when executed by a processor causes the processor to perform the steps of the method as described above.
According to the above technical scheme, the product appearance defect classification method based on the fusion of GLCM and CNN-Transformer performs gray level dimension reduction on the preprocessed product appearance sample image using the effective pixel probability distribution area equal division method, obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging, introduces three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performs feature extraction with a multilayer residual CNN, fuses it with a Swin Transformer Block structure, and performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer. The method combines the outstanding performance of the gray level co-occurrence matrix GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, and can better classify product appearances with obvious defect features.
Drawings
FIG. 1 is an overall block diagram of the method of the present invention;
FIG. 2a is an original image of a poorly formed spinning cake sample according to an embodiment of the present invention;
FIG. 2b is an original image of a typically formed spinning cake sample according to an embodiment of the present invention;
FIG. 3a is a preprocessing effect image of the poorly formed spinning cake sample according to an embodiment of the present invention;
FIG. 3b is a preprocessing effect image of the typically formed spinning cake sample according to an embodiment of the present invention;
FIG. 4 is a flow diagram of the calculation module of an embodiment of the invention;
FIG. 5 shows the sample of FIG. 3b divided into 14 × 14 block images according to an embodiment of the present invention;
FIG. 6 is a structural diagram of the Swin Transformer Block according to an embodiment of the present invention;
FIG. 7 is a flow diagram of the fusion module according to an embodiment of the present invention;
FIG. 8a is a flow diagram of a lightweight model derived from the fusion module of an embodiment of the present invention;
FIG. 8b is a flow diagram of another lightweight model derived from the fusion module of an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings in the embodiments; obviously, the described embodiments are some, but not all, of the embodiments of the present invention.
Poor forming is the most common type of spinning cake appearance defect. Its most prominent characteristic is an obvious abnormality of the running traces on the cake surface, which seriously affects cake quality and must be detected. Therefore, the product appearance defect classification method based on the fusion of GLCM and CNN-Transformer in this embodiment is embodied as a spinning cake appearance forming classification method based on the fusion of GLCM and CNN-Transformer, which replaces manual quality inspection with an automatic method for classifying spinning cake forming defects, and specifically includes the following steps:
as shown in fig. 1, the method for classifying appearance of spinning cake based on fusion of GLCM and CNN-Transformer in this embodiment includes,
a sample preprocessing step, wherein the sample preprocessing step comprises the steps of obtaining a spinning cake sample picture, extracting mask pictures of an effective area and a maximum outline area, and carrying out AND operation processing to obtain a spinning cake sample preprocessing effect picture only keeping the mask areas;
calculating, wherein the calculating step comprises the steps of carrying out gray level dimensionality reduction by adopting an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix by adopting a multi-channel regional calculation and combination method;
the fusion step comprises the steps of introducing 3 1 × 1 convolution kernels as conversion modules, performing dimensionality reduction on the gray level co-occurrence matrix GLCM, performing feature extraction by using multilayer residual errors CNN, and fusing a feature extraction result and a Swin transform Block structure in a Block Merging Patch Merging mode to obtain a fusion result of the gray level co-occurrence matrix GLCM; meanwhile, a concat connection mode is adopted, secondary feature fusion is carried out on a fusion result of the statistic matrix and the gray level co-occurrence matrix GLCM, so that a product defect classification model fused with the gray level co-occurrence matrix GLCM and the CNN-Transformer is constructed, and then the appearance pattern of the spinning cake is classified.
The following are specifically described:
s1, a sample preprocessing module: and (4) preprocessing the spinning cake sample image to obtain a sample preprocessing effect image only retaining the mask area.
S1.1, obtaining original image sample data of the right upper end face of a spinning cake;
as shown in fig. 2, a raw plot of a typical cake-formed sample versus an unformed sample is listed.
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
For this step, the following method can be employed: convert the image to grayscale, solve for a segmentation threshold using the maximum inter-class variance (Otsu) method and convert to a binary image, apply an opening operation to the binary image to remove isolated small noise points, extract the image contours, solve for the circumscribed rectangle of the maximum contour, and draw the mask image of the maximum contour region.
S1.3, mapping the solved circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the obtained effective region original image and maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
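The following is a minimal sketch of steps S1.2 to S1.4 using OpenCV, assuming a BGR input image; the 3 × 3 opening kernel and the absence of error handling for empty contour lists are simplifications, not part of the patent's specification.

    import cv2
    import numpy as np

    def preprocess_sample(img_bgr):
        """Steps S1.2-S1.4: mask the maximum contour region and keep only its pixels."""
        gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
        # Maximum inter-class variance (Otsu) segmentation threshold -> binary image
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        # Opening removes isolated small noise points (kernel size is illustrative)
        binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
        # Extract contours, take the maximum one, and get its circumscribed rectangle
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        largest = max(contours, key=cv2.contourArea)
        x, y, w, h = cv2.boundingRect(largest)
        # Draw the maximum contour region mask image
        mask = np.zeros(gray.shape, np.uint8)
        cv2.drawContours(mask, [largest], -1, 255, thickness=cv2.FILLED)
        # Crop the effective region and AND it with the mask:
        # pixels outside the mask region become (0,0,0)
        roi = img_bgr[y:y + h, x:x + w]
        return cv2.bitwise_and(roi, roi, mask=mask[y:y + h, x:x + w])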
FIGS. 3a and 3b show the sample preprocessing effects finally obtained after processing the original images of FIGS. 2a and 2b, respectively, with the method adopted by the sample preprocessing module of the present invention.
S2, classification module: the classification module comprises a calculation module and a fusion module.
S2.1, calculation module: as shown in FIG. 4, the specific flow of the calculation module is to perform gray level dimension reduction using the effective pixel probability distribution area equal division method, and to obtain the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging.
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
As shown in FIGS. 3a and 3b, the preprocessed sample images are obtained by the sample preprocessing module.
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix;
The calculation method of the multi-channel regional gray level co-occurrence matrix GLCM specifically includes:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
the method for equally dividing the probability distribution area of the effective pixel specifically comprises the following steps: the method comprises the steps of filtering pixel points with pixel values of (0,0,0) in a preprocessed sample graph, reserving the rest effective pixel points, counting probability distribution of the effective pixel points on 256 gray levels, obtaining a probability distribution graph, dividing the probability distribution graph into 16 areas with equal areas according to the horizontal direction, and calculating each segmentation gray threshold; the calculation method is as follows, the number of effective pixel points on each gray level is assumed to be
Figure 615375DEST_PATH_IMAGE010
Each division threshold value is
Figure 678009DEST_PATH_IMAGE011
Wherein
Figure 240840DEST_PATH_IMAGE007
Then the following inequality is established
Figure 883174DEST_PATH_IMAGE012
J is an integer, and the inequality is satisfied in each cycle traversal stageIs/are as follows
Figure 221751DEST_PATH_IMAGE013
To find 15 division gray level thresholds.
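A compact sketch of this threshold search, assuming the channel image and a boolean validity mask (excluding the (0,0,0) background pixels) are available from the preprocessing step; equal_area_thresholds and quantize are illustrative helper names.

    import numpy as np

    def equal_area_thresholds(channel, valid_mask, levels=16):
        """Split the valid-pixel gray histogram into `levels` regions of equal
        cumulative count; returns the levels-1 segmentation thresholds T_j."""
        hist = np.bincount(channel[valid_mask].ravel(), minlength=256)  # n_i, i = 0..255
        cum = np.cumsum(hist)
        total = cum[-1]
        # smallest T_j with sum_{i<=T_j} n_i >= (j/levels) * total, for j = 1..15
        return [int(np.searchsorted(cum, j * total / levels)) for j in range(1, levels)]

    def quantize(channel, thresholds):
        """Map 256 gray levels down to 16 using the computed thresholds."""
        return np.digitize(channel, thresholds).astype(np.uint8)  # values 0..15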
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices of all block images direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
As shown in FIG. 5, the preprocessing effect image of the sample shown in FIG. 3b is divided into 14 × 14 block images; the gray level co-occurrence matrices GLCM of each block image are calculated in the 4 directions 0°, 45°, 90° and 135° and merged to obtain 12 gray level co-occurrence matrices GLCM of size 224 × 224.
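A sketch of the per-block GLCM calculation and merging, using scikit-image's graycomatrix (named greycomatrix in versions before 0.19); the tiling layout that places each block's 16 × 16 GLCM at its block position to form the 224 × 224 matrix (14 blocks × 16 levels = 224) is an inference from the stated sizes, not spelled out in the text.

    import numpy as np
    from skimage.feature import graycomatrix

    ANGLES = [0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]  # 0, 45, 90, 135 degrees

    def blockwise_glcm(channel_q, blocks=14, levels=16):
        """channel_q: one quantized channel (values 0..15). Returns 4 merged GLCMs,
        one per direction, each of size (14*16) x (14*16) = 224 x 224."""
        h, w = channel_q.shape
        bh, bw = h // blocks, w // blocks
        merged = np.zeros((len(ANGLES), blocks * levels, blocks * levels))
        for r in range(blocks):
            for c in range(blocks):
                block = channel_q[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
                # normalized 16x16 GLCMs at distance 1 in the 4 directions
                g = graycomatrix(block, distances=[1], angles=ANGLES,
                                 levels=levels, normed=True)  # shape (16,16,1,4)
                for d in range(len(ANGLES)):
                    merged[d, r * levels:(r + 1) * levels,
                              c * levels:(c + 1) * levels] = g[:, :, 0, d]
        return merged  # stacking the 3 RGB channels gives the 12 GLCMs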
S2.1.3, inputting the gray level co-occurrence matrix of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
The multi-channel regional GLCM statistic calculation module specifically works as follows: according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, 14 statistics of the GLCM of each block image in each RGB channel are calculated, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
As shown in fig. 4, a total of 42 statistical matrices of size 14 × 14 are finally obtained.
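As a sketch of the statistic calculation, the following computes a subset of the 14 Haralick-style statistics for one normalized GLCM; evaluating a statistic for every block of every channel yields one 14 × 14 statistic matrix per statistic and channel, i.e. the 42 matrices above. The function name is illustrative.

    import numpy as np

    def glcm_stats(p, eps=1e-12):
        """A subset of the 14 statistics for one normalized GLCM p; the remaining
        ones (sum/difference averages, variances and entropies, information
        measures of correlation, maximal correlation coefficient) follow the
        standard Haralick definitions."""
        i, j = np.indices(p.shape)
        energy = np.sum(p ** 2)
        entropy = -np.sum(p * np.log(p + eps))
        contrast = np.sum((i - j) ** 2 * p)
        homogeneity = np.sum(p / (1.0 + (i - j) ** 2))
        mu_i, mu_j = np.sum(i * p), np.sum(j * p)
        sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
        sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
        correlation = np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j + eps)
        return energy, entropy, contrast, homogeneity, correlation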
S2.2, fusion module: as shown in FIG. 7, the specific flow of the fusion module is to introduce three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, then perform feature extraction with the multilayer residual CNN, fuse the feature extraction result with the Swin Transformer Block structure (shown in FIG. 6) in a Patch Merging manner, and meanwhile perform secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction through concat connection, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer.
S2.2.1, inputting the calculated gray level co-occurrence matrix GLCM into the conversion module.
The conversion module performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels, reducing the amount of computation in the CNN fusion module.
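A sketch of the conversion module in PyTorch, reading "three 1 × 1 convolution kernels" as a 1 × 1 convolution with 3 output channels applied to the 12 stacked GLCMs; this channel interpretation is an assumption.

    import torch
    import torch.nn as nn

    # Conversion module: a 1x1 convolution whose three kernels reduce the
    # 12 stacked 224x224 GLCMs to 3 channels before the CNN fusion module.
    conversion = nn.Conv2d(in_channels=12, out_channels=3, kernel_size=1)

    glcms = torch.randn(1, 12, 224, 224)  # a batch with the 12 stacked GLCMs
    features = conversion(glcms)          # -> shape (1, 3, 224, 224)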
S2.2.2, inputting the obtained statistic matrix into the Transformer fusion module, and inputting the gray level co-occurrence matrix GLCM, after the conversion module, into the CNN fusion module; the CNN-Transformer fusion module comprises the CNN fusion module and the Transformer fusion module.
The CNN-Transformer fusion module performs feature extraction with the multilayer residual CNN, fuses the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and meanwhile performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN feature extraction through concat connection. As shown in FIG. 7, the CNN fusion module mainly comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels. The Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
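A sketch of one such residual layer and the CNN fusion trunk in PyTorch; padding, normalization, activation placement and the 1 × 1 shortcut projection are assumptions, and the Patch Merging layer and Swin Transformer Blocks (and the concat with the statistic matrix branch) are omitted.

    import torch.nn as nn

    class ResidualLayer(nn.Module):
        """Residual layer built from 3x3 + 1x3 + 3x1 convolutions as described
        for the CNN fusion module (normalization/activation placement assumed)."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, (1, 3), padding=(0, 1)), nn.BatchNorm2d(out_ch), nn.ReLU(),
                nn.Conv2d(out_ch, out_ch, (3, 1), padding=(1, 0)), nn.BatchNorm2d(out_ch),
            )
            # 1x1 projection so the shortcut matches the output channel count
            self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
            self.relu = nn.ReLU()

        def forward(self, x):
            return self.relu(self.body(x) + self.skip(x))

    # CNN fusion trunk: max-pooling, two 96-channel and one 192-channel residual
    # layers, following the layer list above.
    cnn_trunk = nn.Sequential(
        nn.MaxPool2d(2),
        ResidualLayer(3, 96), ResidualLayer(96, 96), ResidualLayer(96, 192),
    )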
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module into the multilayer perceptron MLP module; the multilayer perceptron MLP module consists of fully connected FC layers, relu layers, dropout layers and a softmax layer, mainly comprising 2 modules composed of FC + relu + dropout and 1 softmax layer.
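A sketch of this MLP head in PyTorch, following the stated layout of two FC + relu + dropout modules plus one softmax layer; all dimensions are illustrative placeholders.

    import torch.nn as nn

    def mlp_head(in_dim, hidden_dim, num_classes, p_drop=0.5):
        """Two FC + relu + dropout modules followed by one softmax layer."""
        return nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(hidden_dim, num_classes), nn.ReLU(), nn.Dropout(p_drop),
            nn.Softmax(dim=1),
        )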
In other embodiments, the CNN-Transformer fused product defect classification model can derive two other lightweight classification models, as shown in FIGS. 8a and 8b, specifically:
As shown in FIG. 8a, only the fusion branch of the gray level co-occurrence matrix GLCM is kept, and the concat connection between the statistic matrix and the backbone of the gray level co-occurrence matrix fusion structure is removed; the fusion step is then to process the input sequentially through the gray level co-occurrence matrix module, the conversion module, the CNN fusion module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
As shown in FIG. 8b, only the fusion branch of the statistic matrix with the Transformer fusion module is kept, and the structural backbone in which the gray level co-occurrence matrix is fused by the CNN fusion module is removed; the fusion step is then to process the input sequentially through the statistic matrix module and the Transformer fusion module, and to input the result into the multilayer perceptron MLP module.
In summary, in the embodiment of the present invention, the above product appearance defect classification method based on the fusion of GLCM and CNN-Transformer calculates the segmentation thresholds for gray level dimension reduction with the effective pixel probability distribution area equal division method; obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging; and then constructs a product defect classification model based on the strategy of fusing the gray level co-occurrence matrix GLCM and the statistic matrix with the CNN and Transformer network structures, while effectively preprocessing the spinning cake sample image.
Specifically, the method performs gray level dimension reduction using the effective pixel probability distribution area equal division method, obtains the gray level co-occurrence matrix GLCM and the statistic matrix through multi-channel, region-by-region calculation and merging, introduces three 1 × 1 convolution kernels to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performs feature extraction with the multilayer residual CNN, fuses it with the Swin Transformer Block structure, and performs secondary feature fusion of the statistic matrix with the fusion result of the multilayer residual CNN, thereby constructing a product defect classification model fusing GLCM with CNN-Transformer. By combining the outstanding performance of the gray level co-occurrence matrix GLCM in texture feature extraction with the advantages of CNN and Transformer in image classification, the classification of product appearances with obvious defect features can be better realized.
In yet another aspect, the present invention also discloses a computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of any of the methods described above.
In yet another aspect, the present invention also discloses a computer device comprising a memory and a processor, the memory storing a computer program, the computer program, when executed by the processor, causing the processor to perform the steps of any of the methods as described above.
In a further embodiment provided by the present application, there is also provided a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of any of the methods of the above embodiments.
It is understood that the system provided by the embodiments of the present invention corresponds to the method provided by the embodiments; for the explanation, examples and beneficial effects of the related contents, reference can be made to the corresponding parts of the method.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily; for brevity, not all possible combinations of the technical features in the above embodiments are described, but as long as there is no contradiction between the combinations of these technical features, they should be considered within the scope of this specification.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A product appearance defect classification method based on the fusion of GLCM and CNN-Transformer, characterized by comprising the following steps:
a sample preprocessing step, which comprises obtaining a product sample image, extracting an effective area and a maximum contour region mask image, and applying an AND operation to obtain a product sample preprocessing effect image that retains only the mask region;
a calculation step, which comprises performing gray level dimension reduction using an effective pixel probability distribution area equal division method, and obtaining a gray level co-occurrence matrix GLCM and a statistic matrix through multi-channel, region-by-region calculation and merging;
a fusion step, which comprises introducing three 1 × 1 convolution kernels as a conversion module to reduce the dimensionality of the gray level co-occurrence matrix GLCM, performing feature extraction with a multilayer residual CNN, and fusing the feature extraction result with a Swin Transformer Block structure in a Patch Merging manner to obtain the fusion result of the gray level co-occurrence matrix GLCM; meanwhile, performing secondary feature fusion of the statistic matrix with the fusion result of the gray level co-occurrence matrix GLCM through concat connection, thereby constructing a product defect classification model fusing the gray level co-occurrence matrix GLCM with CNN-Transformer, and then classifying the product appearance image.
2. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the sample preprocessing step specifically comprises:
S1.1, obtaining original image sample data of the product appearance;
S1.2, preprocessing the original image sample data, obtaining the circumscribed rectangle coordinates of the maximum contour, and drawing the maximum contour region mask image;
S1.3, mapping the circumscribed rectangle coordinates of the maximum contour back to the original image and cropping the effective region to obtain an effective region original image;
S1.4, performing an AND operation on the effective region original image and the maximum contour region mask image, keeping the pixel values corresponding to the mask region and setting all other pixels to (0,0,0), thereby obtaining the preprocessing effect image of the sample.
3. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the calculation step specifically comprises:
S2.1.1, obtaining a preprocessed sample image using the processing method of the sample preprocessing module;
S2.1.2, inputting the obtained preprocessed sample image into the multi-channel regional gray level co-occurrence matrix calculation module to obtain the gray level co-occurrence matrix GLCM;
S2.1.3, inputting the gray level co-occurrence matrix GLCM of each block image calculated by the multi-channel regional gray level co-occurrence matrix calculation module into the multi-channel regional gray level co-occurrence matrix statistic calculation module to obtain the statistic matrix.
4. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 1, wherein the fusion step specifically comprises:
S2.2.1, inputting the gray level co-occurrence matrix GLCM calculated in step S2.1.2 into the conversion module, which performs feature dimension reduction on the input gray level co-occurrence matrix GLCM with three 1 × 1 convolution kernels;
S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module, and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module;
S2.2.3, inputting the calculation result of the CNN-Transformer fusion module in step S2.2.2 into the multilayer perceptron MLP module.
5. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 3, wherein the calculation method of the multi-channel regional gray level co-occurrence matrix calculation module in step S2.1.2 specifically comprises:
S2.1.2.1, splitting the preprocessed sample image into the three RGB channels, setting the gray level co-occurrence matrix GLCM to 16 gray levels, and calculating, for each of the R, G, B channels, the segmentation gray thresholds for converting 256 gray levels into 16 gray levels using the effective pixel probability distribution area equal division method;
S2.1.2.2, uniformly dividing each channel image processed in step S2.1.2.1 into 14 × 14 block images, calculating for each block image the gray level co-occurrence matrices GLCM in the 4 directions 0°, 45°, 90° and 135°, normalizing them, and merging the gray level co-occurrence matrices GLCM of each block image direction by direction, so that each channel image yields 4 gray level co-occurrence matrices GLCM and the 3-channel RGB image yields 12 gray level co-occurrence matrices GLCM in total.
6. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 5, wherein the effective pixel probability distribution area equal division method in S2.1.2.1 specifically comprises:
filtering out the pixels with value (0,0,0) in the preprocessed sample image, keeping the remaining effective pixels, counting the probability distribution of the effective pixels over the 256 gray levels to obtain a probability distribution graph, dividing the probability distribution graph into 16 regions of equal area along the horizontal direction, and calculating each segmentation gray threshold; the calculation is as follows: let n_i denote the number of effective pixels at gray level i (i = 0, 1, …, 255) and T_j denote the j-th segmentation threshold, where j = 1, 2, …, 15 and T_j is an integer; then T_j is the smallest value satisfying the inequality Σ_{i=0}^{T_j} n_i ≥ (j / 16) · Σ_{i=0}^{255} n_i, and traversing j from 1 to 15 in turn yields the 15 segmentation gray thresholds.
7. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 3, wherein the calculation step of the multi-channel regional GLCM statistic calculation module in step S2.1.3 specifically comprises:
according to the calculation results of the multi-channel regional gray level co-occurrence matrix calculation module, calculating 14 statistics of the GLCM of each block image in each RGB channel, namely: energy, entropy, contrast, homogeneity, correlation, variance, sum average, sum variance, sum entropy, difference variance, difference average, difference entropy, information measures of correlation and maximal correlation coefficient; the statistics of the block images of each channel are merged to obtain 14 statistic matrices per channel image, for a total of 42 statistic matrices over the 3 RGB channels.
8. The product appearance defect classification method based on the fusion of GLCM and CNN-Transformer as claimed in claim 4, wherein step S2.2.2, inputting the statistic matrix obtained in step S2.1.3 into the Transformer fusion module of the CNN-Transformer fusion module and inputting the processing result of step S2.2.1 into the CNN fusion module of the CNN-Transformer fusion module, specifically comprises:
performing feature extraction with the multilayer residual CNN module, fusing the feature extraction result with the Swin Transformer Block structure in a Patch Merging manner, and performing secondary feature fusion of the statistic matrix obtained in step S2.1.3 with the fusion result of the multilayer residual CNN feature extraction through concat connection;
the CNN fusion module comprises 1 max-pooling layer, 2 residual layers with 96 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, 1 residual layer with 192 output channels composed of 3 × 3 + 1 × 3 + 3 × 1 convolution kernels, and 1 Patch Merging layer with 384 output channels; the Transformer fusion module mainly comprises 4 stacked Swin Transformer Block layers with 426 output channels, 1 Patch Merging layer with 852 output channels, and 2 stacked Swin Transformer Block layers with 852 output channels.
9. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 8.
CN202210388135.4A 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium Active CN114494254B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210388135.4A CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210388135.4A CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Publications (2)

Publication Number Publication Date
CN114494254A (en) 2022-05-13
CN114494254B CN114494254B (en) 2022-07-05

Family

ID=81487990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210388135.4A Active CN114494254B (en) 2022-04-14 2022-04-14 GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium

Country Status (1)

Country Link
CN (1) CN114494254B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239712A (en) * 2022-09-21 2022-10-25 季华实验室 Circuit board surface defect detection method and device, electronic equipment and storage medium
CN115795683A (en) * 2022-12-08 2023-03-14 四川大学 Wing profile optimization method fusing CNN and Swin transform network

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3582142A1 (en) * 2018-06-15 2019-12-18 Université de Liège Image classification using neural networks
CN112001909A (en) * 2020-08-26 2020-11-27 北京科技大学 Powder bed defect visual detection method based on image feature fusion
CN112686086A (en) * 2019-10-19 2021-04-20 中国科学院空天信息创新研究院 Crop classification method based on optical-SAR (synthetic aperture radar) cooperative response
US20210183484A1 (en) * 2019-12-06 2021-06-17 Surgical Safety Technologies Inc. Hierarchical cnn-transformer based machine learning
CN113887487A (en) * 2021-10-20 2022-01-04 河海大学 Facial expression recognition method and device based on CNN-Transformer
CN113951834A (en) * 2021-11-30 2022-01-21 湖南应超智能计算研究院有限责任公司 Alzheimer disease classification prediction method based on visual Transformer algorithm
CN113989228A (en) * 2021-10-27 2022-01-28 西安工程大学 Method for detecting defect area of color texture fabric based on self-attention
CN114066902A (en) * 2021-11-22 2022-02-18 安徽大学 Medical image segmentation method, system and device based on convolution and transformer fusion
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-transducer and NAS-FPN
CN114299065A (en) * 2022-03-03 2022-04-08 科大智能物联技术股份有限公司 Method for detecting and grading defective appearance forming defects of silk ingots, storage medium and equipment

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3582142A1 (en) * 2018-06-15 2019-12-18 Université de Liège Image classification using neural networks
CN112686086A (en) * 2019-10-19 2021-04-20 中国科学院空天信息创新研究院 Crop classification method based on optical-SAR (synthetic aperture radar) cooperative response
US20210183484A1 (en) * 2019-12-06 2021-06-17 Surgical Safety Technologies Inc. Hierarchical cnn-transformer based machine learning
CN112001909A (en) * 2020-08-26 2020-11-27 北京科技大学 Powder bed defect visual detection method based on image feature fusion
CN113887487A (en) * 2021-10-20 2022-01-04 河海大学 Facial expression recognition method and device based on CNN-Transformer
CN114066820A (en) * 2021-10-26 2022-02-18 武汉纺织大学 Fabric defect detection method based on Swin-transducer and NAS-FPN
CN113989228A (en) * 2021-10-27 2022-01-28 西安工程大学 Method for detecting defect area of color texture fabric based on self-attention
CN114066902A (en) * 2021-11-22 2022-02-18 安徽大学 Medical image segmentation method, system and device based on convolution and transformer fusion
CN113951834A (en) * 2021-11-30 2022-01-21 湖南应超智能计算研究院有限责任公司 Alzheimer disease classification prediction method based on visual Transformer algorithm
CN114299065A (en) * 2022-03-03 2022-04-08 科大智能物联技术股份有限公司 Method for detecting and grading defective appearance forming defects of silk ingots, storage medium and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZE LIU et al.: "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows", arXiv:2103.14030v2 [cs.CV] *
GAO Sheng et al.: "Nondestructive detection of sugar content of red globe grapes based on hyperspectral image information fusion", Chinese Journal of Luminescence *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115239712A (en) * 2022-09-21 2022-10-25 季华实验室 Circuit board surface defect detection method and device, electronic equipment and storage medium
CN115795683A (en) * 2022-12-08 2023-03-14 四川大学 Wing profile optimization method fusing CNN and Swin transform network
CN115795683B (en) * 2022-12-08 2023-07-21 四川大学 Airfoil optimization method integrating CNN and Swin converter network

Also Published As

Publication number Publication date
CN114494254B (en) 2022-07-05

Similar Documents

Publication Publication Date Title
CN114494254B (en) GLCM and CNN-Transformer fusion-based product appearance defect classification method and storage medium
CN109543627B (en) Method and device for judging driving behavior category and computer equipment
CN110414507B (en) License plate recognition method and device, computer equipment and storage medium
EP3872650A1 (en) Method for footprint image retrieval
CN109377445B (en) Model training method, method and device for replacing image background and electronic system
CN110414538B (en) Defect classification method, defect classification training method and device thereof
CN111723860A (en) Target detection method and device
CN111445459B (en) Image defect detection method and system based on depth twin network
WO2022236876A1 (en) Cellophane defect recognition method, system and apparatus, and storage medium
CN111368758A (en) Face ambiguity detection method and device, computer equipment and storage medium
CN114897816A (en) Mask R-CNN mineral particle identification and particle size detection method based on improved Mask
CN115239946B (en) Small sample transfer learning training and target detection method, device, equipment and medium
CN115829995A (en) Cloth flaw detection method and system based on pixel-level multi-scale feature fusion
CN115239672A (en) Defect detection method and device, equipment and storage medium
CN117474863A (en) Chip surface defect detection method for compressed multi-head self-attention neural network
CN111666949A (en) Image semantic segmentation method based on iterative segmentation
CN116542962A (en) Improved Yolov5m model-based photovoltaic cell defect detection method
CN110992301A (en) Gas contour identification method
CN116977239A (en) Defect detection method, device, computer equipment and storage medium
CN109949245B (en) Cross laser detection positioning method and device, storage medium and computer equipment
CN113392916A (en) Method and system for detecting nutritional ingredients of bamboo shoots based on hyperspectral image and storage medium
CN114120053A (en) Image processing method, network model training method and device and electronic equipment
CN112816408B (en) Flaw detection method for optical lens
CN117437221B (en) Method and system for detecting bright decorative strip based on image detection
CN116503888B (en) Method, system and storage medium for extracting form from image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant