CN116645365A

CN116645365A - Quartz glass detection method, device, equipment and medium based on frequency spectrum

Info

Publication number: CN116645365A
Application number: CN202310897267.4A
Authority: CN
Inventors: 何良雨; 崔健; 刘彤
Original assignee: Fengrui Lingchuang Zhuhai Technology Co ltd
Current assignee: Fengrui Lingchuang Zhuhai Technology Co ltd
Priority date: 2023-07-21
Filing date: 2023-07-21
Publication date: 2023-08-25
Anticipated expiration: 2043-07-21
Also published as: CN116645365B

Abstract

The application relates to the technical field of artificial intelligence, and in particular to a quartz glass detection method, device, equipment and medium based on frequency spectrum. The method performs convolution feature extraction on a quartz glass image to be detected to obtain a first convolution feature map, performs a further convolution feature extraction on the first convolution feature map, enhances the resulting feature map twice, and fuses the enhanced feature maps. It then extracts a high-frequency feature map and a low-frequency feature map from the first convolution feature map, calculates the degree of association between the high-frequency feature map and the low-frequency feature map, performs enhancement and fusion with the first convolution feature map according to the degree of association to obtain a fused feature map, and detects the fused feature map to obtain a detection result. By using the degree of association between the feature maps of different spectral signals in the quartz glass image, the difference between those feature maps is enhanced, which in turn improves the accuracy of quartz glass detection.

Description

Quartz glass detection method, device, equipment and medium based on frequency spectrum

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a quartz glass detection method, device, equipment and medium based on frequency spectrum.

Background

Quartz glass, known as the "king of glass," is a special glass material composed of a single component, silicon dioxide. It has stable physical and chemical properties, with advantages such as high temperature resistance, corrosion resistance, high light transmittance, a low expansion coefficient, good insulation, and good vacuum performance, and it is widely used in high-technology fields such as semiconductors, photovoltaics, optics, optical communication, and aerospace. During the manufacture of quartz glass, process defects on the surface and in the interior of the glass must be strictly controlled to ensure quality. This is especially true in high-end semiconductor applications such as quartz glass substrates for photolithography masks and quartz glass crucibles for crystal pulling, where the purity and performance requirements are extremely high and the process defects and impurity content of the quartz glass used must be inspected.

In the prior art, quartz glass is generally inspected using visual detection methods. Because quartz glass defects may differ in shape, size, and distribution, and tiny defects do not stand out against the background texture, detection is difficult and detection accuracy is low. How to improve detection accuracy in quartz glass defect detection has therefore become a problem to be solved.

Disclosure of Invention

In view of the above, the embodiments of the present application provide a method, an apparatus, a device, and a medium for detecting quartz glass based on frequency spectrum, so as to solve the problem of low detection accuracy when detecting quartz glass.

In a first aspect, an embodiment of the present application provides a method for detecting quartz glass based on a frequency spectrum, where the method includes:

acquiring a quartz glass image to be detected, and carrying out convolution feature extraction on the quartz glass image to be detected to obtain a first convolution feature map;

performing convolution feature extraction on the first convolution feature map to obtain a second convolution feature map, performing first downsampling on the second convolution feature map to obtain a first enhancement feature map, performing second downsampling on the first enhancement feature map to obtain a second enhancement feature map, and fusing the first enhancement feature map and the second enhancement feature map to obtain a fused enhancement feature map;

extracting a high-frequency characteristic diagram and a low-frequency characteristic diagram in the first convolution characteristic diagram, and calculating the association degree between the high-frequency characteristic diagram and the low-frequency characteristic diagram to obtain an association degree matrix;

and performing activation treatment on the association degree matrix to obtain a weight activation matrix, using the weight activation matrix to enhance the fusion enhancement feature map to obtain an enhancement feature map, fusing the enhancement feature map with the first convolution feature map to obtain a fusion feature map, and detecting the fusion feature map to obtain a detection result.

In a second aspect, an embodiment of the present application provides a spectrum-based quartz glass detection apparatus, including:

the acquisition module is used for acquiring a quartz glass image to be detected, and carrying out convolution feature extraction on the quartz glass image to be detected to obtain a first convolution feature map;

the enhancement module is used for carrying out convolution feature extraction on the first convolution feature map to obtain a second convolution feature map, carrying out channel dimension enhancement on the second convolution feature map to obtain a first enhancement feature map, carrying out space dimension enhancement on the first enhancement feature map to obtain a second enhancement feature map, and fusing the first enhancement feature map and the second enhancement feature map to obtain a fused enhancement feature map;

the extraction module is used for extracting a high-frequency characteristic diagram of the high-frequency signal and a low-frequency characteristic diagram of the low-frequency signal in the first convolution characteristic diagram, and calculating the association degree between the high-frequency characteristic diagram and the low-frequency characteristic diagram to obtain an association degree matrix;

the detection module is used for carrying out activation processing on the association degree matrix to obtain a weight activation matrix, using the weight activation matrix to strengthen the fusion enhancement feature map to obtain an enhancement feature map, fusing the enhancement feature map with the first convolution feature map to obtain a fusion feature map, and detecting the fusion feature map to obtain a detection result.

In a third aspect, an embodiment of the present application provides a terminal device, where the terminal device includes a processor, a memory, and a computer program stored in the memory and executable on the processor, and where the processor implements the quartz glass detection method according to the first aspect when executing the computer program.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the quartz glass detection method according to the first aspect.

Compared with the prior art, the application has the beneficial effects that:

A quartz glass image to be detected is acquired, and convolution feature extraction is performed on it to obtain a first convolution feature map. Convolution feature extraction is performed on the first convolution feature map to obtain a second convolution feature map; first downsampling is performed on the second convolution feature map to obtain a first enhancement feature map; second downsampling is performed on the first enhancement feature map to obtain a second enhancement feature map; and the first enhancement feature map is fused with the second enhancement feature map to obtain a fused enhancement feature map. A high-frequency feature map and a low-frequency feature map are extracted from the first convolution feature map, and the degree of association between them is calculated to obtain an association degree matrix. The association degree matrix is activated to obtain a weight activation matrix, which is used to enhance the fused enhancement feature map and obtain an enhancement feature map. The enhancement feature map is fused with the first convolution feature map to obtain a fused feature map, and the fused feature map is detected to obtain a detection result. By using the degree of association between the feature maps of different spectral signals in the quartz glass image, the difference between those feature maps is enhanced, which in turn improves the accuracy of quartz glass detection.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments of the present application will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic diagram of an application environment of a spectrum-based quartz glass detection method according to a first embodiment of the present application;

Fig. 2 is a schematic flow chart of a spectrum-based quartz glass detection method according to a first embodiment of the present application;

Fig. 3 is a schematic flow chart of a spectrum-based quartz glass detection method according to a second embodiment of the present application;

Fig. 4 is a comparison of the detection results of the spectrum-based quartz glass detection method and other defect detection models on an original image captured by a high-definition industrial camera, according to a third embodiment of the present application;

Fig. 5 is a block diagram of a spectrum-based quartz glass detection apparatus according to a fourth embodiment of the present application;

Fig. 6 is a schematic structural diagram of a terminal device according to a fifth embodiment of the present application.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.

As used in the present description and the appended claims, the term "if" may be interpreted as "when" or "once" or "in response to a determination" or "in response to detection" depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination" or "in response to determination" or "upon detection of a [described condition or event]" or "in response to detection of a [described condition or event]".

Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.

Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the invention. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.

The embodiments of the invention can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.

It should be understood that the sequence numbers of the steps in the following embodiments do not mean the order of execution, and the execution order of the processes should be determined by the functions and the internal logic, and should not be construed as limiting the implementation process of the embodiments of the present invention.

In order to illustrate the technical scheme of the invention, the following description is made by specific examples.

The method for detecting quartz glass based on frequency spectrum provided by the embodiment of the invention can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server. The client includes, but is not limited to, a handheld computer, a desktop computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), and other terminal devices. The server may be implemented as a stand-alone server or as a server cluster composed of multiple servers.

Referring to fig. 2, a flowchart of a spectrum-based quartz glass detection method according to an embodiment of the present invention is shown, where the spectrum-based quartz glass detection method may be applied to a server in fig. 1, and the server is connected to a corresponding client, and as shown in fig. 2, the spectrum-based quartz glass detection method may include the following steps.

S201: Acquiring a quartz glass image to be detected, and carrying out convolution feature extraction on the quartz glass image to be detected to obtain a first convolution feature map.

In step S201, a quartz glass image to be detected is acquired, which may be an optical feature image, and convolution feature extraction is performed on the quartz glass image to be detected through convolution operation, so as to obtain a first convolution feature map.

In this embodiment, the quartz glass image to be detected is obtained by photographing the quartz glass to be detected with an RGB camera. The size of the quartz glass image to be detected is C×H×W, where C is the number of channels, H is the height, and W is the width of the image. The quartz glass image to be detected is input into a 3×3 convolution layer and convolved to obtain the first convolution feature map.
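The 3×3 convolution step can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the zero padding, the stride of 1, the number of output channels, and the random stand-in weights are all assumptions made for illustration, and `conv3x3` is a hypothetical helper name.

```python
import numpy as np

def conv3x3(image, kernels, bias=None):
    """Apply a 3x3 convolution with zero padding and stride 1.

    image:   array of shape (C_in, H, W)
    kernels: array of shape (C_out, C_in, 3, 3), hypothetical learned weights
    returns: array of shape (C_out, H, W)
    """
    c_in, h, w = image.shape
    c_out = kernels.shape[0]
    # Zero-pad height and width by one pixel so the output keeps size H x W.
    padded = np.pad(image, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            window = padded[:, dy:dy + h, dx:dx + w]  # shifted view, (C_in, H, W)
            # Weighted sum over input channels for every output channel.
            out += np.einsum("oc,chw->ohw", kernels[:, :, dy, dx], window)
    if bias is not None:
        out += bias[:, None, None]
    return out

rng = np.random.default_rng(0)
image = rng.random((3, 8, 8))                  # stand-in for a C x H x W RGB image
kernels = rng.standard_normal((16, 3, 3, 3))   # assumed: 16 output channels
first_conv_feature_map = conv3x3(image, kernels)
print(first_conv_feature_map.shape)            # (16, 8, 8)
```

In a trained network the kernel weights would be learned; the same helper could be applied again to the first convolution feature map to produce a second one.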

S202: Performing convolution feature extraction on the first convolution feature map to obtain a second convolution feature map, performing first downsampling on the second convolution feature map to obtain a first enhancement feature map, performing second downsampling on the second convolution feature map to obtain a second enhancement feature map, and fusing the first enhancement feature map and the second enhancement feature map to obtain a fused enhancement feature map.

In step S202, the convolution feature extraction is performed on the first convolution feature map again through the convolution operation, so as to obtain a second convolution feature map, enhancement processing is performed on the second convolution feature map through a downsampling mode, and enhancement images obtained through different downsampling are fused, so that a fused enhancement feature map is obtained.

In this embodiment, the first convolution feature map is input to a 3×3 convolution layer, and the first convolution feature map is subjected to convolution processing to obtain a second convolution feature map. And performing first downsampling on the second convolution feature map to obtain a first enhancement feature map, performing second downsampling on the second convolution feature map to obtain a second enhancement feature map, and fusing the first enhancement feature map and the second enhancement feature map to obtain a fused enhancement feature map, wherein the first enhancement feature map is enhancement in a channel dimension, and the second enhancement feature map is enhancement in a space dimension.

Optionally, performing first downsampling on the second convolution feature map to obtain a first enhancement feature map, including:

respectively carrying out global average pooling and global maximum pooling on each channel characteristic in the second convolution characteristic diagram to obtain an average pooled characteristic value and a maximum pooled characteristic value corresponding to each channel, and fusing the average pooled characteristic value and the maximum pooled characteristic value corresponding to each channel to obtain a global fused characteristic vector corresponding to each channel;

feature extraction is carried out on the global fusion feature vector to obtain a feature extraction vector, and activation operation is carried out on the feature extraction vector to obtain a channel weight vector;

and calculating to obtain a first enhancement feature map according to the channel weight vector and the second convolution feature map.

In this embodiment, a global average pooling operation and a global maximum pooling operation are performed on each channel of the second convolution feature map, whose size is H×W×C, to obtain an average pooled feature value and a maximum pooled feature value, each of size 1×1×C. The average pooled feature value and the maximum pooled feature value are fused to obtain the global fused feature vector corresponding to each channel. Feature extraction is then performed on the global fused feature vector using a multilayer perceptron (MLP), which consists of an input layer, two hidden layers, and an output layer; the hidden layers apply weights and bias terms to the input vector. After feature extraction and channel dimension reduction by the MLP, a feature extraction vector is obtained. The feature extraction vector is activated to obtain the channel weight vector, and the channel weight vector is multiplied by the second convolution feature map to obtain the first enhancement feature map. The channel weight vector is calculated as follows:

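$$M_c = \delta\big(\mathrm{MLP}\big(\mathrm{GMP}(F_2) \oplus \mathrm{GAP}(F_2)\big)\big)$$

where $M_c$ is the channel weight vector, $\mathrm{MLP}$ is the multilayer perceptron, $\mathrm{GMP}$ is the global maximum pooling operation function, $\mathrm{GAP}$ is the global average pooling operation function, and $F_2$ is the second convolution feature map; $\delta$ denotes the activation operation and $\oplus$ the fusion of the two pooled feature values described above.

The first enhancement feature map is calculated as follows:

$$F_c = M_c \otimes F_2$$

where $F_c$ is the first enhancement feature map, $M_c$ is the channel weight vector, $F_2$ is the second convolution feature map, and $\otimes$ denotes channel-wise multiplication.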
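The channel-dimension enhancement described above can be sketched as follows. Several details are assumptions, because the text does not fix them: the pooled values are fused by addition, the hidden layers use ReLU, the final activation is a sigmoid, and the reduced layer width and random weights are placeholders. The helper names `mlp` and `channel_enhance` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp(x, layers):
    """Small fully connected network; ReLU on hidden layers is an assumption."""
    for i, (weight, bias) in enumerate(layers):
        x = weight @ x + bias
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

def channel_enhance(f2, layers):
    """f2: second convolution feature map, shape (C, H, W).

    Returns the channel weight vector (C,) and the first enhancement
    feature map (C, H, W).
    """
    avg = f2.mean(axis=(1, 2))   # global average pooling, one value per channel
    mx = f2.max(axis=(1, 2))     # global maximum pooling, one value per channel
    fused = avg + mx             # ASSUMPTION: fusion by element-wise addition
    weights = sigmoid(mlp(fused, layers))   # ASSUMPTION: sigmoid activation
    return weights, f2 * weights[:, None, None]

rng = np.random.default_rng(0)
C, H, W, reduced = 8, 6, 6, 2    # 'reduced' is a hypothetical hidden width
f2 = rng.random((C, H, W))
layers = [
    (rng.standard_normal((reduced, C)), np.zeros(reduced)),        # hidden layer 1
    (rng.standard_normal((reduced, reduced)), np.zeros(reduced)),  # hidden layer 2
    (rng.standard_normal((C, reduced)), np.zeros(C)),              # output layer
]
channel_weights, first_enhanced = channel_enhance(f2, layers)
print(channel_weights.shape, first_enhanced.shape)   # (8,) (8, 6, 6)
```

Each channel of the second convolution feature map is scaled by a single weight, so channels that the perceptron scores highly are emphasized and the others are attenuated.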

Optionally, performing a second downsampling on the second convolution feature map to obtain a second enhancement feature map, including:

performing convolution operation and activation operation on the first enhancement feature map in the space dimension to obtain a space weight matrix;

and calculating to obtain a second enhancement feature map according to the space weight matrix and the second convolution feature map.

In this embodiment, a convolution operation and an activation operation are performed on the first enhancement feature map in the spatial dimension, where the convolution function of the convolution operation is a 1×1 convolution and the activation function of the activation operation is the sigmoid activation function. The spatial weight matrix is calculated as follows:

$$M_s = \sigma\big(f^{1 \times 1}(F_c)\big)$$

where $M_s$ is the spatial weight matrix, $f^{1 \times 1}$ is the convolution function, $\sigma$ is the sigmoid activation function, and $F_c$ is the first enhancement feature map.

The second enhancement feature map is calculated as follows:

$$F_s = M_s \otimes F_2$$

where $F_s$ is the second enhancement feature map, $M_s$ is the spatial weight matrix, and $F_2$ is the second convolution feature map.
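A minimal sketch of the spatial-dimension enhancement follows. It assumes that the 1×1 convolution reduces the first enhancement feature map to a single-channel H×W map, and that the two enhancement feature maps are fused by addition; the text specifies neither detail. The name `spatial_enhance` and the random inputs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_enhance(first_enhanced, f2, conv_weight, conv_bias=0.0):
    """first_enhanced, f2: arrays of shape (C, H, W); conv_weight: shape (C,).

    A 1x1 convolution with one output channel is a per-pixel weighted sum
    over channels (ASSUMPTION: single output channel).
    """
    logits = np.tensordot(conv_weight, first_enhanced, axes=([0], [0])) + conv_bias
    spatial_weights = sigmoid(logits)                    # (H, W), values in (0, 1)
    second_enhanced = f2 * spatial_weights[None, :, :]   # same weight for all channels
    return spatial_weights, second_enhanced

rng = np.random.default_rng(0)
C, H, W = 8, 6, 6
f2 = rng.random((C, H, W))                  # stand-in second convolution feature map
first_enhanced = f2 * rng.random((C, 1, 1)) # stand-in first enhancement feature map
conv_weight = rng.standard_normal(C)        # hypothetical learned 1x1 kernel

spatial_weights, second_enhanced = spatial_enhance(first_enhanced, f2, conv_weight)
# ASSUMPTION: the two enhancement maps are fused by element-wise addition.
fused_enhanced = first_enhanced + second_enhanced
print(spatial_weights.shape, fused_enhanced.shape)   # (6, 6) (8, 6, 6)
```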

S203: Extracting a high-frequency feature map and a low-frequency feature map from the first convolution feature map, and calculating the degree of association between the high-frequency feature map and the low-frequency feature map to obtain an association degree matrix.

In step S203, a high-frequency feature map and a low-frequency feature map in the first convolution feature map are obtained by performing frequency decomposition on the first convolution feature map, where the high-frequency feature is a defect feature, the low-frequency feature is a background feature, the high-frequency feature map is a feature map including the defect feature, and the low-frequency feature map is a feature map including the background feature.

In this embodiment, the first convolution feature map is subjected to frequency decomposition to obtain a high-frequency feature map and a low-frequency feature map, and the degree of association between the two is calculated from the high-frequency feature map and the low-frequency feature map to obtain an association degree matrix.

Optionally, extracting the high-frequency feature map and the low-frequency feature map in the first convolution feature map includes:

filtering the first convolution feature map by using a preset high-pass filter to obtain a high-frequency feature map of the first convolution feature map;

and filtering the first convolution feature map by using a preset low-pass filter to obtain a low-frequency feature map of the first convolution feature map.

In this embodiment, frequency decomposition is performed on the first convolution feature map, and the corresponding high-frequency and low-frequency signals are extracted to obtain the corresponding high-frequency feature map and low-frequency feature map. The high-frequency and low-frequency signals in the first convolution feature map are expressed as follows:

$$F_1(x, y) = f_l(x, y) \cdot f_h(x, y)$$

where $F_1(x, y)$ is the first convolution feature map, $f_l(x, y)$ is the low-frequency signal, representing background features, and $f_h(x, y)$ is the high-frequency signal, representing defect features. Because the product in this formula cannot be separated directly, the logarithm is taken on both sides:

$$\ln F_1(x, y) = \ln f_l(x, y) + \ln f_h(x, y)$$

The frequency-domain expression is:

$$Z(u, v) = Z_l(u, v) + Z_h(u, v)$$

where $Z(u, v) = \mathcal{F}[\ln F_1(x, y)]$ is the frequency-domain representation of the first convolution feature map after Fourier transformation, $(u, v)$ indexes each point of the feature map in the frequency domain, $Z_l(u, v)$ is the frequency-domain representation of the low-frequency signal $f_l(x, y)$ after Fourier transformation, and $Z_h(u, v)$ is the frequency-domain representation of the high-frequency signal $f_h(x, y)$ after Fourier transformation.

Both sides of the formula are processed with a filter function to obtain:

$$H(u, v)\,Z(u, v) = H(u, v)\,Z_l(u, v) + H(u, v)\,Z_h(u, v)$$

where $H(u, v)$ is the filter transfer function, which allows the high-frequency signal to be separated from the low-frequency signal.

An inverse Fourier transform is then applied to both sides of the formula to obtain:

$$s(x, y) = s_l(x, y) + s_h(x, y)$$

where $s(x, y)$ is the result of the inverse Fourier transform of $H(u, v)\,Z(u, v)$, $s_l(x, y)$ is the result of the inverse Fourier transform of $H(u, v)\,Z_l(u, v)$, and $s_h(x, y)$ is the result of the inverse Fourier transform of $H(u, v)\,Z_h(u, v)$.

Taking the exponential of both sides gives:

$$g(x, y) = e^{s(x, y)} = e^{s_l(x, y)} \cdot e^{s_h(x, y)}$$

where $g(x, y)$ is the filtered image and $H(u, v)$ is the filter transfer function. It can be seen from the above that the overall filtering effect is determined entirely by the choice of filter, that is, by the choice of the filter transfer function $H(u, v)$.

In this embodiment, a preset high-pass filter is used to perform filtering processing on the first convolution feature map to obtain a high-frequency feature map of the first convolution feature map, and a preset low-pass filter is used to perform filtering processing on the first convolution feature map to obtain a low-frequency feature map of the first convolution feature map.

The high-frequency feature map and the low-frequency feature map are calculated as follows:

$$F_l = \exp\!\big(\mathcal{F}^{-1}\big[H_l(u, v)\,\mathcal{F}(\ln F_1)\big]\big), \qquad F_h = \exp\!\big(\mathcal{F}^{-1}\big[H_h(u, v)\,\mathcal{F}(\ln F_1)\big]\big)$$

where $F_l$ is the low-frequency feature map, $F_h$ is the high-frequency feature map, $H_l(u, v)$ is the low-pass filter transfer function, $H_h(u, v)$ is the high-pass filter transfer function, $F_1$ is the first convolution feature map, and $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier transform and its inverse.

In the transfer functions, $D(u, v)$ is the distance from the point $(u, v)$ to the filter center, $T$ controls the transition rate of the high-frequency stop band, and the remaining filter parameter takes values in the range [5, 15].
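The log, Fourier transform, filter, inverse transform, and exponential chain can be sketched with NumPy's FFT. The filter shape here is an assumption: a Gaussian low-pass transfer function and its complement as the high-pass function are swapped in purely for illustration, and are not the transfer functions of this application. The `cutoff` parameter, the small `eps` offset, and the requirement that the feature map be non-negative (for example, after a ReLU) are likewise assumptions.

```python
import numpy as np

def gaussian_lowpass(h, w, cutoff):
    """ASSUMED filter shape: Gaussian low-pass transfer function on an h x w grid."""
    u = np.fft.fftfreq(h) * h   # integer frequency indices in FFT layout
    v = np.fft.fftfreq(w) * w
    # Squared distance from each frequency point to the zero-frequency center.
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    return np.exp(-d2 / (2.0 * cutoff ** 2))

def homomorphic_split(feature_map, cutoff, eps=1e-6):
    """Split a non-negative (C, H, W) feature map into low/high-frequency maps."""
    _, h, w = feature_map.shape
    h_low = gaussian_lowpass(h, w, cutoff)
    h_high = 1.0 - h_low                              # complementary high-pass
    # The logarithm turns the product of the two signals into a sum.
    z = np.fft.fft2(np.log(feature_map + eps), axes=(-2, -1))
    # Filter in the frequency domain, transform back, then undo the logarithm.
    low = np.exp(np.fft.ifft2(z * h_low, axes=(-2, -1)).real)
    high = np.exp(np.fft.ifft2(z * h_high, axes=(-2, -1)).real)
    return low, high

rng = np.random.default_rng(0)
first_conv = rng.random((4, 32, 32)) + 0.1   # stand-in non-negative feature map
low_map, high_map = homomorphic_split(first_conv, cutoff=4.0)
print(low_map.shape, high_map.shape)          # (4, 32, 32) (4, 32, 32)
```

Because the two assumed transfer functions sum to one, the product of the two output maps reproduces the input (plus `eps`), which mirrors the multiplicative model of background and defect signals in the derivation.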

Optionally, calculating the association degree between the high-frequency feature map and the low-frequency feature map to obtain an association degree matrix, including:

performing convolution operation on the low-frequency characteristic map to obtain a low-frequency convolution characteristic map, performing global average pooling operation on the low-frequency convolution characteristic map to obtain a low-frequency pooling characteristic map, and performing projection conversion on the low-frequency pooling characteristic map to obtain a low-frequency projection characteristic map;

Performing projection conversion on the high-frequency characteristic map to obtain a high-frequency projection characteristic map, performing activation operation on the high-frequency projection characteristic map, and calculating to obtain an activated high-frequency characteristic matrix;

multiplying the low-frequency projection feature map with the activated high-frequency feature matrix, and calculating the corresponding association degree to obtain an association degree matrix.

In this embodiment, a convolution operation is performed on the low-frequency feature map to obtain a low-frequency convolution feature map, where the convolution kernel has a size of 3×3. A global average pooling operation is performed on the low-frequency convolution feature map to obtain a low-frequency pooled feature map, and projection conversion is performed on the low-frequency pooled feature map to convert it into a low-frequency projection feature map of size 1×C. Projection conversion is also performed on the high-frequency feature map, converting the high-frequency feature map of size C×H×W into a high-frequency projection feature map of size C×HW. An activation operation is then performed on the high-frequency projection feature map to obtain the activated high-frequency feature matrix, where the activation function is the Swish activation function:

$$A_h = \frac{P_h}{1 + e^{-P_h}}$$

where $A_h$ is the activated high-frequency feature matrix and $P_h$ is the high-frequency projection feature map. The Swish activation function is smooth and differentiable everywhere, and its output varies smoothly for inputs near 0, which facilitates optimization and generalization during training.

The low-frequency projection feature map is multiplied by the activated high-frequency feature matrix to calculate the corresponding degree of association and obtain the association degree matrix:

$$R = P_l \cdot A_h$$

where $R$ is the association degree matrix, $P_l$ is the low-frequency projection feature map, and $A_h$ is the activated high-frequency feature matrix. With $P_l$ of size 1×C and $A_h$ of size C×HW, $R$ has size 1×HW.
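The association degree calculation can be sketched as a pooled low-frequency vector multiplied by a Swish-activated high-frequency matrix. In this sketch the projection conversions are assumed to be plain reshapes, and the 3×3 convolution on the low-frequency feature map is assumed to have been applied already, so its output is passed in directly. The name `association_matrix` is hypothetical.

```python
import numpy as np

def swish(x):
    # Swish activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def association_matrix(low_conv, high):
    """low_conv: low-frequency convolution feature map, shape (C, H, W).
    high:     high-frequency feature map, shape (C, H, W).
    Returns the association degree matrix of shape (1, H*W).
    """
    c, h, w = high.shape
    # Global average pooling, then projection to a 1 x C row vector.
    p_low = low_conv.mean(axis=(1, 2)).reshape(1, c)
    # Projection of the high-frequency map to a C x HW matrix (assumed reshape).
    p_high = high.reshape(c, h * w)
    a_high = swish(p_high)
    # (1 x C) @ (C x HW) gives one association score per spatial position.
    return p_low @ a_high

rng = np.random.default_rng(0)
low_conv = rng.random((4, 5, 5))
high = rng.standard_normal((4, 5, 5))
assoc = association_matrix(low_conv, high)
print(assoc.shape)   # (1, 25)
```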

S204: Performing activation processing on the association degree matrix to obtain a weight activation matrix, using the weight activation matrix to enhance the fused enhancement feature map to obtain an enhancement feature map, fusing the enhancement feature map with the first convolution feature map to obtain a fused feature map, and detecting the fused feature map to obtain a detection result.

In step S204, the fused enhancement feature map is enhanced by using the association degree matrix, so as to obtain an enhancement feature map, the enhancement feature map is fused with the first convolution feature map, so as to obtain a fused feature map, and defect detection is performed on the fused feature map, so as to obtain a detection result, wherein the association degree between the low-frequency signal and the high-frequency signal in the feature map can be modeled by using the association degree matrix, so as to obtain association degrees between different frequency signals, and further increase the difference between the low-frequency signal and the high-frequency signal.

In this embodiment, the association degree matrix is activated to obtain a weight activation matrix, where the activation function is the sigmoid activation function and the calculation formula is:

$$A = \sigma(M) = \frac{1}{1 + e^{-M}}$$

where $A$ is the weight activation matrix, $M$ is the association degree matrix, and the sigmoid function $\sigma$ is applied element-wise.

The fusion enhancement feature map is enhanced by using the weight activation matrix to obtain the enhancement feature map. Before enhancement, projection conversion is performed on the weight activation matrix to obtain a frequency difference weight matrix. The frequency difference weight matrix is multiplied element-wise with the fusion enhancement feature map to obtain the enhancement feature map, and the enhancement feature map is added to the first convolution feature map to obtain the fusion feature map. The calculation formula is as follows:

$$F_{out} = F_1 + W_d \odot F_e$$

where $F_{out}$ is the fusion feature map, $F_1$ is the first convolution feature map, $W_d$ is the frequency difference weight matrix, $F_e$ is the fusion enhancement feature map, and $\odot$ denotes element-wise multiplication.
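The activation, projection, weighting and additive fusion described above can be sketched in a few lines of NumPy. The sketch assumes that the fusion enhancement feature map and the first convolution feature map have the same C×H×W shape, which the addition requires; the function name `fuse` and the toy inputs are hypothetical.

```python
import numpy as np

def fuse(f_1, f_e, m):
    """Sigmoid-activate the association matrix, reshape it to 1 x H x W,
    weight the fusion enhancement feature map and add the first convolution feature map."""
    c, h, w = f_1.shape
    a = 1.0 / (1.0 + np.exp(-m))      # weight activation matrix, shape (1, HW)
    w_d = a.reshape(1, h, w)          # frequency difference weight matrix, shape (1, H, W)
    enhanced = w_d * f_e              # the same spatial weights are broadcast over all channels
    return f_1 + enhanced             # fusion feature map, shape (C, H, W)

# Toy inputs with C=2, H=W=3. An all-zero association matrix gives weights of 0.5.
f_1 = np.ones((2, 3, 3))
f_e = 2.0 * np.ones((2, 3, 3))
m = np.zeros((1, 9))
f_out = fuse(f_1, f_e, m)
```

Because the weight matrix has a single channel, every channel of the fusion enhancement feature map is scaled by the same spatial weights.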

The fusion feature map is detected to obtain a detection result, and a U-net network is used for the detection. The U-net network is a U-shaped network whose structure comprises two main parts, namely a front-end encoder path and a back-end decoder path. The encoder part is composed of 5 convolutional layers, each having two convolution kernels of size 3×3. Between the 3×3 convolution kernels are added the fusion feature extraction module of the method of the present invention, a batch normalization (BN) layer, a rectified linear unit (ReLU) layer and a 2×2 max pooling operation for downsampling. Correspondingly, the decoder path mainly comprises 5 consecutive feature extraction blocks, each consisting of a convolution layer, a ReLU activation layer and a pooling layer. Between every two feature extraction blocks, 2× upsampling is performed by using a 2×2 convolution, which restores the feature map and recovers more detail information. At the same time, skip connections fuse high-level semantic features with shallow features so that more information is retained. On the feature map output by the last feature extraction block of the decoder path, a 1×1 convolution maps each feature vector to the number of classes to be detected and Softmax is used as the classification layer, achieving pixel-by-pixel classification.
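The final classification step can be illustrated in isolation. A 1×1 convolution is a per-pixel linear map from C feature channels to K class scores, followed by a softmax over the class axis. This is a NumPy sketch with randomly initialized, hypothetical weights, not the trained network of the application.

```python
import numpy as np

def classify_pixels(features, weight, bias):
    """features: C x H x W, weight: K x C, bias: K.
    Returns per-pixel class probabilities (K x H x W) and class labels (H x W)."""
    # A 1x1 convolution applies the same K x C linear map at every pixel.
    logits = np.einsum("kc,chw->khw", weight, features) + bias[:, None, None]
    logits = logits - logits.max(axis=0, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=0, keepdims=True)          # softmax over the class axis
    return probs, probs.argmax(axis=0)

# Hypothetical sizes: C=8 feature channels, K=6 classes, 16 x 16 pixels.
rng = np.random.default_rng(2)
features = rng.standard_normal((8, 16, 16))
weight = rng.standard_normal((6, 8))
bias = np.zeros(6)
probs, labels = classify_pixels(features, weight, bias)
```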

Before the U-net network is used for detection, it is trained. For training, a high-definition industrial camera is used to collect quartz glass images, which cover multiple types of quartz glass with different specifications and different defect types. For each image, the location information of the defects is manually annotated for subsequent defect detection and quality assessment. The data set comprises 3000 quartz glass images containing various defect targets such as scratches, color spots, pits, bubbles and contamination, and the resolution of the images is 512×512. 2000 images are randomly selected as the training set and the remaining 1000 images are used as the test set. The invention uses the Adam optimizer to train the model under the PyTorch framework, on two NVIDIA Quadro M5000 graphics processing units (GPUs) under the Windows 10 operating system. The initial training parameters of the network are shown in table 1:

TABLE 1
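The random 2000/1000 split of the 3000 images can be reproduced with a seeded shuffle. The sketch below uses integer image identifiers and a fixed seed, both of which are assumptions for illustration.

```python
import random

def split_dataset(image_ids, n_train, seed=0):
    """Shuffle the identifiers with a fixed seed and split them into training and test sets."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]

# 3000 hypothetical image identifiers, 2000 for training and 1000 for testing.
train_ids, test_ids = split_dataset(range(3000), 2000)
```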

The invention adopts the overall accuracy (OA), the mean intersection over union (mIoU) and the average F1 score (AF) as evaluation indexes. The model is trained and tested on the quartz glass data set and compared with current advanced semantic segmentation models. The method provided by the invention remarkably improves the detection accuracy of quartz glass defects, and the detection results are shown in table 2:

TABLE 2

The comparison of the defect detection models in table 2 shows that the defect detection model of the method of the application achieves higher overall accuracy, mean intersection over union and average F1 score than the detection methods of the prior art, so the quartz glass detection method of the application has a remarkable beneficial effect.
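The three evaluation indexes can all be derived from a pixel-level confusion matrix. The sketch below uses the standard definitions of overall accuracy, per-class intersection over union and per-class F1 score, averaged over classes. It assumes every class occurs at least once, and the small confusion matrix is made up for illustration and is not data from table 2.

```python
import numpy as np

def segmentation_metrics(conf):
    """conf[i, j] is the number of pixels of true class i predicted as class j.
    Returns (overall accuracy, mean IoU, mean F1)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                 # correctly classified pixels per class
    fp = conf.sum(axis=0) - tp         # pixels wrongly assigned to each class
    fn = conf.sum(axis=1) - tp         # pixels of each class that were missed
    oa = tp.sum() / conf.sum()
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return oa, iou.mean(), f1.mean()

# Illustrative two-class confusion matrix.
oa, miou, af = segmentation_metrics([[3, 1], [1, 5]])
```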

Optionally, the fusion enhancement feature map is enhanced by using a weight activation matrix, so as to obtain an enhancement feature map, which includes:

performing projection conversion on the weight activation matrix to obtain a frequency difference weight matrix;

and carrying out weighted enhancement on the fusion enhancement feature map by using the frequency difference weight matrix to obtain the enhancement feature map.

In this embodiment, projection conversion is performed on the weight activation matrix to obtain a frequency difference weight matrix, where the projection function converts the weight activation matrix into a frequency difference weight matrix of size 1×H×W between the low-frequency signal and the high-frequency signal. The fusion enhancement feature map is then weighted and enhanced by using the frequency difference weight matrix to obtain the enhancement feature map; during the weighted enhancement, the frequency difference weight matrix is multiplied element-wise with the fusion enhancement feature map.

It should be noted that, the quartz glass detection method based on the frequency spectrum provided by the embodiment of the application can be applied to various computer intelligent processing tasks, including but not limited to intelligent processing tasks of various application scenes such as target detection, target classification and the like, and is not particularly limited.

The above intelligent processing tasks are applicable to image analysis, detection and object classification scenarios, including but not limited to micro-scale and nano-scale semiconductor inspection, and the embodiments of the present application are not limited thereto. For example, in a target detection application scenario, target detection processing can be performed on the output image features finally obtained by the spectrum-based quartz glass detection method provided by the application, so as to obtain a target detection result.

Fig. 3 shows a flow chart of a spectrum-based quartz glass detection method according to a second embodiment of the present application. A quartz glass image to be detected is obtained, and convolution layer 1 performs convolution feature extraction on it to obtain a first convolution feature map. Convolution layer 2 performs convolution feature extraction on the first convolution feature map to obtain a second convolution feature map. Global average pooling and global maximum pooling are performed on the second convolution feature map, and the two results are added to obtain a global fusion feature vector. A multilayer perceptron (MLP) performs feature extraction and activation on the global fusion feature vector to obtain a channel weight vector, and the channel weight vector is multiplied with the second convolution feature map to obtain a first enhancement feature map. A convolution operation and an activation operation are performed on the first enhancement feature map in the spatial dimension to obtain a spatial weight matrix, and the spatial weight matrix is multiplied with the second convolution feature map to obtain a second enhancement feature map. The first enhancement feature map and the second enhancement feature map are fused to obtain a fusion enhancement feature map.

A high-frequency feature map and a low-frequency feature map are extracted from the first convolution feature map. A global average pooling operation is performed on the low-frequency feature map to obtain a low-frequency pooled feature map, and dimension conversion is performed on the high-frequency feature map to obtain a high-frequency projection feature map. The high-frequency projection feature map is activated by using the Swish activation function to obtain an activated high-frequency feature matrix. The association degree matrix of the high-frequency feature map and the low-frequency feature map is calculated from the activated high-frequency feature matrix and the low-frequency pooled feature map, and is activated by using the sigmoid activation function to obtain a weight activation matrix. The weight activation matrix is fused with the fusion enhancement feature map to obtain an enhancement feature map, the enhancement feature map is fused with the first convolution feature map to obtain a fusion feature map, and detection is performed on the fusion feature map.
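The channel and spatial enhancement branch of this flow can be sketched as follows. Several details are assumptions made for illustration: a two-layer MLP with a ReLU hidden layer, sigmoid as the activation in both branches, a 1×1 convolution across channels as the spatial convolution, and addition as the fusion of the two enhancement feature maps. All weights are random placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_weights(f2, w1, w2):
    """f2: C x H x W. Adds the global average and global maximum of every channel,
    then applies a two-layer MLP and a sigmoid to obtain one weight per channel."""
    fused = f2.mean(axis=(1, 2)) + f2.max(axis=(1, 2))   # global fusion feature vector, shape (C,)
    hidden = np.maximum(w1 @ fused, 0.0)                  # hidden layer with ReLU (assumed)
    return sigmoid(w2 @ hidden)                           # channel weight vector, shape (C,)

def spatial_weights(f, kernel):
    """1x1 convolution across channels followed by a sigmoid, giving an H x W matrix."""
    return sigmoid(np.einsum("c,chw->hw", kernel, f))

def fusion_enhancement(f2, w1, w2, kernel):
    first = channel_weights(f2, w1, w2)[:, None, None] * f2   # first enhancement feature map
    second = spatial_weights(first, kernel)[None, :, :] * f2  # second enhancement feature map
    return first + second                                     # fusion by addition (assumed)

# Hypothetical sizes: C=4 channels, hidden width 2, 5 x 5 pixels.
rng = np.random.default_rng(3)
f2 = rng.standard_normal((4, 5, 5))
w1 = rng.standard_normal((2, 4))
w2 = rng.standard_normal((4, 2))
kernel = rng.standard_normal(4)
f_e = fusion_enhancement(f2, w1, w2, kernel)
```

Note that the spatial weights are computed from the first enhancement feature map but applied to the second convolution feature map, as in the flow described above.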

Fig. 4 shows a comparison of the detection effects of the spectrum-based quartz glass detection method according to a third embodiment of the present application and other defect detection models on an original image obtained by a high-definition industrial camera. The original image is a quartz glass image to be detected. As can be seen from fig. 4, the DeepLabV3+ model yields 3 detection results, the HRNet model yields 3 detection results, the HMANet model yields 5 detection results, and the model of the present method yields 8 detection results. The comparison shows that the present method detects more defects than the other models on this image.

In summary, a quartz glass image to be detected is obtained and convolution feature extraction is performed on it to obtain a first convolution feature map. Convolution feature extraction is performed on the first convolution feature map to obtain a second convolution feature map, first downsampling is performed on the second convolution feature map to obtain a first enhancement feature map, second downsampling is performed on the second convolution feature map to obtain a second enhancement feature map, and the first enhancement feature map is fused with the second enhancement feature map to obtain a fusion enhancement feature map. A high-frequency feature map and a low-frequency feature map are extracted from the first convolution feature map, and the association degree between them is calculated to obtain an association degree matrix. The association degree matrix is activated to obtain a weight activation matrix, the fusion enhancement feature map is enhanced by using the weight activation matrix to obtain an enhancement feature map, the enhancement feature map is fused with the first convolution feature map to obtain a fusion feature map, and the fusion feature map is detected to obtain a detection result. By using the association degree between feature maps of different spectrum signals in the quartz glass image, the difference between those feature maps is enhanced, which improves the accuracy of quartz glass detection.

Fig. 5 shows a block diagram of a spectrum-based quartz glass detection apparatus according to a fourth embodiment of the present application, which corresponds to the spectrum-based quartz glass detection method of the above embodiment, and the above quartz glass detection apparatus is applied to the above server. For convenience of explanation, only portions relevant to the embodiments of the present application are shown. Referring to fig. 5, the quartz glass detection device 50 includes: the device comprises an acquisition module 51, an enhancement module 52, an extraction module 53 and a detection module 54.

The acquisition module 51 is configured to obtain a quartz glass image to be detected, and perform convolution feature extraction on the quartz glass image to be detected to obtain a first convolution feature map.

The enhancement module 52 is configured to perform convolution feature extraction on the first convolution feature map to obtain a second convolution feature map, perform first downsampling on the second convolution feature map to obtain a first enhancement feature map, perform second downsampling on the second convolution feature map to obtain a second enhancement feature map, and fuse the first enhancement feature map with the second enhancement feature map to obtain a fusion enhancement feature map.

The extraction module 53 is configured to extract the high-frequency feature map and the low-frequency feature map in the first convolution feature map, and calculate the association degree between the high-frequency feature map and the low-frequency feature map to obtain an association degree matrix.

The detection module 54 is configured to perform activation processing on the association degree matrix to obtain a weight activation matrix, enhance the fusion enhancement feature map by using the weight activation matrix to obtain an enhancement feature map, fuse the enhancement feature map with the first convolution feature map to obtain a fusion feature map, and detect the fusion feature map to obtain a detection result.

Optionally, the enhancement module 52 includes:

The channel feature unit is used for respectively performing global average pooling and global maximum pooling on each channel feature in the second convolution feature map to obtain a mean pooled feature value and a maximum pooled feature value corresponding to each channel, and fusing the mean pooled feature value and the maximum pooled feature value corresponding to each channel to obtain a global fusion feature vector.

The activation unit is used for performing feature extraction on the global fusion feature vector to obtain a feature extraction vector, and performing an activation operation on the feature extraction vector to obtain a channel weight vector.

The first calculation unit is used for calculating a first enhancement feature map according to the channel weight vector and the second convolution feature map.

Optionally, the enhancement module 52 includes:

The spatial weight obtaining unit is used for performing a convolution operation and an activation operation on the first enhancement feature map in the spatial dimension to obtain a spatial weight matrix.

The second calculation unit is used for calculating a second enhancement feature map according to the spatial weight matrix and the second convolution feature map.

Optionally, the extracting module 53 includes:

The first filtering unit is used for filtering the first convolution feature map by using a preset high-pass filter to obtain a high-frequency feature map of the first convolution feature map.

The second filtering unit is used for filtering the first convolution feature map by using a preset low-pass filter to obtain a low-frequency feature map of the first convolution feature map.
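One possible realization of the two filtering units is an ideal low-pass and high-pass filter pair in the Fourier domain. The application only specifies preset high-pass and low-pass filters, so the FFT-based circular mask, the function name `split_frequencies` and the cutoff `radius` in the sketch below are assumptions.

```python
import numpy as np

def split_frequencies(feature_map, radius):
    """Split a C x H x W feature map into low-frequency and high-frequency parts
    by masking its 2D spectrum with a circle of the given radius (in frequency bins)."""
    c, h, w = feature_map.shape
    fy = np.fft.fftfreq(h)[:, None] * h          # integer frequency index along H
    fx = np.fft.fftfreq(w)[None, :] * w          # integer frequency index along W
    low_mask = np.sqrt(fy ** 2 + fx ** 2) <= radius
    spectrum = np.fft.fft2(feature_map, axes=(-2, -1))
    low = np.fft.ifft2(spectrum * low_mask, axes=(-2, -1)).real      # low-pass output
    high = np.fft.ifft2(spectrum * ~low_mask, axes=(-2, -1)).real    # high-pass output
    return low, high

# Hypothetical first convolution feature map with C=2, H=W=8.
rng = np.random.default_rng(4)
f_1 = rng.standard_normal((2, 8, 8))
f_low, f_high = split_frequencies(f_1, radius=2)
```

Because the two masks are complementary, the low-frequency and high-frequency feature maps sum back to the input feature map.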

Optionally, the extracting module 53 includes:

The first projection unit is used for performing a convolution operation on the low-frequency feature map to obtain a low-frequency convolution feature map, performing a global average pooling operation on the low-frequency convolution feature map to obtain a low-frequency pooled feature map, and performing projection conversion on the low-frequency pooled feature map to obtain a low-frequency projection feature map.

The activation unit is used for performing projection conversion on the high-frequency feature map to obtain a high-frequency projection feature map, performing an activation operation on the high-frequency projection feature map, and calculating an activated high-frequency feature matrix.

The association degree calculating unit is used for multiplying the low-frequency projection feature map with the activated high-frequency feature matrix and calculating the corresponding association degree to obtain an association degree matrix.

Optionally, the detection module 54 includes:

The second projection unit is used for performing projection conversion on the weight activation matrix to obtain a frequency difference weight matrix.

The weighting unit is used for performing weighted enhancement on the fusion enhancement feature map by using the frequency difference weight matrix to obtain the enhancement feature map.

It should be noted that, because the content of information interaction and execution process between the modules and the embodiment of the method of the present application are based on the same concept, specific functions and technical effects thereof may be referred to in the method embodiment section, and details thereof are not repeated herein.

Fig. 6 is a schematic structural diagram of a terminal device according to a fifth embodiment of the present application. As shown in fig. 6, the terminal device of this embodiment includes: at least one processor (only one shown in fig. 6), a memory, and a computer program stored in the memory and executable on the at least one processor, the processor executing the computer program to perform the steps of any of the various spectrum-based quartz glass detection method embodiments described above.

The terminal device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that fig. 6 is merely an example of a terminal device and does not limit it; the terminal device may comprise more or fewer components than shown, may combine some components, or may use different components, and may for example further comprise a network interface, a display screen, input means, etc.

The processor may be a CPU, but may also be another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.

The memory includes a readable storage medium, an internal memory, etc., where the internal memory may be a memory of the terminal device, and the internal memory provides an environment for the operation of an operating system and computer readable instructions in the readable storage medium. The readable storage medium may be a hard disk of the terminal device, and in other embodiments may be an external storage device of the terminal device, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card), etc. that are provided on the terminal device. Further, the memory may also include both an internal storage unit of the terminal device and an external storage device. The memory is used to store an operating system, application programs, boot loader (BootLoader), data, and other programs such as program codes of computer programs, and the like. The memory may also be used to temporarily store data that has been output or is to be output.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the method of the above-described embodiment, and may be implemented by a computer program to instruct related hardware, and the computer program may be stored in a computer readable storage medium, where the computer program, when executed by a processor, may implement the steps of the method embodiment described above. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, executable files or in some intermediate form, etc. 
The computer readable medium may include at least: any entity or device capable of carrying computer program code, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk or an optical disk. In some jurisdictions, in accordance with legislation and patent practice, computer readable media may not include electrical carrier signals and telecommunications signals.

The present application may also be implemented by a computer program product for implementing all or part of the steps of the method embodiments described above, when the computer program product is run on a terminal device, causing the terminal device to execute the steps of the method embodiments described above.

In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not described or illustrated in detail in a particular embodiment, reference may be made to the related descriptions of other embodiments.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims

1. A method for detecting quartz glass based on frequency spectrum, characterized in that the method comprises:

Performing convolution feature extraction on the first convolution feature map to obtain a second convolution feature map, performing first downsampling on the second convolution feature map to obtain a first enhancement feature map, performing second downsampling on the second convolution feature map to obtain a second enhancement feature map, and fusing the first enhancement feature map and the second enhancement feature map to obtain a fused enhancement feature map;

2. The quartz glass detection method of claim 1, wherein the performing first downsampling on the second convolution feature map to obtain a first enhancement feature map comprises:

3. The quartz glass detection method of claim 1, wherein the performing second downsampling on the second convolution feature map to obtain a second enhancement feature map comprises:

performing convolution operation and activation operation on the first enhancement feature map in a space dimension to obtain a space weight matrix;

4. The quartz glass detection method of claim 1, wherein the extracting the high-frequency feature map and the low-frequency feature map in the first convolution feature map comprises:

5. The quartz glass detection method of claim 1, wherein the calculating the degree of association between the high-frequency feature map and the low-frequency feature map to obtain the degree of association matrix comprises:

6. The method for detecting quartz glass according to claim 1, wherein the step of enhancing the fusion enhanced feature map using the weight activation matrix to obtain an enhanced feature map comprises:

performing projection conversion on the weight activating matrix to obtain a frequency difference weight matrix;

and carrying out weighted enhancement on the fusion enhancement feature map by using the frequency difference weight matrix to obtain an enhancement feature map.

7. A quartz glass detection device based on frequency spectrum, characterized in that the device comprises:

8. The quartz glass detection device of claim 7, wherein the enhancement module comprises:

the channel feature unit is used for respectively carrying out global average pooling and global maximum pooling on each channel feature in the second convolution feature map to obtain an average pooled feature value and a maximum pooled feature value corresponding to each channel, and fusing the average pooled feature value and the maximum pooled feature value corresponding to each channel to obtain a global fused feature vector corresponding to each channel;

the activating unit is used for carrying out feature extraction on the global fusion feature vector to obtain a feature extraction vector, and carrying out activating operation on the feature extraction vector to obtain a channel weight vector;

and the calculating unit is used for calculating to obtain a first enhancement feature map according to the channel weight vector and the second convolution feature map.

9. A terminal device, characterized in that it comprises a processor, a memory and a computer program stored in the memory and executable on the processor, which processor, when executing the computer program, implements the quartz glass detection method according to any of claims 1 to 6.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the quartz glass detection method according to any of claims 1 to 6.