CN116645567A - Unsupervised anomaly detection method based on pixel single-point structure and multi-element pairing logic
- Publication number
- CN116645567A (application number CN202310570510.1A)
- Authority
- CN
- China
- Prior art keywords
- logic
- pixel
- feature
- structural
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/7715—Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Multimedia (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the technical field of industrial image defect detection and discloses an unsupervised anomaly detection method based on pixel single-point and multi-element pairing. The method comprises the following steps: S1, constructing a structural feature extraction branch network and a logic feature extraction branch network; S2, inputting the training images into the structural feature extraction branch network and the logic feature extraction branch network respectively to form a structural feature memory bank and a logic feature memory bank; S3, obtaining the test pixel features and test logic features of the image to be tested, and from them a structural anomaly score map and a logical anomaly score map of the image; and S4, fusing the structural anomaly score map and the logical anomaly score map to obtain a total anomaly score map of the image to be tested, and determining the positions of defects from that map. The invention addresses the detection of anomalies in both surface structure and positional logic in product images.
Description
Technical Field
The invention belongs to the technical field related to industrial image defect detection, and particularly relates to an unsupervised anomaly detection method based on pixel single-point and multi-element pairing.
Background
In actual industrial production and manufacturing, unforeseen conditions such as machine faults, damage during transportation, and operator error often occur and cause products to fail quality requirements. Besides surface quality defects, products may also exhibit defects such as incorrect placement, missing packaging, and abnormal plate production. To increase the efficiency of production and delivery, quality inspection has shifted from traditional manual inspection to vision-based automated inspection, of which anomaly detection is a representative task. Across the process from production and packaging to delivery, two kinds of problems can arise: structural anomalies in the surface quality of the product itself, and logical anomalies such as incorrect placement or a mismatch between packaging and product, for example a missing quantity or wrong assortment of packaged screws, inconsistent filling of juice in glass bottles, or a product label in the wrong position. Because anomalies are complex and hard to predict, detecting structural anomalies from surface quality alone is insufficient. Existing structural anomaly detection methods are relatively mature but cope poorly with logical anomalies. Detecting anomalous samples that involve logical relations is therefore a more complex problem of jointly detecting structural and logical anomalies: both the surface structural anomalies of the product and the potential positional logic anomalies in the sample image must be detected.
Therefore, a joint detection model for structural and logical anomalies can effectively use both the image structure and the logical relations between image pixel pairs, further improving the performance of anomaly detection, which is important for quality inspection in industrial production.
Disclosure of Invention
In view of the above defects or improvement needs of the prior art, the invention provides an unsupervised anomaly detection method based on pixel single-point and multi-element pairing, which addresses anomalies in both surface structure and positional logic in product images.
In order to achieve the above object, according to the present invention, there is provided an unsupervised anomaly detection method based on pixel single-point and multi-element pairing, the method comprising the steps of:
S1, constructing a structural feature extraction branch network for extracting pixel features of an image and a logic feature extraction branch network for extracting logic features;
S2, inputting the training images into the structural feature extraction branch network and the logic feature extraction branch network respectively, so as to extract the pixel features and logic features of each image, wherein the pixel features of all the images form a structural feature memory bank and the logic features of all the images form a logic feature memory bank;
S3, inputting the image to be tested into the structural feature extraction branch network and the logic feature extraction branch network respectively to obtain the test pixel features and test logic features of the image to be tested; calculating the maximum distance score between the test pixel features and the pixel features in the structural feature memory bank to obtain the anomaly score of the image to be tested, thereby obtaining a structural anomaly score map of the image to be tested; and calculating the global consistency between the test logic features and the logic features in the logic feature memory bank to obtain the anomaly score of the image to be tested, thereby obtaining a logical anomaly score map of the image to be tested;
S4, fusing the structural anomaly score map and the logical anomaly score map to obtain a total anomaly score map of the image to be tested, and comparing the anomaly score at each position of the total anomaly score map with a preset threshold, wherein positions whose score exceeds the preset threshold are defect positions and all other positions are not, thereby determining the positions of the defects in the image to be tested.
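Step S4 can be illustrated with a minimal sketch. It assumes that the two score maps have the same size and are fused by element-wise addition (the detailed description speaks of the maps being "added and fused"); the function name `fuse_and_threshold`, the example values, and the threshold are hypothetical and chosen only for illustration.

```python
def fuse_and_threshold(structural_map, logical_map, threshold):
    """Add two equally sized score maps and list positions above the threshold."""
    # Assumption: fusion is plain element-wise addition of the two maps.
    total_map = [
        [s + l for s, l in zip(s_row, l_row)]
        for s_row, l_row in zip(structural_map, logical_map)
    ]
    # A position counts as defective when its fused score exceeds the threshold.
    defect_positions = [
        (r, c)
        for r, row in enumerate(total_map)
        for c, score in enumerate(row)
        if score > threshold
    ]
    return total_map, defect_positions


# Hypothetical 2x2 score maps.
structural_map = [[0.1, 0.2], [0.9, 0.1]]
logical_map = [[0.0, 0.1], [0.3, 0.7]]
total_map, defects = fuse_and_threshold(structural_map, logical_map, threshold=0.75)
print(defects)
```

In this toy case the bottom-left position is flagged mainly by the structural branch and the bottom-right position mainly by the logic branch, which is the motivation for fusing the two maps before thresholding.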
Further preferably, in step S1, the structural feature extraction branch network and the logic feature extraction branch network each employ a Wide ResNet-50 network pre-trained on the ImageNet dataset.
Further preferably, in step S2, the logic feature extraction branch network forms a logic feature memory bank according to the following steps:
S21, extracting one or more groups of pixel feature pairs in each feature layer of the logic feature extraction branch network;
S22, connecting the pixel feature pairs in each feature layer according to a preset number of pixel pairs to form multi-element pixel pairs, thereby obtaining the multi-element pixel pairs of all feature layers;
S23, aligning the multi-element pixel pairs of the different feature layers, wherein all the aligned multi-element pixel pairs form the logic feature memory bank.
Further preferably, in step S23, the alignment employs a multi-scale feature fusion method.
Further preferably, in step S3, the anomaly score in the structural anomaly score map is calculated according to the following relation:
[formula image not reproduced]
where the left-hand side is the structural anomaly score, $m^{test,*}$ is a test pixel feature, and $m^{*}$ is the training pixel structural feature with maximum similarity to $m^{test,*}$.
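Since the relation itself is missing from this text, the following is only a hedged sketch of a form consistent with the surrounding definitions, namely a nearest-neighbour search followed by a maximum over test feature blocks. The symbols $s_{con}$ and $P(x^{test})$ (the set of feature blocks of the test image) are assumptions introduced here and do not appear in the source.

```latex
\begin{aligned}
m^{test,*},\, m^{*} &= \arg\max_{m^{test} \in P(x^{test})} \; \arg\min_{m \in M_{con}} \lVert m^{test} - m \rVert_{2}, \\
s_{con} &= \lVert m^{test,*} - m^{*} \rVert_{2}
\end{aligned}
```

Under this reading, "maximum similarity" corresponds to the smallest distance to the memory bank, and the "maximum distance score" is the largest of those nearest-neighbour distances.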
Further preferably, in step S3, the anomaly score in the logical anomaly score map is calculated according to the following relation:
[formula image not reproduced]
where the relation gives the logical anomaly score in terms of the test logic feature and the training pixel logic feature with maximum similarity to that test logic feature.
Further preferably, in step S4, the total anomaly score map of the image to be tested is calculated according to the following relation:
[formula image not reproduced]
where the terms of the relation are the structural anomaly score and the logical anomaly score.
Further preferably, in step S2, after the structural feature memory bank and the logic feature memory bank are formed, data processing is further required to be performed on the elements in each memory bank, so as to reject unqualified data.
Further preferably, the data processing employs a greedy-strategy subsampling method.
In general, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
1. The method takes into account both single image pixels and the implicit logical information between image pixel pairs, so as to address practical problems in automated assembly-line inspection of industrial products, such as whether the combination and number of workpieces meet requirements and whether the type or position of product labels meets specifications; that is, structural and logical defects are considered simultaneously;
2. The method achieves unsupervised joint detection of structural and logical anomalies or defects: using only normal-sample information in the training stage, it can jointly detect local structural anomalies and global logical anomalies in the sample under test in the test stage.
Drawings
FIG. 1 is a schematic diagram of a training phase constructed in accordance with a preferred embodiment of the present invention;
FIG. 2 is a schematic diagram of a multi-pixel pair construction constructed in accordance with a preferred embodiment of the invention;
FIG. 3 is a schematic diagram of a test phase constructed in accordance with a preferred embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
To address the shortcoming that existing anomaly detection methods can detect only structural anomalies of a workpiece and not higher-level logical anomalies, an unsupervised anomaly detection method based on pixel-pair logical relation information is provided. It models the potential logical or geometric relations in anomaly-free samples using both single-pixel information and paired-pixel relation information, then compares these against the sample under test and computes distances, thereby identifying structural and logical anomalies in that sample.
Furthermore, the model network structure of the unsupervised anomaly detection method based on pixel-pair logical relation information mainly comprises a structural feature extraction branch network and a logic feature extraction branch network. The structural feature extraction branch network is a network model pre-trained on a large natural image dataset, and the features it extracts are highly discriminative. The invention further applies coreset subsampling to these features, which removes redundant information and reduces inference time. Pixels at different positions in an image are logically related; for example, the number of packed products corresponds to the grid cells of the package (for example, whether there is a pin in each cell of the packing box), and the product label corresponds to the product itself (for example, the juice label and the juice color). The logic feature extraction branch network extracts logic features from the logical relation information among several pixel features in an image. The extracted features capture long-distance binary and multi-element logical relations among image pixels, and modeling these relation features makes it possible to distinguish logical anomalies in the image (such as a cable terminal in the wrong position, or an icon that does not match the type of fruit juice). The two branch networks therefore allow the model to jointly detect structural and logical anomalies in the sample under test.
In summary, the invention considers both structural anomalies and logical anomalies. The normal data feature distribution is fitted and learned through the two branches: the structural feature extraction network discriminates structural anomalies better, and the logic feature extraction network discriminates logical anomalies better.
When the structural feature extraction branch network extracts image features, the pre-trained model is used directly. Before the logic feature extraction branch network extracts features, however, the pixels are independently distributed, i.e. different pixels have no fixed relation, so the connections between pixel features must be established explicitly. The method therefore combines pixel feature blocks at long and short distances and multiple scales: the spliced pixel pairs are sent to the network model for relation feature extraction, so as to model the positional logic relations in the image, and the extracted logic features are sent to a feature memory bank.
After the structural feature extraction branch and the logic feature extraction branch have extracted their features, they can be used to detect structural anomalies and logical anomalies respectively. To obtain the final result of the joint detection, the anomaly score maps output by the two branch networks are added together to give the final anomaly score map.
Specifically:
As shown in FIG. 1, the above method includes two feature extraction branch networks: the structural feature extraction branch network $N_{con}$ and the logic feature extraction branch network $N_{log}$. The structural feature extraction branch network $N_{con}$ is used to detect structural anomalies, and the logic feature extraction branch network $N_{log}$ is used to detect logical anomalies.
The invention includes a training phase and a testing phase. The training phase, shown in FIG. 1, comprises:
(1) Training stage of the sub-module for structural anomalies:
With normal industrial product images $I_n$ as input, the structural feature extraction branch network $N_{con}$ extracts the pixel-block structural feature information of the normal data, and the resulting normal pixel features $F_{con}$ are aggregated into a memory bank $M_{con}$. A greedy-strategy subsampling method is then applied: $l$ feature subsets are set so that they represent most of the data features as far as possible; each time, a point in a feature subset is selected and the point farthest from it is found, sampled, and stored, i.e. the locally farthest point is taken as the optimal solution at each step. This removes redundant features and finally yields the structural feature memory bank $M_{con}$ used for detecting structural anomalies.
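The greedy subsampling described above can be sketched as farthest-point selection. This is one common greedy strategy and may differ in detail from the procedure intended here; the function name `greedy_coreset`, the choice of starting point, and the toy 2-dimensional features are assumptions.

```python
import math


def greedy_coreset(features, num_samples, start_index=0):
    """Greedy farthest-point subsampling over a list of feature vectors."""
    selected = [start_index]
    # Distance from every feature to its nearest already-selected feature.
    nearest = [math.dist(f, features[start_index]) for f in features]
    while len(selected) < min(num_samples, len(features)):
        # Locally optimal step: take the feature farthest from the selection.
        farthest = max(range(len(features)), key=lambda i: nearest[i])
        selected.append(farthest)
        nearest = [
            min(d, math.dist(f, features[farthest]))
            for d, f in zip(nearest, features)
        ]
    return [features[i] for i in selected]


# Hypothetical memory bank with two near-duplicate clusters and one outlier.
memory = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 4.0)]
reduced = greedy_coreset(memory, num_samples=3)
print(reduced)
```

The near-duplicate features are dropped while one representative of each region is kept, which is the redundancy-removal effect the paragraph describes.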
(2) Training stage of the sub-module for logical anomalies:
With normal industrial product images $I_n$ as input, the structural feature extraction branch network $N_{con}$ extracts the pixel features $F_{con}$ of the normal data in different feature layers, as follows:
$F_{con} = N_{con}(I_n)$ (1)
Further, 1) the logic feature extraction branch network $N_{log}$ is used to extract one or more groups of pixel feature pairs at different positions in each feature layer; 2) a multi-element pixel pair $F_{Cat\cdot K}$ is formed by concatenation (Cat), as shown in formula (2):
$F_{Cat\cdot K} = (F_{log\cdot 1}, F_{log\cdot 2}, \dots, F_{log\cdot k})$ (2)
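Formula (2) can be illustrated with a small sketch in which each pixel-pair feature is the concatenation of two pixel feature vectors and the multi-element pixel pair is the tuple of all such pair features. The helper names, the 2-channel toy features, and the choice of which positions are paired are hypothetical.

```python
def concat_pair(feature_a, feature_b):
    """Join two pixel feature vectors into one pixel-pair feature (Cat)."""
    return list(feature_a) + list(feature_b)


def multi_element_pair(pairs):
    """Collect the pixel-pair features into one tuple, mirroring formula (2)."""
    return tuple(concat_pair(a, b) for a, b in pairs)


# Hypothetical 2-channel features at four named positions of one feature layer.
layer = {
    "top-center": [0.1, 0.2],
    "bottom-left": [0.3, 0.4],
    "top-right": [0.5, 0.6],
    "bottom-center": [0.7, 0.8],
}
pairs = [
    (layer["top-center"], layer["bottom-left"]),
    (layer["top-right"], layer["bottom-center"]),
]
f_cat_k = multi_element_pair(pairs)
print(len(f_cat_k), f_cat_k[0])
```

Each entry of `f_cat_k` plays the role of one $F_{log\cdot i}$, so a pair feature has twice the channel count of a single pixel feature.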
The multi-element pixel pair $F_{Cat\cdot K}$ consists of several groups of pixel pairs, where $K$ denotes the number of pixel pairs. FIG. 2 illustrates four groups of pixel pairs. The logic-feature pixel pairs $F_{log\cdot 1}$, $F_{log\cdot 2}$ lie at positions near the top and bottom and are defined in formula (3):
[formula image not reproduced]
The logic-feature pixel pairs $F_{log\cdot 3}$, $F_{log\cdot 4}$ lie in the middle and near the ends (top, bottom) and are defined in formula (4):
[formula image not reproduced]
The paired features may be nearby local-structure pixel features that are adjacent left-right or up-down; for example, $F_{\text{top-center}}$ denotes the pixel feature at the top center position and $F_{\text{bottom-right}}$ denotes the pixel feature at the bottom right.
The minimum angle between the line joining the two pixel feature blocks of a pair and the horizontal or vertical direction is $\theta$, with $\theta \le 45°$; within this angular range, pixel feature blocks may be randomly selected and paired. Normal pixel blocks at different long and short distances are combined and concatenated (Cat) to form multi-element pixel pairs. A multi-element pixel pair may contain three or more pixel blocks; for ease of exposition and to simplify notation, $K$ denotes the total number of pairings in a multi-element pixel pair, which is used to model the pixel relations in the image. Specifically, a pair is long-distance if its separation exceeds half the distance of the farthest pixel pair and short-distance otherwise, which yields multi-scale logic features $F_{log}$ at different scales.
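The pairing rules above can be sketched as follows, assuming that positions are (row, column) grid coordinates and that "long-distance" means a separation greater than half that of the farthest candidate pair. Because the smaller of the angles to the horizontal and to the vertical can never exceed 45°, the default bound admits every pair; the `max_angle` parameter is an assumption that allows a tighter limit. All names are hypothetical.

```python
import math
from itertools import combinations


def min_axis_angle(p, q):
    """Smallest angle in degrees between segment pq and the horizontal or vertical axis."""
    dy = abs(q[0] - p[0])
    dx = abs(q[1] - p[1])
    to_horizontal = math.degrees(math.atan2(dy, dx))
    return min(to_horizontal, 90.0 - to_horizontal)


def classify_pairs(positions, max_angle=45.0):
    """Return (long_pairs, short_pairs) among pairs within the angle bound."""
    candidates = [
        (p, q)
        for p, q in combinations(positions, 2)
        if min_axis_angle(p, q) <= max_angle
    ]
    farthest = max(math.dist(p, q) for p, q in candidates)
    # Long-distance: separation greater than half of the farthest pair.
    long_pairs = [pq for pq in candidates if math.dist(*pq) > farthest / 2]
    short_pairs = [pq for pq in candidates if math.dist(*pq) <= farthest / 2]
    return long_pairs, short_pairs


# Hypothetical feature-block positions on a small grid.
positions = [(0, 0), (0, 1), (3, 4), (6, 0)]
long_pairs, short_pairs = classify_pairs(positions)
print(len(long_pairs), short_pairs)
```

Random selection of $k$ pairs, as described in the text, could then draw from `long_pairs` and `short_pairs` separately so that both scales are represented.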
3) A multi-scale feature fusion (MFF) module aligns the logic features formed in different feature layers by up-sampling and down-sampling and then fuses them, as shown in formula (5):
$F_{Cat\cdot K} = \mathrm{MFF}(F_{log\cdot 1}, F_{log\cdot 2}, \dots, F_{log\cdot k})$ (5)
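The internals of the MFF module are not specified in this text. The sketch below shows one possible alignment-then-fusion scheme, assuming nearest-neighbour upsampling of the coarser feature map followed by channel concatenation at each position; both choices and all names are assumptions made only for illustration.

```python
def upsample_nearest(feature_map, factor):
    """Repeat every cell `factor` times along both spatial axes."""
    return [
        [cell for cell in row for _ in range(factor)]
        for row in feature_map
        for _ in range(factor)
    ]


def fuse_layers(shallow, deep):
    """Align `deep` to the resolution of `shallow`, then concatenate channels."""
    factor = len(shallow) // len(deep)
    aligned = upsample_nearest(deep, factor)
    return [
        [s + d for s, d in zip(s_row, d_row)]
        for s_row, d_row in zip(shallow, aligned)
    ]


# Hypothetical maps: a 2x2 shallow layer (1 channel) and a 1x1 deep layer (2 channels).
shallow = [[[1.0], [2.0]], [[3.0], [4.0]]]
deep = [[[9.0, 8.0]]]
fused = fuse_layers(shallow, deep)
print(fused[0][0], len(fused), len(fused[0]))
```

After fusion every position carries channels from both layers, so features from different layers can be compared position by position.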
After the multi-element feature pixel pair $F_{Cat\cdot K}$ is formed, it is sent to the memory bank $M_{log}$, which aggregates the logic features of normal images. The greedy-strategy subsampling method described for structural anomaly detection is then applied to remove redundant logic features, finally yielding the logic feature memory bank $M_{log}$ used for detecting logical anomalies.
The structural feature memory bank $M_{con}$ and the logic feature memory bank $M_{log}$ together form the feature memory bank $M_{bank}$, defined as:
$M_{bank} = M_{con} \cup M_{log}$ (6)
The test stage is shown in FIG. 3 and is described as follows:
The test image $I_{test}$ is input to the structural feature extraction branch network $N_{con}$ and the logic feature extraction branch network $N_{log}$, which yield the corresponding structural features $F_{con}$ and logic features $F_{log}$. In the training stage, the structural feature extraction branch network stores the pixel-level structural features of normal images in the structural feature memory bank $M_{con}$. At test time, the anomaly score of the test image is estimated from the maximum distance score between the pixel features of the test image and the normal pixel features $F_{con}$ in $M_{con}$, and computing the anomaly score of every pixel gives the structural anomaly score map $S_{con}$ of the test image. In the training stage, the logic feature extraction branch network concatenates normal image pixel pairs and stores the resulting logic features in the logic feature memory bank $M_{log}$. At test time, the anomaly score of the test image is estimated from the global consistency between the pixel-pair features $F_{log}$ of the test image and the normal pixel pairs in $M_{log}$, and computing the consistency of every pixel pair gives the logical anomaly score map $S_{log}$ of the test image. Finally, the structural anomaly score map $S_{con}$ and the logical anomaly score map $S_{log}$ are fused, and the resulting score $S_{map}$ serves as the anomaly score map of the test image.
Further, in the training stage of the structural feature extraction branch network, structural features are acquired as shown in the upper branch network in FIG. 3. First, each normal sample is decomposed into a set of pixel-level features [formula image not reproduced]. The notation considers a normal training sample $x_i \in X_N$ at the $j$-th layer of the pre-trained network $\phi$; the $c^{*}$-dimensional feature at position $h \in \{1, \dots, h^{*}\}$, $w \in \{1, \dots, w^{*}\}$ is called a feature block $m$, i.e. a pixel feature block.
Over all normal training samples $x_i \in X_N$, the structural feature memory bank $M_{con}$ is defined as
[formula image not reproduced]
To ensure inference speed and improve detection efficiency, a coreset subsampling mechanism is used to reduce the feature memory bank $M$. Conceptually, coreset selection aims to find a subset $M_c$ of $M$ such that the solution of a problem on $M$ is closely approximated by the solution on $M_c$.
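The coreset goal stated above is often formalised as a minimax covering objective. The following is a hedged sketch of that standard form, not a formula taken from this text; the symbol $n$ for an element of the candidate subset is an assumption.

```latex
M_{c}^{*} = \arg\min_{M_{c} \subset M} \; \max_{m \in M} \; \min_{n \in M_{c}} \lVert m - n \rVert_{2}
```

Read from the inside out: every feature in $M$ is measured against its closest retained feature, and the subset is chosen so that the worst such distance is as small as possible. Greedy farthest-point selection is a common approximation to this objective.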
That is, a feature library obtained by coreset subsampling is used. For convenience of illustration, $M_{con}$ and $M_{log}$ below denote the feature libraries after coreset subsampling.
The structural anomaly score map of a test image $x^{test}$ is estimated from the maximum distance score between the test feature blocks $m^{test,*}$ and their respective nearest neighbors $m^{*}$ in $M_{con}$:
[formula image not reproduced]
Here $m^{test}$ ranges over the test feature blocks, and $m$ is a feature block found in the structural feature library.
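The nearest-neighbour scoring described above can be sketched as follows, assuming Euclidean distance and taking the image-level score as the largest per-block distance. The function name and the toy 2-dimensional features are hypothetical.

```python
import math


def structural_scores(test_blocks, memory_bank):
    """Distance from each test feature block to its nearest memory-bank block."""
    return [min(math.dist(block, m) for m in memory_bank) for block in test_blocks]


# Hypothetical memory bank of normal feature blocks and three test blocks.
memory_bank = [(0.0, 0.0), (1.0, 1.0)]
test_blocks = [(0.1, 0.0), (1.0, 0.9), (4.0, 5.0)]
scores = structural_scores(test_blocks, memory_bank)

# Image-level score: the largest nearest-neighbour distance over all blocks.
image_score = max(scores)
print(scores.index(image_score), image_score)
```

Arranging `scores` back on the spatial grid of the feature layer gives the per-position structural anomaly score map; the third block, far from every normal feature, dominates the image-level score.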
In the logic anomaly training stage, logic features are acquired as shown in the lower logic feature extraction branch network in FIG. 3.
Each normal sample is decomposed into a set of pixel-level features [formula image not reproduced]. The pre-trained network extracts normal pixel features of the product image at different positions, for example pairing $F_{\text{top-left}}$ with $F_{\text{middle-down}}$, $F_{\text{top-center}}$ with $F_{\text{bottom-left}}$, $F_{\text{top-right}}$ with $F_{\text{bottom-center}}$, and $F_{\text{middle-up}}$ with $F_{\text{bottom-right}}$. Provided that the angle $\theta$ between a pair and the horizontal or vertical direction is smaller than $\theta_{min}$, $k$ groups of pixel pairs are formed at random. The logic feature of each pixel pair, obtained by concatenation (Cat), is denoted $m_{Cat\cdot k}$, which gives logic features $m_{Cat\cdot 1}, \dots, m_{Cat\cdot k}$ of long- and short-distance pixel dependencies at different scales. The logic feature after multi-scale feature fusion (MFF) is denoted $m_{Cat\cdot K}$ and is defined as
$m_{Cat\cdot K} = \mathrm{MFF}(m_{Cat\cdot 1}, m_{Cat\cdot 2}, \dots, m_{Cat\cdot k})$ (12)
It is then stored in the logic feature memory bank $M_{log}$. Here $x_{i\cdot pair}$ denotes feature pixel pairs in the same network layer, $\phi_{j\cdot multi}$ denotes several different layers $j$ of the pre-trained network, and $\mathrm{MFF}(\phi_{j\cdot multi}(x_{i\cdot pair}))$ denotes the logic-feature pixel pairs after multi-scale feature fusion.
Over all normal training samples $x_i \in X_N$, the logic feature memory bank $M_{log}$ is defined as
[formula image not reproduced]
Given the logic feature memory bank $M_{log}$, the logical anomaly score map of the test image is computed from the maximum distance score between the pixel-pair logic features formed by pairing the test features $m^{test}$ in the feature block set and their respective nearest-neighbor pixel logic features $m_{Cat}$ in $M_{log}$:
[formula image not reproduced]
The final anomaly score map is then obtained:
[formula image not reproduced]
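The logical scoring step can be sketched in the same nearest-neighbour style. The text does not say how a pair-level score is written back to image positions; the sketch assumes that each pair's score is assigned to both of its pixel positions, keeping the maximum when a position belongs to several pairs. That write-back rule, the function name, and the toy features are assumptions.

```python
import math


def logical_score_map(pair_features, memory_bank, height, width):
    """Score each pixel pair against the logic memory bank and map it to both pixels."""
    score_map = [[0.0] * width for _ in range(height)]
    for (pos_a, pos_b), feature in pair_features.items():
        # Distance to the most similar normal pair feature.
        score = min(math.dist(feature, m) for m in memory_bank)
        # Assumption: both positions of the pair inherit the pair score.
        for row, col in (pos_a, pos_b):
            score_map[row][col] = max(score_map[row][col], score)
    return score_map


# Hypothetical logic memory bank and two test pixel pairs on a 2x2 grid.
logic_memory = [(0.0, 0.0, 1.0, 1.0)]
pair_features = {
    ((0, 0), (1, 1)): (0.0, 0.0, 1.0, 1.0),  # matches a normal pair
    ((0, 1), (1, 0)): (3.0, 0.0, 1.0, 5.0),  # inconsistent pair
}
s_log = logical_score_map(pair_features, logic_memory, height=2, width=2)
print(s_log)
```

A map produced this way has the same shape as the structural score map, so the two can be added position by position to form the final anomaly score map.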
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (9)
1. An unsupervised anomaly detection method based on pixel single-point and multi-element pairing is characterized by comprising the following steps:
S1, constructing a structural feature extraction branch network for extracting pixel features of an image and a logic feature extraction branch network for extracting logic features;
S2, inputting the training images into the structural feature extraction branch network and the logic feature extraction branch network respectively, so as to extract the pixel features and logic features of each image, wherein the pixel features of all the images form a structural feature memory bank and the logic features of all the images form a logic feature memory bank;
S3, inputting the image to be tested into the structural feature extraction branch network and the logic feature extraction branch network respectively to obtain the test pixel features and test logic features of the image to be tested; calculating the maximum distance score between the test pixel features and the pixel features in the structural feature memory bank to obtain the anomaly score of the image to be tested, thereby obtaining a structural anomaly score map of the image to be tested; and calculating the global consistency between the test logic features and the logic features in the logic feature memory bank to obtain the anomaly score of the image to be tested, thereby obtaining a logical anomaly score map of the image to be tested;
S4, fusing the structural anomaly score map and the logical anomaly score map to obtain a total anomaly score map of the image to be tested, and comparing the anomaly score at each position of the total anomaly score map with a preset threshold, wherein positions whose score exceeds the preset threshold are defect positions and all other positions are not, thereby determining the positions of the defects in the image to be tested.
2. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1, wherein in step S1, the structural feature extraction branch network and the logic feature extraction branch network are both Wide ResNet50 networks pre-trained on the ImageNet dataset.
3. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1 or 2, wherein in step S2, the logic feature extraction branch network forms the logic feature memory bank according to the following steps:
S21, extracting one or more groups of pixel feature pairs in each feature layer of the logic feature extraction branch network;
S22, connecting the pixel feature pairs in each feature layer according to a preset number of pixel pairs to form multi-element pixel pairs, so as to obtain the multi-element pixel pairs in all the feature layers;
S23, aligning the multi-element pixel pairs of the different feature layers, all the aligned multi-element pixel pairs forming the logic feature memory bank.
4. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 3, wherein in step S3, the alignment uses a multi-scale feature fusion method.
5. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1 or 2, wherein in step S3, the anomaly score in the structural anomaly score map is calculated according to the following relation:
wherein the computed quantity is the structural anomaly score, $m^{test,*}$ is the test pixel structural feature, and $m^{*}$ is the training pixel structural feature with the greatest similarity to $m^{test,*}$.
6. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1 or 2, wherein in step S3, the anomaly score in the logical anomaly score map is calculated according to the following relation:
wherein the computed quantity is the logical anomaly score, obtained from the test pixel logic feature and the training pixel logic feature with the greatest similarity to that test pixel logic feature.
7. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1 or 2, wherein in step S4, the total anomaly score map of the image to be tested is calculated according to the following relation:
wherein the relation combines the structural anomaly score and the logical anomaly score.
8. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 1 or 2, wherein in step S2, after the structural feature memory bank and the logic feature memory bank are formed, data processing is further performed on the elements of each memory bank so as to reject unqualified data.
9. The unsupervised anomaly detection method based on pixel single-point and multi-element pairing according to claim 7, wherein the data processing employs a greedy-strategy subsampling method.
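The greedy subsampling named in the last two claims is not detailed in the claims themselves. One common greedy strategy is farthest-point (k-center) selection, sketched below as an assumption rather than as the claimed procedure; the function name `greedy_subsample` and its parameters are hypothetical, and `n_keep` is assumed to be at least 1.

```python
import numpy as np

def greedy_subsample(memory, n_keep, start=0):
    """Farthest-point (k-center greedy) selection of n_keep rows from memory.

    At each step the row farthest from everything already selected is added,
    so near-duplicate features are the last to be picked.
    """
    selected = [start]
    # Distance from every row to its closest already-selected row.
    min_dist = np.linalg.norm(memory - memory[start], axis=1)
    while len(selected) < min(n_keep, len(memory)):
        idx = int(np.argmax(min_dist))
        selected.append(idx)
        new_dist = np.linalg.norm(memory - memory[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return memory[selected], selected

bank = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 0.0], [5.0, 0.0]])
reduced, kept = greedy_subsample(bank, n_keep=3)
```

In this toy bank the row `[0.1, 0.0]` is nearly redundant with the starting row, so it is the one left out when three rows are kept.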
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310570510.1A CN116645567A (en) | 2023-05-19 | 2023-05-19 | Unsupervised anomaly detection method based on pixel single-point structure and multi-element pairing logic |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310570510.1A CN116645567A (en) | 2023-05-19 | 2023-05-19 | Unsupervised anomaly detection method based on pixel single-point structure and multi-element pairing logic |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116645567A (en) | 2023-08-25 |
Family
ID=87639194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310570510.1A Pending CN116645567A (en) | Unsupervised anomaly detection method based on pixel single-point structure and multi-element pairing logic | 2023-05-19 | 2023-05-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116645567A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912284A (en) * | 2023-09-15 | 2023-10-20 | 电子科技大学中山学院 | Matting method, matting device, electronic equipment and computer readable storage medium |
CN118195995A (en) * | 2023-12-05 | 2024-06-14 | 钛玛科(北京)工业科技有限公司 | Image anomaly detection method and device |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN116645567A (en) | Unsupervised anomaly detection method based on pixel single-point structure and multi-element pairing logic | |
Yuan et al. | A deep convolutional neural network for detection of rail surface defect | |
Zheng et al. | A defect detection method for rail surface and fasteners based on deep convolutional neural network | |
CN109712127B (en) | Power transmission line fault detection method for machine inspection video stream | |
CN112906769A (en) | Power transmission and transformation equipment image defect sample amplification method based on cycleGAN | |
Prunella et al. | Deep learning for automatic vision-based recognition of industrial surface defects: a survey | |
CN113362313B (en) | Defect detection method and system based on self-supervised learning | |
US20230281784A1 (en) | Industrial Defect Recognition Method and System, Computing Device, and Storage Medium | |
CN117152484B (en) | Small target cloth flaw detection method based on improved YOLOv5s | |
Li et al. | A review of deep learning methods for pixel-level crack detection | |
CN113643268A (en) | Industrial product defect quality inspection method and device based on deep learning and storage medium | |
CN113096085A (en) | Container surface damage detection method based on two-stage convolutional neural network | |
CN115830004A (en) | Surface defect detection method, device, computer equipment and storage medium | |
Zhang et al. | Zero-DD: Zero-sample defect detection for industrial products | |
Zhang et al. | An automatic fault detection method of freight train images based on BD-YOLO | |
Bahrami et al. | An end-to-end framework for shipping container corrosion defect inspection | |
JP7059889B2 (en) | Learning device, image generator, learning method, and learning program | |
Xu et al. | Hybrid attention-aware transformer network collaborative multiscale feature alignment for building change detection | |
Huang et al. | Railway infrastructure defects recognition using fine-grained deep convolutional neural networks | |
CN117523363A (en) | Wafer map defect mode identification method based on feature pyramid fusion | |
CN113947567B (en) | Defect detection method based on multitask learning | |
Hu et al. | Hybrid Pixel‐Level Crack Segmentation for Ballastless Track Slab Using Digital Twin Model and Weakly Supervised Style Transfer | |
CN115082650A (en) | Implementation method of automatic pipeline defect labeling tool based on convolutional neural network | |
Tatu et al. | Fault Detection In Bottle Caps And Label Alignment Using Convolutional Neural Network | |
CN117994500A (en) | Multi-target fault detection method based on improvement YOLOv s |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||