TW202147254A - Method for labeling image - Google Patents

Method for labeling image

Info

Publication number
TW202147254A
Authority
TW
Taiwan
Prior art keywords
image
difference
target
output image
output
Prior art date
Application number
TW109119845A
Other languages
Chinese (zh)
Other versions
TWI803756B (en
Inventor
陳怡君
陳佩君
文柏 陳
陳維超
Original Assignee
英業達股份有限公司
Priority date
Filing date
Publication date
Application filed by 英業達股份有限公司 filed Critical 英業達股份有限公司
Priority to TW109119845A priority Critical patent/TWI803756B/en
Publication of TW202147254A publication Critical patent/TW202147254A/en
Application granted granted Critical
Publication of TWI803756B publication Critical patent/TWI803756B/en


Abstract

A method for labeling an image comprises: obtaining a target image of a target object; generating a reconstructed image according to the target image and a reconstruction model, wherein the reconstruction model is trained with a plurality of reference images and a machine learning algorithm, each of the reference images is an image of a reference object whose defect level falls within a tolerable range with an upper limit, and each of the reference objects is associated with the target object; performing a first difference algorithm and a second difference algorithm according to the target image and the reconstructed image to generate a first difference image and a second difference image, respectively; and performing a pixel-wise operation according to the first difference image and the second difference image to generate an output image, wherein the output image includes a mark of a defect of the target object.

Description

Method for labeling an image

The present invention relates to the field of image processing, and in particular to a method for marking defects of objects in an image.

Products such as notebook computers and tablets must be inspected and approved by quality-control personnel before final shipment to customers. These inspectors check products for scratches, dents, and other defects against inspection guidelines. If the severity of a defect exceeds the range allowed by the specification, the product is deemed to "fail"; otherwise, the product is deemed to "pass" defect inspection.

To detect defects in the appearance of a computer product, multiple surface images of the product can be collected, defect types can be marked on the images, and a machine-learning or deep-learning model for defect detection in an Automatic Optical Inspection (AOI) machine can then be trained. Traditionally, both object detection and classification are performed in a supervised manner. For supervised learning, improving detection accuracy requires collecting a large amount of labeled training data, including both normal samples and defective samples.

More training data means more labeling work. However, collecting and labeling training data is labor-intensive and can be quite difficult. For example, if a computer product manufacturer lacks the infrastructure for collecting big data (especially large volumes of image data) and outsources the data collection and labeling tasks, the security, integrity, and confidentiality of the data may become serious concerns. More importantly, as the life cycles of computer products shorten and product designs diversify, collecting and labeling surface-image defects with sufficient variety becomes impractical. The surface of a computer product may be of any color and may have any texture and material. Furthermore, surface defects come in many types, such as scratches, dents, and stains, and defects of the same type can vary in shape and size. Worse, some surface defects are hard to categorize, so inconsistent labels inevitably appear in the training data. Conventional methods require defects to be correctly classified and labeled in the training data in order to achieve good accuracy. It is therefore difficult to collect a large amount of consistently labeled data with sufficient variety, and by the time enough training images have been collected and labeled, the corresponding computer products may already be near the end of their market life.

In view of this, the present invention proposes a method for labeling images that satisfies the demand for a large amount of training data.

According to an embodiment of the present invention, a method for labeling an image comprises: obtaining a target image of a target object; generating a reconstructed image according to the target image and a reconstruction model, wherein the reconstruction model is trained with a plurality of reference images and a machine learning algorithm, each reference image is an image of a reference object whose defect level falls within a tolerable range with an upper limit, and each reference object is associated with the target object; performing a first difference algorithm and a second difference algorithm according to the target image and the reconstructed image to generate a first difference image and a second difference image, respectively; and performing a pixel-wise operation according to the first difference image and the second difference image to generate an output image, wherein the output image includes a mark of a defect of the target object.

In summary, the proposed method for labeling images is suitable for classifying or detecting raw images of computer products, and it reduces the need for large numbers of labeled images as training data. The proposed reconstruction model is not over-generalized to the point of treating certain defects as texture patterns of normal regions, so the method reduces the occurrence of false negatives. The method also mimics human perception by highlighting only anomalies while ignoring the complex background; this perceptual-attention-based approach effectively reduces false positives.

The above description of the present disclosure and the following description of the embodiments are intended to demonstrate and explain the spirit and principle of the present invention and to provide further explanation of the scope of the claims.

The detailed features and advantages of the present invention are described in the embodiments below in sufficient detail to enable any person skilled in the relevant art to understand and implement the technical content of the present invention. Based on the content disclosed in this specification, the claims, and the drawings, any person skilled in the relevant art can readily understand the related objects and advantages of the present invention. The following embodiments further illustrate the aspects of the present invention in detail, but do not limit the scope of the present invention in any respect.

The method for labeling images proposed by the present invention is suitable for detecting defects of a target object itself and for adding supplementary labels associated with those defects to the target image containing the target object. In one embodiment, the target object is a surface of a computer product, such as the top cover of a notebook computer, and the defects are scratches, dents, or stains on the top cover. In another embodiment, the target object is a printed circuit board, and the defects are missing, skewed, or wrong components.

Please refer to FIG. 1, which is a flowchart of a method for labeling an image according to an embodiment of the present invention. In step S0, a reconstruction model is generated from a plurality of reference images and a machine learning algorithm. Each of the reference images is an image of a reference object whose defect level falls within a tolerable range with an upper limit, and each reference object is associated with the target object. Specifically, a reference object is a qualified sample of the target object, also called a tolerable sample, for example the top cover of a notebook computer. According to the specification, a reference object either has no defects or has defects whose number and severity fall within the tolerable range. For example, referring to Table 1, the tolerable range may cover the defect types of Level 1 and Level 2, with the upper limit being the maximum boundary values defined for Level 2 (20 mm and 2 scratches; 1 mm² and 2 dents), or it may cover only the defect types of Level 1, with the upper limit being the maximum boundary values defined for Level 1 (12 mm and 2 scratches; 0.7 mm² and 3 dents). For brevity, "the defect level falls within the tolerable range" is hereinafter referred to as "having no defects". (A short code sketch of this pass/fail check follows Table 1 below.)

Table 1

| Defect type | Level 1 | Level 2 | Level 3 |
| --- | --- | --- | --- |
| Scratch | Length: 12 mm; pass: 2 | Length: 20 mm; pass: 2 | Length: 25 mm; pass: 1 |
| Dent | 0.5 mm² to 0.7 mm²; pass: 3 | 0.5 mm² to 1 mm²; pass: 2 | 1 mm² to 1.3 mm²; pass: 1 |
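To make the pass/fail rule of Table 1 concrete, the sketch below checks a list of measured defects against the Level 2 limits. It is only an illustration: the function name, the defect-record format, and the dictionary of limits are assumptions for this example, not part of the patent.

```python
# Illustrative only: field names and the limits dictionary are assumptions;
# the boundary values are transcribed from Level 2 of Table 1.
LEVEL_2_LIMITS = {
    "scratch": {"max_length_mm": 20.0, "max_count": 2},
    "dent":    {"max_area_mm2": 1.0,  "max_count": 2},
}

def within_tolerance(defects, limits=LEVEL_2_LIMITS):
    """defects: e.g. [{"type": "scratch", "length_mm": 8.3}, ...]"""
    counts = {}
    for d in defects:
        rule = limits.get(d["type"])
        if rule is None:
            return False  # a defect type outside the table fails outright
        if d["type"] == "scratch" and d["length_mm"] > rule["max_length_mm"]:
            return False
        if d["type"] == "dent" and d["area_mm2"] > rule["max_area_mm2"]:
            return False
        counts[d["type"]] = counts.get(d["type"], 0) + 1
        if counts[d["type"]] > rule["max_count"]:
            return False
    return True
```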

In one embodiment, the machine learning algorithm of step S0 is an auto-encoder. In another embodiment, it is a one-class support vector machine. The machine learning algorithm takes the reference images captured from the reference objects as training data and trains a reconstruction model. The reconstruction model, also called a generative model, is a model that describes the qualified samples. After the reconstruction model of step S0 has been pre-trained, steps S1 to S5 shown in FIG. 1 constitute the actual operation stage. (A minimal training sketch follows.)
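As a rough illustration of step S0, the following PyTorch sketch trains a small convolutional autoencoder on defect-free reference images. The architecture, the MSE loss, and the training schedule are assumptions chosen for illustration; the patent only requires that the model learn to reconstruct tolerable samples.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Small convolutional autoencoder; layer sizes chosen for illustration."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid(),  # reference images assumed normalized to [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_reconstruction_model(reference_loader, epochs=50, lr=1e-3):
    """Step S0: fit the model to reference (tolerable) images only."""
    model = ConvAutoencoder()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for batch in reference_loader:  # (B, 3, H, W) reference-image tensors
            optimizer.zero_grad()
            loss = loss_fn(model(batch), batch)  # reconstruction error
            loss.backward()
            optimizer.step()
    return model
```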

In step S1, a target image is obtained, for example by photographing the target object with a camera. The target object is, for example, the top cover of a notebook computer or a printed circuit board. For ease of illustration, the target object is assumed to have one or more defects outside the tolerable range; however, after the proposed method is executed, the result may also be that "the target object has no defects".

In step S2, a reconstructed image is generated according to the target image and the reconstruction model. For example, the camera sends the target image obtained in step S1 to a processor, and the processor generates a reconstruction image according to the target image and the reconstruction model. The reconstructed image can be regarded as a "defect-free" rendition of the target image. The reconstruction model may produce the reconstructed image in ways that include, but are not limited to: selecting one of multiple candidate reconstructed images, generating the reconstructed image as a linear combination of multiple feature prototypes, or outputting the reconstructed image through an image transformation function.
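Continuing the autoencoder sketch above, step S2 would then amount to a single forward pass; this assumes `model` is the trained autoencoder and `target` is a normalized (3, H, W) tensor obtained in step S1.

```python
import torch

model.eval()
with torch.no_grad():
    # The autoencoder can only reproduce what it learned from tolerable
    # samples, so its output acts as a "defect-free" rendition of the target.
    reconstructed = model(target.unsqueeze(0)).squeeze(0)
```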

If the target object in the target image has a defect, then after the reconstructed image is generated in step S2 there is a reconstruction error between the reconstructed image and the target image. In step S3, the processor executes a first difference algorithm according to the target image and the reconstructed image to generate a first difference image; in step S4, the processor executes a second difference algorithm according to the target image and the reconstructed image to generate a second difference image. Steps S3 and S4 compute the reconstruction error at different comparison scales. They may be executed simultaneously or sequentially; the present invention does not restrict the order in which the processor executes steps S3 and S4.

Please refer to FIG. 2, which shows a detailed flowchart of step S3 in an embodiment of the present invention.

In step S31, a first feature map is generated according to the target image and a neural network model. In step S32, a second feature map is generated according to the reconstructed image and the same neural network model. The first and second feature maps each contain one or more feature patches, which represent the parts of the feature map that deserve attention. A feature patch is, for example, a rectangular patch of 64×64 pixels, although the present invention does not restrict the patch dimensions. Feature maps may also be called deep features.

In one embodiment, the neural network model used in steps S31 and S32 is SqueezeNet; in other embodiments, it may be AlexNet or ResNet. In one embodiment, the neural network model is pre-trained with images from a large visual database (for example, ImageNet) that are unrelated to the target object. During training, for each pixel of each image, a rectangular patch containing that pixel (for example, a 64×64-pixel patch) is extracted as training data. In another embodiment, the neural network model is first trained with images unrelated to the target object and then fine-tuned with images related to the target object, thereby improving the accuracy of feature extraction. The feature maps output by the trained neural network model in the feature extraction stage exhibit a feature-recognition strategy similar to human visual perception.

In step S33, the difference between the first feature map and the second feature map is computed as the first difference image; for example, the first difference image is the subtraction of the second feature map from the first. The first difference image is a perceptual attention map, which imitates the way humans compare image differences. Specifically, when comparing a reference image with a target image, humans rarely notice slight shifts or minute differences between the two images, but readily observe differences between their feature patches. The first difference algorithm of steps S31 to S33 computes a coarse-level reconstruction error from a patch perspective. (A sketch of this algorithm follows.)
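Below is a sketch of one way to realize steps S31 to S33 with an ImageNet-pretrained SqueezeNet from torchvision. The cut point of the network, the squared-difference aggregation over channels, and the upsampling back to image resolution are assumptions; the patent does not fix these details.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Feature extractor: the convolutional trunk of a pretrained SqueezeNet.
weights = models.SqueezeNet1_1_Weights.IMAGENET1K_V1
feature_extractor = models.squeezenet1_1(weights=weights).features.eval()

def perceptual_attention_map(target, reconstructed):
    """First difference algorithm: coarse, patch-level reconstruction error."""
    with torch.no_grad():
        f_target = feature_extractor(target.unsqueeze(0))        # first feature map
        f_recon = feature_extractor(reconstructed.unsqueeze(0))  # second feature map
        diff = (f_target - f_recon).pow(2).mean(dim=1, keepdim=True)
        # Upsample to image resolution so it can be fused pixel-wise in step S5.
        diff = F.interpolate(diff, size=target.shape[-2:], mode="bilinear",
                             align_corners=False)
    return diff.squeeze()
```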

Auto-encoders generally use an L2 loss function or the structural similarity index (SSIM) to compute the reconstruction error between the target image and the reconstructed image. However, these metrics are usually sensitive to slight global changes, so they do not work well when the comparison should focus on texture-pattern similarity rather than precise alignment. Even if the defects of the target object in the target image are not severe, a small displacement between the target image and the reconstructed image may inflate the reconstruction error unnecessarily. The present invention therefore adopts the first difference algorithm of steps S31 to S33 to emphasize the matching of high-level structures and feature representations. Overall, the first difference image produced by the first difference algorithm emphasizes the region of interest (ROI) and reduces background noise.

In step S4, a second difference algorithm is executed according to the target image and the reconstructed image to generate a second difference image. In the second difference algorithm, the processor computes a relative error for each pixel of the reconstructed image and the target image, for example the pixel-wise squared error or the pixel-wise absolute difference between the two images. In step S4, the processor thus locates the defects of the target object in the target image at the pixel level. (A sketch follows.)
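Step S4 is a direct per-pixel comparison; a minimal sketch is shown below. Averaging the error over the color channels is an assumption, since the patent does not specify how channels are combined.

```python
def pixel_difference_map(target, reconstructed, mode="square"):
    """Second difference algorithm: fine, pixel-level reconstruction error."""
    if mode == "square":
        err = (target - reconstructed).pow(2)  # pixel-wise squared error
    else:
        err = (target - reconstructed).abs()   # pixel-wise absolute difference
    return err.mean(dim=0)                     # average over color channels
```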

In step S5, a pixel-wise operation is performed on the first difference image and the second difference image to generate a first output image. In one embodiment, the pixel-wise operation is an element-wise multiplication. Specifically, for a given position in the first difference image and the same position in the second difference image, if the processor determines that the pixel values at both positions indicate a defect, the first output image retains the defect at that position. If only one of the two difference images indicates a defect at a position, the first output image does not retain the defect at that position.
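With both maps at the same resolution (as in the sketches above), step S5 reduces to an element-wise product, so a location survives only if both the coarse map and the fine map flag it:

```python
perceptual_map = perceptual_attention_map(target, reconstructed)
pixel_map = pixel_difference_map(target, reconstructed)
first_output = perceptual_map * pixel_map  # first output image (step S5)
```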

In one embodiment, after step S5 is completed, the processor may mark the defects in the first output image according to whether each pixel indicates a defect. In another embodiment, to further reduce false positives, the processor may continue with step S6 after step S5 to improve the precision of the marks.

In step S6, a multi-threshold generation procedure is performed on the first output image to produce a second output image and the marks, where the first threshold is greater than the second threshold. The first threshold is used to pick out pixels that are likely to be defects; the second threshold is used to extend those likely-defect pixels to their surrounding pixels.

Please refer to FIG. 3, which shows a flowchart of the multi-threshold generation procedure of step S6. In step S61, the processor binarizes the first output image with the first threshold to produce a third output image; in step S62, the processor binarizes the first output image with the second threshold to produce a fourth output image. Steps S61 and S62 process the first output image with different thresholds; they may be executed simultaneously or sequentially, and the present invention does not restrict their order. In one embodiment, the processor computes the mean A and the standard deviation S of the reconstruction errors between the reference images and their reconstructed images, sets the first threshold to A+4S, and sets the second threshold to A+S. (A sketch follows.)
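A sketch of the threshold computation and the two binarizations of steps S61 and S62. Here `reference_error_maps` is assumed to hold the fused error maps obtained by running the reference images through the same pipeline; the patent only specifies the A+4S and A+S formulas.

```python
import numpy as np

errors = np.concatenate([m.ravel() for m in reference_error_maps])
A, S = errors.mean(), errors.std()
t_first, t_second = A + 4 * S, A + S  # first threshold > second threshold

first_output_np = first_output.numpy()
third_output = first_output_np >= t_first    # step S61: high-threshold seeds
fourth_output = first_output_np >= t_second  # step S62: low-threshold extent
```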

In step S63, a defect block is selected from the third output image; in other words, the third output image, produced with the high threshold, captures the defective parts. In step S64, according to the position in the fourth output image corresponding to the defect block, the processor determines whether the pixels around that position are defective, so as to selectively expand the defect block. For example, if the center of the selected defect block in the third output image has coordinates (123, 45), the processor examines the pixels surrounding coordinate (123, 45) in the fourth output image, including the pixels at (122, 45), (124, 45), (123, 44), (123, 46), and so on, and determines whether these pixels of the fourth output image are defective. If so, the processor retains, in the second output image, both the defect block and the surrounding pixels that are also defective. In one embodiment, step S64 may use a flood-fill algorithm to produce a connected region containing the defect block. (A flood-fill sketch follows.)
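Step S64 behaves much like hysteresis thresholding: each high-threshold seed is grown through the low-threshold mask. A minimal breadth-first flood-fill sketch is shown below; 4-connectivity is an assumption.

```python
from collections import deque
import numpy as np

def expand_defects(third_output, fourth_output):
    """Grow high-threshold seeds (step S63) into the low-threshold mask (S64)."""
    h, w = third_output.shape
    second_output = third_output.copy()            # start from the seed pixels
    queue = deque(zip(*np.nonzero(third_output)))  # (y, x) seed coordinates
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and fourth_output[ny, nx] and not second_output[ny, nx]):
                second_output[ny, nx] = True  # defective neighbor: keep it
                queue.append((ny, nx))
    return second_output  # second output image: connected defect regions
```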

Based on the second output image produced in step S6, the processor determines which pixels belong to defects and then marks those defects. The multi-threshold generation procedure proposed in step S6 reduces false-positive image marks.

FIGS. 4 to 11 are examples of the images produced by performing the steps of FIGS. 1 to 3.

Please refer to FIG. 4, an example of the target image obtained after step S1. The target object in FIG. 4 is a printed circuit board with a circuit component; the component has three pins, and the middle pin is not correctly inserted into its socket on the board.

Please refer to FIG. 5, an example of the reconstructed image produced after step S2. As FIG. 5 shows, in the "defect-free" target object every pin of the circuit component is inserted into its socket.

Please refer to FIG. 6, an example of the first difference image produced after step S3. The lower half of the first difference image contains a white region that is more salient than the upper half.

Please refer to FIG. 7, an example of the second difference image produced after step S4. FIG. 7 presents the reconstruction error at the pixel scale, so it shows more of the detail corresponding to FIG. 5.

Please refer to FIG. 8, an example of the first output image produced after step S5. The contrast between the defective part of the circuit component and its surroundings is higher in FIG. 8 than in FIGS. 6 and 7.

Please refer to FIG. 9, an example of the fourth output image produced after step S62; it is the result of binarizing FIG. 8 with the second threshold.

Please refer to FIG. 10, an example of the third output image produced after step S61; it is the result of binarizing FIG. 8 with the first threshold. The location of the defect is evident in FIG. 10.

Please refer to FIG. 11, an example of the second output image and the mark produced after step S64. The mark is the box indicating the defect location in FIG. 11.

Please refer to FIG. 12, an example of manually marking the defects in an image according to an embodiment of the present invention. As FIG. 12 shows, the marks obtained with the present invention are very close to the ground truth.

In practice, after the process shown in FIG. 1 is performed, the resulting labeled images can be fed, for example, to a defect-detection model implemented with a region-based convolutional neural network (R-CNN) such as Fast R-CNN, Faster R-CNN, or Mask R-CNN, or with a one-stage detector such as YOLO (You Only Look Once) or SSD (Single Shot Detection).
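For example, the boxes produced in FIG. 11 could be converted into training targets for torchvision's reference Faster R-CNN implementation. This pairing is an illustration of the idea above, not a detector prescribed by the patent.

```python
import torchvision

# One possible downstream detector; the generated marks supply the boxes.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
# Fine-tuning then consumes (image, {"boxes": Tensor[N, 4], "labels": Tensor[N]})
# pairs derived from the automatically generated defect marks.
```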

In summary, the proposed method for labeling images is suitable for classifying or detecting raw images of computer products, and it reduces the need for large numbers of labeled images as training data. The proposed reconstruction model is not over-generalized to the point of treating certain defects as texture patterns of normal regions, so the method reduces the occurrence of false negatives. The method also mimics human perception by highlighting only anomalies while ignoring the complex background; this perceptual-attention-based approach effectively reduces false positives.

Although the present invention is disclosed by the foregoing embodiments, they are not intended to limit it. Changes and modifications made without departing from the spirit and scope of the present invention fall within its scope of patent protection. For the protection scope defined by the present invention, please refer to the appended claims.

S0~S6: steps; S31~S33: steps; S61~S64: steps

FIG. 1 is a flowchart of a method for labeling an image according to an embodiment of the present invention.
FIG. 2 is a detailed flowchart of step S3 in an embodiment of the present invention.
FIG. 3 is a flowchart of the multi-threshold generation procedure in an embodiment of the present invention.
FIG. 4 is an example of the target image in an embodiment of the present invention.
FIG. 5 is an example of the reconstructed image in an embodiment of the present invention.
FIG. 6 is an example of the first difference image in an embodiment of the present invention.
FIG. 7 is an example of the second difference image in an embodiment of the present invention.
FIG. 8 is an example of the first output image in an embodiment of the present invention.
FIG. 9 is an example of the fourth output image in an embodiment of the present invention.
FIG. 10 is an example of the third output image in an embodiment of the present invention.
FIG. 11 is an example of the second output image in an embodiment of the present invention.
FIG. 12 is an example of manually marking the defects in an image in an embodiment of the present invention.

S0~S6: steps

Claims (9)

1. A method for labeling an image, the method comprising: obtaining a target image of a target object; generating a reconstructed image according to the target image and a reconstruction model, wherein the reconstruction model is trained with a plurality of reference images and a machine learning algorithm, each of the reference images is an image of a reference object whose defect level falls within a tolerable range with an upper limit, and each reference object is associated with the target object; executing a first difference algorithm and a second difference algorithm according to the target image and the reconstructed image to generate a first difference image and a second difference image, respectively; and performing a pixel-wise operation according to the first difference image and the second difference image to generate an output image, wherein the output image includes a mark of a defect of the target object.

2. The method for labeling an image of claim 1, wherein the reconstruction model is an autoencoder.

3. The method for labeling an image of claim 1, wherein the first difference algorithm comprises: generating a first feature map according to the target image and a neural network model; generating a second feature map according to the reconstructed image and the neural network model; and computing a degree of difference between the first feature map and the second feature map, wherein the first difference image contains the degree of difference.

4. The method for labeling an image of claim 3, wherein the neural network model is SqueezeNet.

5. The method for labeling an image of claim 3, wherein the neural network model is trained with a plurality of images unrelated to the target object.

6. The method for labeling an image of claim 1, wherein the second difference algorithm comprises: computing a relative error value according to each pixel of the reconstructed image and the target image.

7. The method for labeling an image of claim 6, wherein the relative error value is a squared error or an absolute error.

8. The method for labeling an image of claim 1, wherein the pixel-wise operation is an element-wise multiplication.
9. The method for labeling an image of claim 1, wherein the output image is a first output image, and after the pixel-wise operation is performed according to the first difference image and the second difference image to generate the output image, the method further comprises: binarizing the first output image according to a first threshold and a second threshold to produce a third output image and a fourth output image, respectively, wherein the first threshold is greater than the second threshold; selecting a defect block in the third output image; and, according to a position in the fourth output image corresponding to the defect block, determining whether pixels around the position are defective so as to selectively expand the defect block.
TW109119845A 2020-06-12 2020-06-12 Method for labeling image TWI803756B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109119845A TWI803756B (en) 2020-06-12 2020-06-12 Method for labeling image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109119845A TWI803756B (en) 2020-06-12 2020-06-12 Method for labeling image

Publications (2)

Publication Number Publication Date
TW202147254A true TW202147254A (en) 2021-12-16
TWI803756B TWI803756B (en) 2023-06-01

Family

ID=80783787

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109119845A TWI803756B (en) 2020-06-12 2020-06-12 Method for labeling image

Country Status (1)

Country Link
TW (1) TWI803756B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101836096B1 (en) * 2016-12-02 2018-03-12 주식회사 수아랩 Method, apparatus and computer program stored in computer readable medium for state decision of image data
US10395362B2 (en) * 2017-04-07 2019-08-27 Kla-Tencor Corp. Contour based defect detection

Also Published As

Publication number Publication date
TWI803756B (en) 2023-06-01

Similar Documents

Publication Publication Date Title
CN107543828B (en) Workpiece surface defect detection method and system
CN110580723B (en) Method for carrying out accurate positioning by utilizing deep learning and computer vision
US11315229B2 (en) Method for training defect detector
CN111507976B (en) Defect detection method and system based on multi-angle imaging
CN111080622A (en) Neural network training method, workpiece surface defect classification and detection method and device
WO2017181724A1 (en) Inspection method and system for missing electronic component
CN105976354A (en) Color and gradient based component positioning method and system
CN107239742A (en) A kind of gauge pointer scale value calculating method
JP7467373B2 (en) Defect classification device, method and program
CN113822836B (en) Method for marking an image
CN105354816B (en) Electronic component positioning method and device
TW201512649A (en) Method of chip detects inspecting, system therefor, and computer program product thereof
CN110288040B (en) Image similarity judging method and device based on topology verification
US11978197B2 (en) Inspection method for inspecting an object and machine vision system
CN112686872A (en) Wood counting method based on deep learning
TW202147254A (en) Method for labeling image
CN115564779B (en) Part defect detection method, device and storage medium
TWI770529B (en) Method for training defect detector
JP5155938B2 (en) Pattern contour detection method
Mansoory et al. Edge defect detection in ceramic tile based on boundary analysis using fuzzy thresholding and radon transform
Sathiaseelan et al. Logo detection and localization for ic authentication, marking recognition, and counterfeit detection
CN110874837A (en) Automatic defect detection method based on local feature distribution
CN116030038B (en) Unsupervised OLED defect detection method based on defect generation
KR20080082278A (en) Method for recognizing character, and method for recognizing character formed on semiconductor device
Cybinski Semi-automated image data labelling using AprilTags as a pre-processing step for machine learning