TWI703511B - Screen crack detection system - Google Patents

Screen crack detection system

Info

Publication number
TWI703511B
TWI703511B
Authority
TW
Taiwan
Prior art keywords
screen
detection system
crack detection
model
network model
Prior art date
Application number
TW108124062A
Other languages
Chinese (zh)
Other versions
TW202103052A (en)
Inventor
趙式隆
林奕辰
沈昇勳
王彥稀
林哲賢
Original Assignee
洽吧智能股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 洽吧智能股份有限公司
Priority to TW108124062A
Application granted
Publication of TWI703511B
Publication of TW202103052A


Landscapes

  • Image Analysis (AREA)

Abstract

A screen crack detection system includes an input module, a feature extraction module, and a target frame selection module. The input module is configured to receive a screen image. The feature extraction module includes a first convolutional neural network model that converts the screen image into a first feature. The target frame selection module includes a region proposal network model that receives the first feature from the feature extraction module and identifies whether at least one cracked region exists in the screen image. If a cracked region exists, it is framed to form at least one target frame.

Description

Screen crack detection system

The present invention relates to a detection system, and more particularly to a screen crack detection system.

With the development of mobile phone technology, phones now offer networking, fingerprint recognition, face recognition, photography, and multimedia functions in addition to calling. These improvements have also driven up prices: a new phone now costs tens of thousands of New Taiwan dollars, and repair costs have risen sharply as well.

In response to climbing prices, insurers have introduced mobile phone insurance, spreading repair costs across policyholders. The payout is generally determined by the degree of damage to the phone, and screen damage accounts for the majority of claims.

At present, the degree of damage, and cosmetic damage in particular, is usually judged visually by a claims adjuster or repairer, and such judgments vary from person to person. There is also the risk that a repairer colludes with the phone's owner to overstate the damage and defraud the insurer of a larger payout.

Patent Publication No. TW201839666A discloses an image recognition method that assesses exterior damage to a car as a basis for insurance compensation. A car, however, is large: photographs of the damage typically show a wide expanse of body surface with little background interference, so recognition errors are rare. A mobile phone is small, and photographs of a damaged phone usually include background clutter and other extraneous content, which makes recognition considerably harder. The technique disclosed in TW201839666A therefore cannot be applied directly to identifying screen damage on a phone.

In short, how to use artificial intelligence to identify the degree of damage to a mobile phone screen effectively is a problem worth considering for those of ordinary skill in the art.

The present invention provides a screen crack detection system that includes an input module, a feature extraction module, and a target frame selection module. The input module is adapted to receive a screen image. The feature extraction module includes a first convolutional neural network model that converts the screen image into a first feature. The target frame selection module includes a region proposal network model that receives the first feature from the feature extraction module and identifies whether at least one cracked region exists in the screen image; if so, the cracked region is framed to form at least one target frame.

In the training phase, the screen crack detection system performs the following steps:

A10: search the Internet with multiple keywords and collect multiple first reference images;
A20: send the first reference images to a second convolutional neural network model;
A30: from the first reference images, have the second convolutional neural network model output multiple second features;
A40: group the first reference images with an unsupervised model according to the second features;
A50: enter the grouped first reference images that contain cracked regions into a training database; and
A90: train the screen crack detection system on the first reference images stored in the training database.

In the above system, the first convolutional neural network model and the second convolutional neural network model may each be a VGG, ResNet, DenseNet, or Inception model. The region proposal network model may be a faster-RCNN, YOLO, CTPN, or EAST model. The target frame selection module may further include a binary classifier and detect, through that classifier, whether the cracked region exists in the screen image.

The training phase may further include the following steps:

A60: select multiple second reference images whose screens contain no cracked region;
A70: simulate at least one cracked region on each second reference image; and
A80: enter the simulated second reference images into the training database;

in which case step A90 trains the screen crack detection system on both the first reference images and the second reference images stored in the training database.

In step A10, the keywords include "broken" or "cracked", and may further include "screen" or "mobile phone". The first feature is a two-dimensional matrix composed mainly of some of the hidden layers of the first convolutional neural network model. The second feature is a vector. In step A40, the unsupervised model runs a K-means, DBSCAN, or Expectation Maximization algorithm to group the first reference images.

To make the above features and advantages of the present invention more comprehensible, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
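Since the summary names faster-RCNN as one admissible region proposal network model, the following Python sketch shows how the detection-side modules could be wired around an off-the-shelf implementation. It is a minimal sketch, not the patentee's implementation: torchvision's fasterrcnn_resnet50_fpn, the two-class head (background and crack), and the 0.5 score threshold are all assumptions.

```python
# Illustrative sketch only: torchvision's Faster R-CNN stands in for the
# feature extraction module (backbone CNN) plus the target frame selection
# module (region proposal network and box head). Class index 1 ("crack")
# and the 0.5 score threshold are assumptions, not values from the patent.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_crack_detector(num_classes: int = 2):  # background + crack
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    # Swap the box predictor head for one sized to our two classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

detector = build_crack_detector()
detector.eval()
with torch.no_grad():
    screen_image = torch.rand(3, 480, 640)        # stand-in for a phone photo
    result = detector([screen_image])[0]          # dict of boxes, labels, scores
    target_frames = result["boxes"][result["scores"] > 0.5]
```

In this arrangement the backbone plays the feature extraction module and the bundled region proposal network plus box head plays the target frame selection module; fine-tuning would draw on the training database built in steps A10 through A90.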

The present invention can be better understood with reference to the detailed description set forth herein and the accompanying drawings. Various embodiments are discussed below with reference to those drawings. Those skilled in the art will readily appreciate, however, that the detailed description is given for explanatory purposes only, since the methods and systems may extend beyond the described embodiments: the teachings given here, together with the needs of a specific application, may yield many alternative and suitable ways to implement any function described. Any method may therefore extend beyond the particular implementation choices shown in the following embodiments.

In the specification and the claims that follow, certain words are used to refer to particular elements. Those of ordinary skill in the art will understand that different manufacturers may use different names for the same component. This specification and the claims do not distinguish elements by name but by function. The terms "comprise" and "include" are used throughout in an open-ended sense and should be read as "including but not limited to". The terms "coupled" and "connected" cover any direct or indirect electrical or communication connection, so a first device coupled to a second device may be electrically connected to it directly, or indirectly through other devices or connection means.

Please refer to FIG. 1, which illustrates an embodiment of the screen crack detection system of the present invention. The screen crack detection system 100 includes an input module 110, a feature extraction module 120, a target frame selection module 130, and an output module 140. The input module 110 is, for example, electrically connected to an image input device 40, which in this embodiment is a smartphone with a camera but could also be a digital camera. Through the image input device 40 and the input module 110, a captured image 10 (for example, the photo shown in FIG. 2A) is imported into the input module 110. The input module 110 then sends the image 10 to the feature extraction module 120, which includes a first convolutional neural network model 122 that converts the screen image 10 into a first feature 12 and passes it to the target frame selection module 130. The target frame selection module 130 includes a region proposal network model 132 that receives the first feature 12 from the feature extraction module 120 and identifies whether at least one cracked region exists in the screen image 10; if so, each cracked region is framed to form a target frame 14 (four target frames 14 in this embodiment). The output module 140 then outputs the screen image 10 together with the target frames 14, for example onto the user's display screen 60, so the user can see clearly how many cracked regions the screen image 10 contains. The output module 140 can also be connected to other computers for further processing of the output, such as registering the number of cracked regions in a database. In this embodiment, the input module 110, the feature extraction module 120, the target frame selection module 130, and the output module 140 reside on a server side composed of, for example, one or more servers.

In this embodiment, the first feature 12 is a two-dimensional matrix composed of several hidden layers of the first convolutional neural network model 122, as follows. A neural network model generally includes an input layer, multiple hidden layers, and an output layer; as shown in FIG. 3, the first convolutional neural network model 122 likewise includes an input layer 122a, multiple hidden layers 122b, and an output layer 122c. Each neuron (cell) in a hidden layer 122b is represented by a number, so a hidden layer 122b can be viewed as a vector, and several hidden layers 122b together form a two-dimensional matrix; this matrix is the first feature 12. In this embodiment, the first feature 12 is composed of the three hidden layers 122b closest to the output layer 122c, though those skilled in the art may adjust this as circumstances require.

The first convolutional neural network model 122 is, for example, a VGG, ResNet, DenseNet, or Inception model. The VGG model is the neural network model proposed by the Visual Geometry Group at Oxford, ResNet is short for deep residual network, and DenseNet is short for Dense Convolutional Network. The region proposal network model 132 may be a faster-RCNN, YOLO, CTPN, or EAST model: YOLO is short for "You Only Look Once", an object detection network; CTPN stands for Connectionist Text Proposal Network; and the EAST model is the network disclosed in the paper "An Efficient and Accurate Scene Text Detector".
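To make the stacked-hidden-layer construction of the first feature 12 concrete, here is a minimal sketch assuming a toy network whose last three hidden layers share one width; equal widths are an assumption needed for stacking, since the patent leaves the exact layer sizes open.

```python
# Minimal sketch of the "first feature": the activations of the last three
# hidden layers, each read as a vector and stacked into a 2-D matrix. A toy
# head with three equally wide hidden layers is assumed, because torch.stack
# requires matching vector lengths.
import torch
import torch.nn as nn

class ToyFeatureExtractor(nn.Module):
    def __init__(self, width: int = 512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),                  # -> (32, 4, 4)
        )
        self.hidden = nn.ModuleList([
            nn.Linear(32 * 4 * 4, width),
            nn.Linear(width, width),
            nn.Linear(width, width),
        ])

    def forward(self, x):
        h = self.conv(x).flatten(1)
        rows = []
        for layer in self.hidden:
            h = torch.relu(layer(h))
            rows.append(h)                # each hidden layer viewed as a vector
        return torch.stack(rows, dim=1)   # (batch, 3, width) 2-D matrix per image

first_feature = ToyFeatureExtractor()(torch.rand(1, 3, 64, 64))
print(first_feature.shape)               # torch.Size([1, 3, 512])
```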
It should be noted that the neural network models mentioned above are all open source, and those of ordinary skill in the art can modify them appropriately to suit the purpose at hand.

In this embodiment, the target frame selection module 130 preferably further includes a binary classifier 134, which runs an algorithm related to binary classification; it may itself be a neural network model, or it may run an algorithm such as a support vector machine. With the binary classifier 134, whether a cracked region exists in the screen image 10 can be determined and detected more accurately.

For a neural network model, the quality and quantity of the data supplied during the training phase are critical to its effectiveness. During training, the model adjusts the parameters (that is, the weights) of the connections between neurons according to the data supplied at each step, so even two models with identical architectures can end up quite different after training. Likewise, the same architecture can perform very differently depending on the quality and quantity of the data it is trained on. For the present case, the quality and quantity of the data are equally critical to the performance of the screen crack detection system 100. The following therefore describes how the relevant training data are collected for the training phase.

Please refer to FIGS. 4 and 5: FIG. 4 shows the models used in data collection, and FIG. 5 shows the data collection flow of the screen crack detection system during the training phase. First, in step S110, multiple keywords 20 are entered into a search engine 210 to search for and collect pictures from the Internet. The keywords 20 are, for example, "mobile phone screen" and "broken", and the search engine 210 is, for example, Google image search; those of ordinary skill in the art may also use other keywords, such as "cracked" or "mobile device". The pictures collected this way are called first reference images 22. Next, in step S120, the first reference images are fed into a second convolutional neural network model 220, for example a VGG, ResNet, DenseNet, or Inception model. In a preferred embodiment, the second convolutional neural network model 220 has been pretrained, for example on the ImageNet database. Being a convolutional neural network, it consists mainly of convolutional layers, used for feature extraction, and pooling layers, used to reduce the number of parameters and thereby avoid overfitting (neither layer type is drawn in the figures). Then, in step S130, the second convolutional neural network model 220 produces multiple second features 24 from the input first reference images 22, one second feature 24 per first reference image 22; these features feed the grouping step described next, as sketched below.
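As a sketch of steps S120 through S140 under stated assumptions, the snippet below uses an ImageNet-pretrained ResNet-50 in place of the second convolutional neural network model 220, takes its pooled output as the vector-valued second feature 24, and groups the images with K-means. The folder name and the cluster count of 8 are illustrative.

```python
# Sketch of steps S120-S140: embed each collected picture into a vector
# (the "second feature") with a pretrained CNN, then cluster the vectors.
# "collected/*.jpg" and n_clusters=8 are illustrative assumptions.
import glob
import torch
import torchvision
from torchvision import transforms
from PIL import Image
from sklearn.cluster import KMeans

backbone = torchvision.models.resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()   # drop the classifier; keep the 2048-d vector
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

def embed(paths):
    with torch.no_grad():
        batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
        return backbone(batch).numpy()   # one row per first reference image

paths = sorted(glob.glob("collected/*.jpg"))            # pictures from step S110
second_features = embed(paths)                          # step S130
groups = KMeans(n_clusters=8, random_state=0).fit_predict(second_features)  # S140
```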
Here each second feature 24 is, for example, a vector. In step S140, an unsupervised model 230 groups the second features 24. In this embodiment, the unsupervised model 230 runs an unsupervised clustering algorithm such as K-means, DBSCAN (density-based spatial clustering of applications with noise), or Expectation Maximization. Since each second feature 24 corresponds to one first reference image 22, grouping the second features 24 amounts to grouping the first reference images 22. Then, in step S150, the first reference images 22 are entered into a training database 240.

The reason for the grouping is as follows. The first reference images 22 collected via the keywords 20 in the search engine 210 are not necessarily all wanted: pictures found with "mobile phone screen" and "broken" may include intact screens or repair shops, so irrelevant pictures must be removed. Because so many pictures are collected, removing the irrelevant ones by hand alone would be very inefficient. In this embodiment, the unsupervised model 230 therefore groups the first reference images 22 so that similar pictures cluster together, which makes the irrelevant ones easy to remove. Extracting features with the second convolutional neural network model 220 first is necessary to make the unsupervised model 230 fast and efficient. In a preferred embodiment, this is supplemented manually: irrelevant pictures are identified within the groups and culled.

Once the training database 240 has been built, it can be used to train the screen crack detection system 100. Because the images in the database were found through a search engine, they are numerous and varied: besides many images of broken phone screens in general, the database covers a wide range of screen sizes, phone models, and crack patterns. With data of this quantity and variety, the screen crack detection system 100 trained on the training database 240 performs very well, accurately deciding whether a phone screen is broken and identifying the number and extent of the cracked regions.

To diversify the training database 240 further, the number of pictures in it can be increased by the flow shown in FIG. 6. First, in step S160, pictures whose screens contain no cracked region are selected (as shown in FIG. 7A); these are called second reference images 30. Then, in step S170, at least one cracked region 32 is simulated on each second reference image 30 (as shown in FIG. 7B). A wide variety of cracks can be simulated on the second reference images 30 with drawing software such as Photoshop or programmatically. Finally, in step S180, the simulated second reference images 30 are entered into the training database 240.
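As one programmatic way to carry out step S170 (the description also allows drawing software such as Photoshop), the sketch below draws a random jagged polyline onto an intact screen picture. The stroke color, width, step size, and file names are illustrative assumptions rather than the patent's prescription.

```python
# Sketch of steps S160-S180: simulate a crack-like polyline on a picture of
# an intact screen to produce a simulated second reference image. Parameters
# and file names are illustrative assumptions.
import random
from PIL import Image, ImageDraw

def simulate_crack(image: Image.Image, segments: int = 12) -> Image.Image:
    out = image.copy()
    draw = ImageDraw.Draw(out)
    x, y = random.randrange(out.width), random.randrange(out.height)
    for _ in range(segments):
        # Each segment continues the crack with a random step and jitter.
        nx = min(max(x + random.randint(-60, 60), 0), out.width - 1)
        ny = min(max(y + random.randint(-60, 60), 0), out.height - 1)
        draw.line((x, y, nx, ny), fill=(230, 230, 230), width=2)
        x, y = nx, ny
    return out

clean = Image.open("intact_screen.jpg").convert("RGB")   # second reference image
cracked = simulate_crack(clean)
cracked.save("simulated_crack.jpg")                      # goes into the database
```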
In summary, because the screen crack detection system 100 can accurately decide whether a phone screen is broken and identify the number and extent of the cracked regions, it lowers the cost of manual assessment and removes the worry that a repairer will overstate the damage to the phone. Moreover, the screen crack detection system 100 is not limited to phone screens: with different training data it can also be used to judge whether the screens of other devices are broken.

The description above is not intended to limit the scope of the patent rights claimed, which is to be determined by the appended claims and their equivalents. Any change or refinement made by those of ordinary skill in the art without departing from the spirit or scope of this patent is an equivalent change or design completed under the spirit disclosed here and shall fall within the scope of the following claims.

10: image
12: first feature
14: target frame
30: second reference image
32: cracked region
40: image input device
60: display screen
100: screen crack detection system
110: input module
120: feature extraction module
122: first convolutional neural network model
122a: input layer
122b: hidden layer
122c: output layer
130: target frame selection module
132: region proposal network model
134: binary classifier
140: output module
210: search engine
220: second convolutional neural network model
230: unsupervised model
240: training database
S110~S180: flowchart steps

FIG. 1 illustrates an embodiment of the screen crack detection system of the present invention.
FIG. 2A is a schematic diagram of an image.
FIG. 2B is a schematic diagram of target frames formed on the image.
FIG. 3 is a schematic diagram of the first convolutional neural network model.
FIG. 4 shows the models used in data collection.
FIG. 5 shows the data collection flow of the screen crack detection system of the present invention during the training phase.
FIG. 6 shows the flow for increasing the number of pictures in the training database.
FIG. 7A is a schematic diagram of a picture whose screen contains no cracked region.
FIG. 7B is a schematic diagram of a cracked region simulated on a second reference image.

10: image
12: first feature
40: image input device
100: screen crack detection system
110: input module
120: feature extraction module
122: first convolutional neural network model
130: target frame selection module
132: region proposal network model
134: binary classifier
140: output module
60: display screen

Claims (10)

1. A screen crack detection system, comprising: an input module adapted to receive a screen image; a feature extraction module including a first convolutional neural network model that converts the screen image into a first feature; and a target frame selection module including a region proposal network model that receives the first feature from the feature extraction module and identifies whether at least one cracked region exists in the screen image, and if the cracked region exists, frames the cracked region to form at least one target frame; wherein, in a training phase, the screen crack detection system performs the following steps: A10: searching the Internet with multiple keywords and collecting multiple first reference images; A20: sending the first reference images to a second convolutional neural network model; A30: outputting, by the second convolutional neural network model, multiple second features from the first reference images; A40: grouping the first reference images with an unsupervised model according to the second features; A50: entering the grouped first reference images that contain cracked regions into a training database; and A90: training the screen crack detection system on the first reference images stored in the training database.

2. The screen crack detection system of claim 1, wherein the first convolutional neural network model and the second convolutional neural network model are VGG, ResNet, DenseNet, or Inception models.

3. The screen crack detection system of claim 1, wherein the region proposal network model is a faster-RCNN, YOLO, CTPN, or EAST model.

4. The screen crack detection system of claim 1, wherein the target frame selection module further includes a binary classifier and detects, through the binary classifier, whether the cracked region exists in the screen image.

5. The screen crack detection system of claim 1, wherein in the training phase the system further performs the following steps: A60: selecting multiple second reference images whose screens contain no cracked region; A70: simulating at least one cracked region on each of the second reference images; and A80: entering the simulated second reference images into the training database; and wherein, in step A90, the screen crack detection system is trained on the first reference images and the second reference images stored in the training database.

6. The screen crack detection system of claim 1, wherein in step A10 the keywords include broken or cracked.

7. The screen crack detection system of claim 6, wherein in step A10 the keywords further include screen or mobile phone.

8. The screen crack detection system of claim 1, wherein the first feature is a two-dimensional matrix composed mainly of some of the hidden layers of the first convolutional neural network model.

9. The screen crack detection system of claim 1, wherein the second feature is a vector.

10. The screen crack detection system of claim 1, wherein in step A40 the unsupervised model runs a K-means, DBSCAN, or Expectation Maximization algorithm to group the first reference images.
TW108124062A 2019-07-09 2019-07-09 Screen crack detection system TWI703511B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW108124062A TWI703511B (en) 2019-07-09 2019-07-09 Screen crack detection system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW108124062A TWI703511B (en) 2019-07-09 2019-07-09 Screen crack detection system

Publications (2)

Publication Number Publication Date
TWI703511B true TWI703511B (en) 2020-09-01
TW202103052A TW202103052A (en) 2021-01-16

Family

ID=73644057

Family Applications (1)

Application Number Title Priority Date Filing Date
TW108124062A TWI703511B (en) 2019-07-09 2019-07-09 Screen crack detection system

Country Status (1)

Country Link
TW (1) TWI703511B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW588287B (en) * 2002-09-16 2004-05-21 Chunghwa Telecom Co Ltd Image type container damage detector
US20050222705A1 (en) * 2004-04-03 2005-10-06 Budd Gerald W Automatic detection system for broken tools in CNC machining centers using advanced machine vision techniques
US20100158347A1 (en) * 2008-12-18 2010-06-24 Beijing Boe Optoelectronics Technology Co., Ltd. Method for detecting the line broken fault of common electrode lines of lcd
CN105353902A (en) * 2015-09-29 2016-02-24 赵跃 Electronic equipment and broken screen detection circuit thereof, detection method therefor and detection apparatus thereof
CN106228548A (en) * 2016-07-18 2016-12-14 图麟信息科技(上海)有限公司 The detection method of a kind of screen slight crack and device
CN108053328A (en) * 2017-12-13 2018-05-18 广州市景心科技股份有限公司 A kind of calling number is to the detection method of broken screen danger business demand
TW201837786A (en) * 2017-04-11 2018-10-16 香港商阿里巴巴集團服務有限公司 Image-based vehicle maintenance plan
CN109285079A (en) * 2018-08-31 2019-01-29 阿里巴巴集团控股有限公司 Data processing method, device, client and the server of terminal screen insurance
JP2019066267A (en) * 2017-09-29 2019-04-25 清水建設株式会社 Crack detector, crack detection method, and computer program
TWM588826U (en) * 2019-07-09 2020-01-01 洽吧智能股份有限公司 Screen crack detection system

Also Published As

Publication number Publication date
TW202103052A (en) 2021-01-16


Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees