TWM578421U - Character image recognition system - Google Patents

Character image recognition system

Info

Publication number
TWM578421U
TWM578421U TW107217271U
Authority
TW
Taiwan
Prior art keywords
character
network model
neural network
type
training
Prior art date
Application number
TW107217271U
Other languages
Chinese (zh)
Inventor
趙式隆
林奕辰
沈昇勳
王彥稀
Original Assignee
洽吧智能股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 洽吧智能股份有限公司 filed Critical 洽吧智能股份有限公司
Priority to TW107217271U priority Critical patent/TWM578421U/en
Publication of TWM578421U publication Critical patent/TWM578421U/en


Landscapes

  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The character image recognition system of the present invention identifies characters in a document to be recognized, the document containing a plurality of character images. The system comprises a character-segmentation-region recognition module, a semantic analysis module, a character-segmentation-region labeling module, and an output module. The character-segmentation-region recognition module identifies at least one character segmentation region, each region containing at least one of the character images. The semantic analysis module, communicatively connected to the recognition module, converts the character images in each segmentation region into editable characters and proofreads them. The labeling module identifies the relative position of each segmentation region within the document and labels the region. The output module outputs the editable characters, the relative position of each segmentation region within the document, and the label corresponding to that region.

Description

Character image recognition system

The present invention relates to a recognition system, and more particularly to a character image recognition system.

At present, to reduce transcription errors and improve efficiency when entering paper diagnosis certificates and related documents, insurance companies use OCR (Optical Character Recognition) technology to automatically recognize the images in such documents and fill the corresponding input fields. However, today's OCR software usually must be paired with a designated image scanner to ensure recognition accuracy. Moreover, when the characters on a diagnosis certificate or related document are blurred or obscured by smudges, existing OCR technology cannot recognize them correctly, so manual verification and re-identification are still required. How to improve the accuracy with which OCR technology recognizes character images is therefore a question worth considering for those of ordinary skill in the art.

The character image recognition system of the present invention identifies characters in a document to be recognized, the document containing a plurality of character images. The system comprises a character-segmentation-region recognition module, a semantic analysis module, a character-segmentation-region labeling module, and an output module. The character-segmentation-region recognition module identifies at least one character segmentation region, each region containing at least one of the character images. The semantic analysis module, communicatively connected to the recognition module, converts the character images in each segmentation region into editable characters and proofreads them. The labeling module identifies the relative position of each segmentation region within the document and labels the region. The output module outputs the editable characters, the relative position of each segmentation region within the document, and the label corresponding to that region.

In the character image recognition system above, the system may further include a server and a client, the client having a display screen. The recognition module, the semantic analysis module, the labeling module, and the output module are disposed on the server, and the output module outputs the editable characters to the client for display on its screen.

To make the above features and advantages more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.

The present invention is best understood by reference to the detailed description and the accompanying drawings set forth herein. Various embodiments are discussed below with reference to the drawings. However, those skilled in the art will readily appreciate that the detailed descriptions given here with respect to the drawings are for explanatory purposes only, since the methods and systems may extend beyond the described embodiments. For example, the teachings given and the needs of a particular application may yield multiple alternative and suitable ways to implement the functionality of any detail described herein; any approach may therefore extend beyond the particular implementation choices of the embodiments described and illustrated below.

Certain terms are used throughout the description and the following claims to refer to particular elements. As those of ordinary skill in the art will appreciate, hardware manufacturers may refer to the same element by different names. This specification and the following claims do not distinguish elements by name, but by functional difference. The term "comprising" used throughout the specification and the following claims is an open-ended term and should be interpreted as "including but not limited to". The term "coupled" here encompasses any direct or indirect means of electrical connection; thus, if a first device is described as coupled to a second device, the first device may be electrically connected to the second device directly, or indirectly through other devices or connection means.

Please refer to FIG. 1, which illustrates an embodiment of the character image recognition system of the present invention. The character image recognition system 100 includes a character-segmentation-region recognition module 110, a semantic analysis module 120, a character-segmentation-region labeling module 130, and an output module 140. The system 100 is also electrically connected to an image input device 10, for example a scanner or a digital camera, through which a document to be recognized (FIG. 2A) is imported into the system 100. In this embodiment, the recognition module 110, the semantic analysis module 120, the labeling module 130, and the output module 140 are disposed on a server 102, which may consist of one or more server machines. The output module 140 is electrically connected to a client 104, which may be an electronic device having a display screen 104a, for example a personal computer, a notebook computer, or a smartphone.

Please also refer to FIG. 2A, which illustrates one embodiment of the document to be recognized; in this embodiment it is a diagnosis certificate. As FIG. 2A shows, the document contains a plurality of characters, and once the document's image has been captured by the image input device 10, those characters naturally exist only as images. That is, the characters on the document imported into the system 100 by the image input device 10 cannot be edited; they are hereinafter referred to as character images.

Please also refer to FIG. 3, which illustrates an embodiment of the character image recognition method of the present invention. First, step S210 imports the document to be recognized (FIG. 2A), as described above. Next, step S220 identifies the character segmentation regions 81 in the document. In FIG. 2B, the character segmentation regions 81 are the areas framed by dashed lines, identified for example by the recognition module 110; as FIG. 2B makes clear, each region 81 selects a character image on the document. In a preferred embodiment, step S220 is followed by step S222, which divides the character segmentation regions 81 into a required character set and a non-required character set. The required character set is the set of characters that must be output in subsequent processing, for example the regions marked by reference numeral 81a in FIG. 2B; the non-required character set is the set of characters that need not be output, for example the regions marked by reference numeral 81b. In FIG. 2B, for instance, the stamp 「以下空白」 ("blank below") is classified into the non-required character set, since it is likely irrelevant to later data processing.
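For illustration only, the required/non-required partition of step S222 can be sketched as follows. This is not the patent's implementation: the `Region` type, the field names, and the `REQUIRED_FIELDS` whitelist are hypothetical stand-ins for whatever the recognition module actually predicts.

```python
from dataclasses import dataclass

@dataclass
class Region:
    """A character segmentation region: a bounding box plus a field guess."""
    x: int       # top-left corner, relative to the document image
    y: int
    w: int
    h: int
    field: str   # field predicted for this region, e.g. "hospital_name"

# Hypothetical whitelist of fields that later processing needs (the "required
# character set"); everything else falls into the non-required character set.
REQUIRED_FIELDS = {"hospital_name", "patient_name", "diagnosis"}

def split_required(regions: list[Region]) -> tuple[list[Region], list[Region]]:
    required = [r for r in regions if r.field in REQUIRED_FIELDS]
    non_required = [r for r in regions if r.field not in REQUIRED_FIELDS]
    return required, non_required
```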
Thereafter, step S230 has the semantic analysis module 120 convert the character images in the character segmentation regions 81 into editable characters. That is, while the character images on the document imported by the image input device 10 cannot be edited, the semantic analysis module 120 can convert them into editable characters, for example by OCR (Optical Character Recognition) technology. If OCR alone is used, however, recognition errors may occur when a character image on the document is blurred or smudged. For example, suppose smudging turns the characters 「雙和醫院」 (Shuanghe Hospital) into the characters shown at the top of FIG. 4; OCR alone may then recognize them as 「雙利醫院」 (Shuangli Hospital), as shown in the middle of FIG. 4. In this embodiment, the semantic analysis module 120 therefore performs step S240 to correct the converted editable characters, for example correcting 「雙利醫院」 back to 「雙和醫院」, as shown at the bottom of FIG. 4.
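As a rough illustration of the idea behind step S240 — not the second neural network model the patent actually uses, which is described later — OCR output can be snapped to the closest entry in a vocabulary of known valid values. The vocabulary below is hypothetical.

```python
import difflib

# Hypothetical vocabulary of valid values for this field (hospital names).
KNOWN_HOSPITALS = ["雙和醫院", "臺大醫院", "榮民總醫院"]

def correct(ocr_text: str, vocabulary: list[str], cutoff: float = 0.6) -> str:
    """Return the closest known value, or the raw OCR text if nothing is close."""
    matches = difflib.get_close_matches(ocr_text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text

print(correct("雙利醫院", KNOWN_HOSPITALS))  # prints 雙和醫院
```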
Next, step S250 has the labeling module 130 identify the relative position of each character segmentation region 81 within the document and label the region. For example, in the document of FIG. 2B, the labeling module 130 can identify the relative position of the 「姓名」 (name) field within the document and give that region a label called 「姓名」.

Step S260 then has the output module 140 send the editable characters, the relative position of each character segmentation region 81 within the document, and the label corresponding to each region to the client 104, so that the characters of the document can be displayed on the client's display screen 104a. Notably, because the output module 140 also outputs the relative positions of the regions 81, the position of each character on the document is reproduced on the display screen 104a. In other words, if the document is in tabular form (as in FIG. 2A), then after processing by the character image recognition system 100 of the present invention, not only the editable characters but also their positions within the table can be presented — something ordinary OCR software cannot achieve.

Furthermore, because the labels of the character segmentation regions 81 are also output, the client 104 can process the received data further, for example compiling it into an Excel spreadsheet or having a database management system organize and classify the data in the document. If, by contrast, the data of a diagnosis certificate is captured with ordinary OCR software, the user must still map patients to diagnoses manually even when every character is recognized accurately. With the character image recognition system 100, the correspondence between patient and diagnosis is established automatically, reducing labor costs.

In summary, compared with ordinary OCR software, the character image recognition system 100 of the present invention corrects misrecognized characters, reproduces the relative positions of characters within the document, and labels the character segmentation regions 81 so as to facilitate subsequent data processing.
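The triple emitted by the output module — editable text, relative position, label — maps naturally onto a simple record per region. The JSON field names below are an assumption for illustration; the patent does not define a serialization format.

```python
import json

def output_record(text: str, x: int, y: int, w: int, h: int, label: str) -> dict:
    # One record per character segmentation region: the proofread editable
    # text, its relative position in the document, and the region's label.
    return {"text": text, "bbox": [x, y, w, h], "label": label}

records = [output_record("雙和醫院", 120, 80, 260, 40, "hospital_name")]
print(json.dumps(records, ensure_ascii=False))
```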
In the embodiment above, the recognition module 110 identifies the character segmentation regions 81 in the document by means of, for example, a first neural network model 112, which is trained by deep learning so as to improve its accuracy. Its training is described below; please refer to FIG. 5, a flowchart of the first neural network model in the training phase. First, step S310 acquires, for a given category of document to be recognized (for example the diagnosis certificate of FIG. 2A), a certain quantity N1 of image samples of the same category (elsewhere in this text N1 is also called the first quantity). Next, step S320 divides the collected image samples into a training set, a test set, and a validation set. Step S330 then uses the training set as training samples for a neural network model; once trained, that model becomes the prototype of the first neural network model 112.

Step S340 then uses the test set to verify the correctness of the prototype. If the pass rate is below a first preset threshold (for example 90%), step S350 is performed; otherwise step S360 is performed. Here "passing" means that the first neural network model 112 correctly identifies the character segmentation regions 81 and, in a preferred embodiment, further classifies them into the required and non-required character sets.

In step S350, a further quantity N2 of image samples of the same category (the second quantity) is acquired, and steps S320 through S340 are repeated. In step S360, the validation set is used to verify the correctness of the model 112; if the pass rate is below a second preset threshold (for example 95%), step S370 is performed, otherwise step S380. In step S370, a further quantity N3 of image samples of the same category (the third quantity) is acquired, and step S360 is repeated. In step S380, the training of the first neural network model 112 is complete; that is, the model can be put into practical use.
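The gated loop of steps S320–S380 translates almost directly into code. In this minimal sketch, `train`, `evaluate`, and `acquire_more` are placeholders for the actual deep-learning stack and data collection, the split ratios are an assumption, and — as noted in the comment — the sketch re-splits and retrains after step S370, where FIG. 5 strictly loops back only to S360.

```python
import random

def three_way_split(samples, ratios=(0.7, 0.15, 0.15)):
    """S320: shuffle and divide samples into training, test, and validation sets.
    The 70/15/15 ratios are an assumption; the patent does not specify them."""
    shuffled = random.sample(samples, len(samples))
    n_train = int(len(shuffled) * ratios[0])
    n_test = int(len(shuffled) * ratios[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

def train_with_gates(samples, acquire_more, train, evaluate,
                     test_threshold=0.90, val_threshold=0.95):
    """Threshold-gated training loop in the spirit of FIG. 5 (S320-S380)."""
    while True:
        train_set, test_set, val_set = three_way_split(samples)  # S320
        model = train(train_set)                                  # S330
        if evaluate(model, test_set) < test_threshold:            # S340
            samples = samples + acquire_more()                    # S350
            continue                                              # back to S320
        if evaluate(model, val_set) < val_threshold:              # S360
            samples = samples + acquire_more()                    # S370
            continue  # simplification: FIG. 5 repeats only S360 here
        return model                                              # S380
```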
In the embodiment above, the semantic analysis module 120 corrects the editable characters by means of, for example, a second neural network model 122 — correcting 「雙利醫院」 to 「雙和醫院」 as in FIG. 4. The second neural network model 122 is likewise trained by deep learning to improve its accuracy; please refer to FIG. 6, a flowchart of the second neural network model in the training phase. First, step S410 acquires, for a given character segmentation region 81 in the document, a quantity N4 of editable character samples of that region (the fourth quantity). For example, the regions marked by reference numeral 81a in FIG. 2B contain hospital names, so step S410 inputs N4 hospital names. Next, step S420 divides the collected editable character samples into a training set, a test set, and a validation set. Step S430 uses the training set to train a neural network model, whose trained form is the prototype of the second neural network model 122.

Step S440 then uses the test set to verify the correctness of the prototype. If the pass rate is below a first preset threshold (for example 90%), step S450 is performed; otherwise step S460. Here "passing" means that the second neural network model 122 correctly corrects errors in the editable characters. In step S450, a further quantity N5 of editable character samples of the same category (the fifth quantity) is acquired, and steps S420 through S440 are repeated. In step S460, the validation set verifies the correctness of the model 122; if the pass rate is below a second preset threshold (for example 95%), step S470 is performed, otherwise step S480. In step S470, a further quantity N6 of editable character samples of the same category (the sixth quantity) is acquired, and step S460 is repeated. In step S480, the training of the second neural network model 122 is complete; that is, the model can be put into practical use.
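Because FIG. 6 has the same gate structure as FIG. 5, the `train_with_gates` sketch above can be reused for the second model unchanged; only the sample type differs. The toy stubs below (a "model" that simply memorizes a vocabulary) are stand-ins so the example runs; a real system would train the second neural network model 122 here.

```python
# Assumes three_way_split and train_with_gates from the FIG. 5 sketch above.
hospital_names = ["雙和醫院", "臺大醫院", "榮民總醫院"] * 10  # stands in for N4 samples

def fit_stub(train_set):
    return set(train_set)  # toy "model": a remembered vocabulary

def pass_rate_stub(model, samples):
    return sum(s in model for s in samples) / max(len(samples), 1)

model_122 = train_with_gates(
    samples=hospital_names,
    acquire_more=lambda: ["雙和醫院"],  # S450/S470: collect more samples
    train=fit_stub,                      # S430
    evaluate=pass_rate_stub,             # S440/S460
)
```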
In the embodiment above, the labeling module 130 identifies the relative position of each character segmentation region 81 within the document and labels the region by means of, for example, a third neural network model 132, likewise trained by deep learning to improve its accuracy; please refer to FIG. 7, a flowchart of the third neural network model in the training phase. First, step S510 acquires, for the character segmentation regions 81 in the document, a quantity N7 of samples (the seventh quantity), each consisting of a region's label together with its relative position in the document. The labels of the regions 81 are, for example, annotated manually before being fed to a neural network model: several data annotators may be engaged to mark where in the document the hospital name appears, where the patient's name appears, and so on. For details of the annotation process, reference may be made to another patent application by the present applicant (application number 107140893).
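One plausible shape for a single annotated sample of step S510 — a region's label paired with its relative position — is sketched below. The field names and the normalized-coordinate convention are assumptions; the patent does not fix a record format.

```python
# One manually annotated sample for the third model (step S510). Positions
# are expressed as fractions of page width/height so that scans at different
# resolutions stay comparable -- an assumed convention, not the patent's.
label_sample = {
    "label": "hospital_name",
    "rel_position": {"x": 0.12, "y": 0.08, "w": 0.30, "h": 0.05},
}
```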
Next, step S520 divides the collected label samples of the character segmentation regions 81 into a training set, a test set, and a validation set. Step S530 uses the training set to train a neural network model, whose trained form is the prototype of the third neural network model 132. Step S540 then uses the test set to verify the correctness of the prototype; if the pass rate is below a first preset threshold (for example 90%), step S550 is performed, otherwise step S560. Here "passing" means that the third neural network model 132 correctly identifies the relative position of each region 81 in the document and labels the region. In step S550, a further quantity N8 of label samples of the same category (the eighth quantity) is acquired, and steps S520 through S540 are repeated. In step S560, the validation set verifies the correctness of the model 132; if the pass rate is below a second preset threshold (for example 95%), step S570 is performed, otherwise step S580. In step S570, a further quantity N9 of label samples of the same category (the ninth quantity) is acquired, and step S560 is repeated. In step S580, the training of the third neural network model 132 is complete; that is, the model can be put into practical use.

In the embodiments above, the first neural network model 112, the second neural network model 122, and the third neural network model 132 are all trained by splitting the samples into a training set, a test set, and a validation set: the model is first trained on the training set, then tested on the test set, and, if it passes, verified on the validation set. Compared with the conventional practice of splitting samples into only a training set and a test set, the models 112, 122, and 132 of this application can achieve a higher accuracy rate. Moreover, each of the models 112, 122, and 132 may be a Recurrent Neural Network, a Long Short-Term Memory model, or a Convolutional Neural Network; note that these are merely embodiments of the present invention and not limitations upon it.
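The patent names recurrent, long short-term memory, and convolutional networks as admissible model families without prescribing an architecture. Purely to illustrate the convolutional option, a toy classifier might look like the following; the choice of PyTorch and the specific layers are illustrative assumptions, not the patent's design.

```python
import torch.nn as nn

class RegionClassifier(nn.Module):
    """Toy convolutional classifier for grayscale region crops."""
    def __init__(self, n_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # pool to a 32-dim feature vector
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):  # x: (batch, 1, H, W)
        return self.head(self.features(x).flatten(1))
```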
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Anyone of ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the present invention; the scope of protection of the present invention is therefore defined by the appended claims.

10‧‧‧Image input device
100‧‧‧Character image recognition system
110‧‧‧Character-segmentation-region recognition module
112‧‧‧First neural network model
120‧‧‧Semantic analysis module
122‧‧‧Second neural network model
130‧‧‧Character-segmentation-region labeling module
132‧‧‧Third neural network model
140‧‧‧Output module
102‧‧‧Server
104‧‧‧Client
104a‧‧‧Display screen
81‧‧‧Character segmentation region
81a‧‧‧Reference numeral (regions of the required character set)
81b‧‧‧Reference numeral (regions of the non-required character set)
S210~S260‧‧‧Steps of the character image recognition method
S310~S380‧‧‧Steps of the first neural network model in the training phase
S410~S480‧‧‧Steps of the second neural network model in the training phase
S510~S580‧‧‧Steps of the third neural network model in the training phase

Various embodiments are described below with reference to the accompanying drawings, which are provided for illustration and are not intended to limit the scope in any way; like reference numerals denote like components.

FIG. 1 illustrates an embodiment of the character image recognition system of the present invention.
FIG. 2A and FIG. 2B illustrate one embodiment of the document to be recognized.
FIG. 3 illustrates an embodiment of the character image recognition method of the present invention.
FIG. 4 illustrates an embodiment of the correction process performed by the semantic analysis module.
FIG. 5 is a flowchart of the first neural network model in the training phase.
FIG. 6 is a flowchart of the second neural network model in the training phase.
FIG. 7 is a flowchart of the third neural network model in the training phase.

Claims (8)

1. A character image recognition system for recognizing characters in a document to be recognized, the document comprising a plurality of character images, the character image recognition system comprising:
a server, comprising:
a character-segmentation-region recognition module that identifies at least one character segmentation region, the character segmentation region comprising at least one of the character images;
a semantic analysis module, communicatively connected to the character-segmentation-region recognition module, that converts the character images in the character segmentation region into editable characters and proofreads the editable characters;
a character-segmentation-region labeling module that identifies the relative position of the character segmentation region within the document and labels the character segmentation region; and
an output module that outputs the editable characters, the relative position of the character segmentation region within the document, and the label corresponding to the character segmentation region; and
a client having a display screen;
wherein the output module outputs the editable characters to the client for display on the display screen.

2. The character image recognition system of claim 1, wherein the character-segmentation-region recognition module classifies the character segmentation region into a required character set and a non-required character set.

3. The character image recognition system of claim 1 or claim 2, wherein the character-segmentation-region recognition module identifies the character segmentation region by means of a first neural network model.

4. The character image recognition system of claim 3, wherein the first neural network model is trained in the training phase by the following steps:
(b1) for the document to be recognized, acquiring a first quantity of image samples of the same category;
(b2) dividing the image samples into a training set, a test set, and a validation set;
(b3) using the training set as training samples to generate a prototype of the first neural network model;
(b4) using the test set to verify the correctness of the prototype of the first neural network model; if the pass rate on the test set is less than a first preset threshold, proceeding to step (b5), and if it is not less than the first preset threshold, proceeding to step (b6);
(b5) acquiring a second quantity of image samples of the same category and repeating steps (b2) through (b4);
(b6) using the validation set to verify the correctness of the first neural network model; if the pass rate on the validation set is less than a second preset threshold, proceeding to step (b7), and if it is not less than the second preset threshold, proceeding to step (b8);
(b7) acquiring a third quantity of image samples of the same category and repeating step (b6); and
(b8) completing the training of the first neural network model.

5. The character image recognition system of claim 1, wherein the semantic analysis module proofreads the editable characters by means of a second neural network model.

6. The character image recognition system of claim 5, wherein the second neural network model is trained in the training phase by the following steps:
(d1) for the character segmentation region in the document to be recognized, acquiring a fourth quantity of editable character samples of that region;
(d2) dividing the editable character samples into a training set, a test set, and a validation set;
(d3) using the training set as training samples to generate a prototype of the second neural network model;
(d4) using the test set to verify the correctness of the second neural network model; if the pass rate on the test set is less than a fourth preset threshold, proceeding to step (d5), and if it is not less than the fourth preset threshold, proceeding to step (d6);
(d5) acquiring a fifth quantity of editable character samples of the same category and repeating steps (d2) through (d4);
(d6) using the validation set to verify the correctness of the second neural network model; if the pass rate on the validation set is less than a fifth preset threshold, proceeding to step (d7), and if it is not less than the fifth preset threshold, proceeding to step (d8);
(d7) acquiring a sixth quantity of editable character samples of the same category and repeating steps (d2) through (d6); and
(d8) completing the training of the second neural network model.

7. The character image recognition system of claim 1, wherein the character-segmentation-region labeling module identifies the relative position of the character segmentation region within the document and labels the region by means of a third neural network model.

8. The character image recognition system of claim 7, wherein the third neural network model is trained in the training phase by the following steps:
(e1) for the character segmentation region in the document to be recognized, acquiring a seventh quantity of samples, each comprising a label of the character segmentation region and its relative position in the document;
(e2) dividing the label samples into a training set, a test set, and a validation set;
(e3) using the training set as training samples to generate a prototype of the third neural network model;
(e4) using the test set to verify the correctness of the third neural network model; if the pass rate on the test set is less than a seventh preset threshold, proceeding to step (e5), and if it is not less than the seventh preset threshold, proceeding to step (e6);
(e5) acquiring an eighth quantity of label samples of the same category and repeating steps (e2) through (e4);
(e6) using the validation set to verify the correctness of the third neural network model; if the pass rate on the validation set is less than an eighth preset threshold, proceeding to step (e7), and if it is not less than the eighth preset threshold, proceeding to step (e8);
(e7) acquiring a ninth quantity of label samples of the same category and repeating steps (e2) through (e6); and
(e8) completing the training of the third neural network model.
TW107217271U 2018-12-19 2018-12-19 Character image recognition system TWM578421U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW107217271U TWM578421U (en) 2018-12-19 2018-12-19 Character image recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107217271U TWM578421U (en) 2018-12-19 2018-12-19 Character image recognition system

Publications (1)

Publication Number Publication Date
TWM578421U true TWM578421U (en) 2019-05-21

Family

ID=67353076

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107217271U TWM578421U (en) 2018-12-19 2018-12-19 Character image recognition system

Country Status (1)

Country Link
TW (1) TWM578421U (en)

Similar Documents

Publication Publication Date Title
TWI703508B (en) Recognition method and system for character image
US8401301B2 (en) Property record document data verification systems and methods
CN110751143A (en) Electronic invoice information extraction method and electronic equipment
WO2018188199A1 (en) Method and device for identifying characters of claim settlement bill, server and storage medium
US8064703B2 (en) Property record document data validation systems and methods
US20170098192A1 (en) Content aware contract importation
WO2017041365A1 (en) Method and device for processing image information
US20070237427A1 (en) Method and system for simplified recordkeeping including transcription and voting based verification
US10528807B2 (en) System and method for processing and identifying content in form documents
CN113610068B (en) Test question disassembling method, system, storage medium and equipment based on test paper image
Thammarak et al. Automated data digitization system for vehicle registration certificates using google cloud vision API
CN109147002B (en) Image processing method and device
CN117195319A (en) Verification method and device for electronic part of file, electronic equipment and medium
TWM578421U (en) Character image recognition system
US20070217691A1 (en) Property record document title determination systems and methods
CN112396057A (en) Character recognition method and device and electronic equipment
CN116384344A (en) Document conversion method, device and storage medium
KR100957508B1 (en) System and method for recognizing optical characters
CN113705560A (en) Data extraction method, device and equipment based on image recognition and storage medium
CN113657373A (en) Automatic document cataloguing method
TWM607472U (en) Text section labeling system
TWI450203B (en) Character recognition translation system for picture and method thereof
US20240233430A9 (en) System to extract checkbox symbol and checkbox option pertaining to checkbox question from a document
CN114817163A (en) Exercise classification entry method and system and electronic equipment
TWI293737B (en)