TWI747450B - Character recognition method, electric device and computer program product - Google Patents
Character recognition method, electric device and computer program product
- Publication number
- TWI747450B (application TW109128285A)
- Authority
- TW
- Taiwan
- Prior art keywords
- dot matrix
- image
- character
- preset
- images
Landscapes
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
Description
This disclosure relates to a character recognition method, and in particular to a method that first converts an image into a dot matrix image and then performs recognition.
In many industries, item identification is one of the most critical tasks in manufacturing, warehousing, and logistics operations. An item's identification code can be formed on the item by spray painting, printing, or similar means, and is then read by a character recognition algorithm. Some conventional techniques perform character recognition with machine learning algorithms, and these methods face several problems, such as the need for a large number of training samples, the need for image pre-processing, and incomplete or disturbed images. How to solve these problems is a topic of concern to those skilled in the art.
An embodiment of the present invention provides a character recognition method suitable for an electronic device. The character recognition method includes: executing an object detection algorithm to detect the position of a character in an image; inputting the character to an image mapping neural network to output a dot matrix image, wherein the dot matrix image includes a plurality of binarized values; and recognizing the dot matrix image as one of a plurality of character classes.
In some embodiments, the image mapping neural network is a multilayer perceptron (MLP) or a convolutional neural network.
In some embodiments, the step of recognizing the dot matrix image as one of the character classes includes: obtaining a plurality of preset dot matrix images, which respectively correspond to the character classes; and calculating the difference or similarity between the dot matrix image and each preset dot matrix image to recognize the character as the corresponding character class.
In some embodiments, the step of obtaining the preset dot matrix images includes: obtaining a plurality of standard dot matrix images, and changing the distribution of the binarized values in the standard dot matrix images to generate the preset dot matrix images, wherein the total average difference of the preset dot matrix images is greater than the total average difference of the standard dot matrix images.
In some embodiments, the step of recognizing the dot matrix image as one of the character classes includes: inputting the dot matrix image to a neural network to recognize the dot matrix image as one of the character classes.
From another perspective, an embodiment of the present invention provides an electronic device including a memory and a processor. The memory stores a plurality of instructions, and the processor executes the instructions to perform a plurality of steps: executing an object detection algorithm to detect the position of a character in an image; inputting the character to an image mapping neural network to output a dot matrix image, wherein the dot matrix image includes a plurality of binarized values; and recognizing the dot matrix image as one of a plurality of character classes.
From another perspective, an embodiment of the present invention provides a computer program product that is loaded and executed by a computer system to perform a plurality of steps: executing an object detection algorithm to detect the position of a character in an image; inputting the character to an image mapping neural network to output a dot matrix image, wherein the dot matrix image includes a plurality of binarized values; and recognizing the dot matrix image as one of a plurality of character classes.
In the above character recognition method, the image mapping neural network can be trained and applied independently and does not need many training samples.
To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
The terms "first", "second", and so on used herein do not denote any particular order or sequence; they only distinguish elements or operations described with the same technical term.
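FIG. 1 is a schematic diagram of an electronic device according to an embodiment. Referring to FIG. 1, the electronic device 100 may be a smartphone, a tablet computer, a personal computer, a notebook computer, a server, an industrial computer, or any electronic device with computing capability; the invention is not limited in this respect. The electronic device 100 includes a processor 110 and a memory 120, and the processor 110 is communicatively connected to the memory 120. The processor 110 may be a central processing unit, a microprocessor, a microcontroller, an image processing chip, an application-specific integrated circuit, or the like. The memory 120 may be a random access memory, a read-only memory, a flash memory, a floppy disk, a hard disk, an optical disc, a USB flash drive, a magnetic tape, or a database accessible over the Internet, and it stores a plurality of instructions. The processor 110 executes these instructions to carry out a character recognition method, which is described in detail below.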
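FIG. 2 is a flowchart of a character recognition method according to an embodiment. Referring to FIG. 2, the goal is to recognize the characters in an image 210. In this embodiment the image 210 shows a steel coil with a printed identification code, but in other embodiments the image to be recognized may show any other item, and the characters to be recognized may be formed on the item by engraving, printing, spray painting, or similar means. The invention does not limit the content of the image 210.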
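In step 220, an object detection algorithm is executed to detect the position of at least one character in the image 210. In some embodiments, the object detection algorithm may be a convolutional neural network, a support vector machine, or the like; the invention is not limited in this respect. In some embodiments, the convolutional neural network is YOLO (you only look once), in any version; the network has been trained in advance and can detect character positions. After the character positions are detected, the characters can be cropped out. In this embodiment the detected characters are "52619103", and cropping the character "6" yields the image 230. FIG. 2 draws a bounding box for each character to show its position clearly, but the image 230 does not need to include the bounding box in subsequent processing.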
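The cropping part of step 220 can be sketched as follows. This is a minimal illustration in plain Python lists, not the patent's implementation: the function name `crop_characters`, the `(x_min, y_min, x_max, y_max)` box format, and the left-to-right ordering (which assumes the code is printed on a single horizontal line) are all assumptions, and the detector that produces the boxes is out of scope here.

```python
def crop_characters(image, boxes):
    """Cut each detected character out of a grayscale image, left to right.

    image: list of rows, each a list of pixel values.
    boxes: (x_min, y_min, x_max, y_max) tuples in pixel coordinates; this
           box format is an assumption, not something the text specifies.
    """
    height, width = len(image), len(image[0])
    crops = []
    for x_min, y_min, x_max, y_max in sorted(boxes, key=lambda b: b[0]):
        # Clamp to the image so a box that spills past the border still crops.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 > x0 and y1 > y0:
            crops.append([row[x0:x1] for row in image[y0:y1]])
    return crops


# Toy 20x60 "image" whose pixel value encodes its own position.
image = [[y * 60 + x for x in range(60)] for y in range(20)]
crops = crop_characters(image, [(30, 2, 40, 18), (5, 2, 15, 18)])
print(len(crops), len(crops[0]), len(crops[0][0]))  # 2 16 10
```

Only the pixels inside each box are kept, so the drawn bounding box itself never reaches the later steps.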
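In step 240, each detected character is input to an image mapping neural network to output a dot matrix image. The dot matrix image consists of binarized values arranged in a matrix, where, for example, "1" represents black and "0" represents white. In this example eight characters, "52619103", are detected and each is input to the image mapping neural network, so eight dot matrix images are produced; for instance, the dot matrix image 250 contains the binarized values representing the character "6". In some embodiments the dot matrix image has a width of 5 and a height of 7, that is, each character consists of binarized values in a 5x7 arrangement, but the invention does not limit the width and height of the dot matrix image, and in other embodiments the image mapping neural network may output larger or smaller dot matrix images. The image mapping neural network is, for example, a convolutional neural network or a multilayer perceptron (MLP); the invention is not limited in this respect. Taking the multilayer perceptron as an example, its input layer receives every pixel of the image 230 and its output layer contains 35 neurons, which respectively output 35 binarized values that together form the dot matrix image of one character. In some embodiments, the multilayer perceptron contains only two layers, the input layer and the output layer. Compared with the conventional approach of using YOLO to recognize characters, using the multilayer perceptron here reduces the number of training samples required; that is, a high accuracy can be reached even with very few training samples.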
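A forward pass through the two-layer perceptron described above might look like the sketch below. The function name `map_to_dot_matrix`, the weight layout, the sigmoid activation, and the 0.5 threshold are assumptions made for illustration; the text only states that 35 output neurons produce 35 binarized values.

```python
DOT_W, DOT_H = 5, 7  # dot matrix width and height used in the embodiment


def map_to_dot_matrix(pixels, weights, biases):
    """Two-layer MLP forward pass: input pixels -> 35 binarized values.

    pixels:  flat list of input values (the cropped character, resized to
             a fixed size beforehand).
    weights: one list of per-pixel weights for each output neuron
             (hypothetical, assumed to come from prior training).
    biases:  one bias per output neuron.
    """
    bits = []
    for neuron_weights, bias in zip(weights, biases):
        logit = bias + sum(w * p for w, p in zip(neuron_weights, pixels))
        # sigmoid(logit) >= 0.5 exactly when logit >= 0, so thresholding the
        # logit gives the same bit without evaluating the exponential.
        bits.append(1 if logit >= 0 else 0)
    # Reshape the 35 outputs into 7 rows of 5 values.
    return [bits[r * DOT_W:(r + 1) * DOT_W] for r in range(DOT_H)]


# Hypothetical weights for a 4-pixel input: even neurons fire on pixel 0.
weights = [[1.0, 0, 0, 0] if k % 2 == 0 else [-1.0, 0, 0, 0] for k in range(35)]
biases = [-0.5] * 35
for row in map_to_dot_matrix([1.0, 0, 0, 0], weights, biases):
    print(row)
```

Because there is no hidden layer, the network has only one weight per input pixel and output cell, which is in line with the text's point that few training samples are needed.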
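In some embodiments, the image 230 (without the bounding box) may be input directly to the image mapping neural network to output the dot matrix image 250. In some embodiments, the feature map corresponding to the character may be taken from YOLO and input to the image mapping neural network. In some embodiments, the image or feature map input to the image mapping neural network may undergo operations such as upsampling, downsampling, or a change of aspect ratio; the invention is not limited in this respect.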
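Since a perceptron needs a fixed number of inputs, the resampling mentioned above is what brings every crop to the same size. The sketch below uses nearest-neighbor sampling, which is one possible choice and not one named in the text; `resize_nearest` is a hypothetical helper.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of pixels to out_h x out_w by nearest-neighbor sampling.

    Works for upsampling, downsampling, and aspect-ratio changes alike.
    """
    in_h, in_w = len(image), len(image[0])
    rows = [(r * in_h) // out_h for r in range(out_h)]
    cols = [(c * in_w) // out_w for c in range(out_w)]
    return [[image[r][c] for c in cols] for r in rows]


small = [[1, 2],
         [3, 4]]
print(resize_nearest(small, 4, 4))  # upsample: each pixel becomes a 2x2 block
print(resize_nearest(small, 2, 4))  # change aspect ratio: widen only
```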
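In step 260, the dot matrix image is recognized as one of a plurality of character classes, which include, for example, the 10 classes "0" to "9". In some embodiments, the dot matrix image 250 may be input to a neural network to recognize the character class; the invention does not limit the type or architecture of this neural network. In some embodiments, a plurality of preset dot matrix images corresponding to all the character classes may be obtained in advance, and the difference or similarity between the dot matrix image 250 and each preset dot matrix image is then calculated to recognize the character class. The preset dot matrix images are described below.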
Referring to FIG. 2 and FIG. 3, FIG. 3 shows the 5x7 dot matrix images 310 (also called standard dot matrix images) commonly used in general electronic instruments or computers, which represent the characters "0" to "9" respectively. The differences between the standard dot matrix images can be analyzed in detail, with the difference calculated by mathematical formula (1).

[Math 1]

The quantities in mathematical formula (1) are: the difference between the i-th character and the j-th character; N, the number of character classes; the binarized value at coordinate (x, y) in the i-th character; the binarized value at coordinate (x, y) in the j-th character, where, for example, "1" represents black and "0" represents white; the width of the dot matrix image; and the height of the dot matrix image. The results are listed in table 320 of FIG. 3. For example, the difference between the character "0" and the character "1" is 4.2, the difference between the character "0" and the character "2" is 3.5, and so on. Averaging all the differences that involve the character "0" gives the average difference in row 12: the average difference of the character "0" is 3.36, that of the character "1" is 3.92, and so on. Table 320 shows that the average differences of the characters "0", "5", "6", "8", and "9" are relatively small, which means these characters are harder to tell apart. The last row of table 320 is the total average difference, obtained by averaging the average differences of all the characters; in this example, the total average difference of the standard dot matrix images 310 is 3.52.
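The table-building procedure described above (pairwise differences, a per-character average, then a total average) can be sketched as follows. Formula (1) itself is not reproduced in this text, so the Euclidean distance used here is an assumption; the rounded values quoted above (4.2, 3.5, 4) would be consistent with the square roots of 18, 12, and 16 differing cells, but that is an inference. Averaging over the other N-1 characters is likewise an assumption, and the 2x2 glyphs are toy data, not the patent's 5x7 images.

```python
import math


def difference(glyph_a, glyph_b):
    """Euclidean distance between two binary dot matrix images (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(glyph_a, glyph_b)
                         for a, b in zip(row_a, row_b)))


def difference_table(glyphs):
    """Return (pairwise table, per-character averages, total average)."""
    n = len(glyphs)
    table = [[difference(glyphs[i], glyphs[j]) for j in range(n)] for i in range(n)]
    # Average over the other N-1 characters; whether the original table
    # divides by N or N-1 is not recoverable from the text (assumption).
    averages = [sum(row) / (n - 1) for row in table]
    total_average = sum(averages) / n
    return table, averages, total_average


# Toy 2x2 glyphs standing in for the 5x7 standard dot matrix images.
glyphs = [[[0, 0], [0, 0]],
          [[1, 1], [1, 1]],
          [[1, 0], [0, 0]]]
table, averages, total = difference_table(glyphs)
print(table[0][1], table[0][2], averages[0])  # 2.0 1.0 1.5
```

A character with a low average is close to many others, which is the sense in which the text calls "0", "5", "6", "8", and "9" harder to recognize.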
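In this embodiment, so that the dot matrix images output by the image mapping neural network differ more from one another and the characters become easier to recognize, the distribution of the binarized values in the standard dot matrix images 310 is changed to generate new dot matrix images (called preset dot matrix images). Referring to FIG. 4, which shows the newly generated dot matrix images 410, the differences between these characters are listed in table 420, which is laid out in the same way as table 320. For example, the difference between the character "0" and the character "1" is 4, the difference between the character "0" and the character "2" is 4.2, and so on. Likewise, averaging all the differences that involve the same character gives the average difference in row 12: the average difference of the character "0" is 4.21, that of the character "1" is 4.26, and so on. Finally, the total average difference of the dot matrix images 410 is 4.26. Comparing table 320 with table 420 shows that the total average difference of the preset dot matrix images 410 is greater than that of the standard dot matrix images 310, which means the preset dot matrix images 410 are easier to distinguish. Experiments show that this speeds up the convergence of the image mapping neural network and lowers the misrecognition rate of the characters "0", "5", "6", "8", and "9", which in turn raises the recognition accuracy of the system.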
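The text does not say how the distribution of binarized values is changed. Purely as an illustration, the sketch below uses a hypothetical greedy search: it flips one cell at a time, keeps the flip that raises the total average difference the most, and caps the number of flips per glyph so each pattern stays close to its original. The Euclidean distance, the cap, and the names `total_average_difference` and `spread_glyphs` are all assumptions.

```python
import math


def total_average_difference(glyphs):
    """Mean Euclidean distance over all ordered pairs of distinct glyphs."""
    n = len(glyphs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total += math.sqrt(sum(
                    (a - b) ** 2
                    for row_a, row_b in zip(glyphs[i], glyphs[j])
                    for a, b in zip(row_a, row_b)))
    return total / (n * (n - 1))


def spread_glyphs(glyphs, max_flips_per_glyph=3):
    """Greedily flip single cells while the total average difference grows."""
    glyphs = [[row[:] for row in g] for g in glyphs]  # work on a copy
    flips_used = [0] * len(glyphs)
    best = total_average_difference(glyphs)
    while True:
        best_move = None
        for i, glyph in enumerate(glyphs):
            if flips_used[i] >= max_flips_per_glyph:
                continue
            for y, row in enumerate(glyph):
                for x in range(len(row)):
                    row[x] ^= 1                      # try the flip
                    score = total_average_difference(glyphs)
                    row[x] ^= 1                      # undo it
                    if score > best + 1e-12:
                        best, best_move = score, (i, y, x)
        if best_move is None:
            return glyphs, best
        i, y, x = best_move
        glyphs[i][y][x] ^= 1
        flips_used[i] += 1


standard = [[[0, 0], [0, 0]], [[0, 0], [0, 1]], [[0, 1], [0, 1]]]
preset, score = spread_glyphs(standard, max_flips_per_glyph=1)
print(total_average_difference(standard) <= score)  # True
```

Whatever procedure is used, the condition stated in the text is only that the total average difference of the preset images exceeds that of the standard images.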
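Note that when the preset dot matrix images 410 are used, they serve as the ground truth when training the image mapping neural network; that is, the dot matrix images output in step 240 of FIG. 2 resemble the preset dot matrix images 410. For clarity, however, FIG. 2 does not show the preset dot matrix images 410 and shows standard dot matrix images instead.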
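Training with the preset dot matrix images as ground truth could be sketched as below, for the two-layer case (input layer connected directly to sigmoid outputs). The binary cross-entropy loss, batch gradient descent, the hyperparameters, and the names `sigmoid` and `train_mapping_network` are assumptions; the text does not describe the training procedure. The toy data uses 16-pixel inputs rather than real character crops.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_mapping_network(inputs, targets, epochs=300, lr=0.5, seed=0):
    """Fit input layer -> sigmoid output layer by batch gradient descent.

    inputs:  list of flat pixel lists (values in [0, 1]).
    targets: for each input, the flattened preset dot matrix image of its
             character class, used as ground truth.
    """
    rng = random.Random(seed)
    n_in, n_out, n = len(inputs[0]), len(targets[0]), len(inputs)
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    for _ in range(epochs):
        for k in range(n_out):
            # Gradient of the mean binary cross-entropy w.r.t. each logit.
            errors = [
                sigmoid(biases[k] + sum(w * p for w, p in zip(weights[k], x))) - t[k]
                for x, t in zip(inputs, targets)
            ]
            for d in range(n_in):
                grad = sum(e * x[d] for e, x in zip(errors, inputs)) / n
                weights[k][d] -= lr * grad
            biases[k] -= lr * sum(errors) / n
    return weights, biases


# Two toy "characters" with 16 pixels each, and two 35-bit target patterns.
x_a = [1.0] * 8 + [0.0] * 8
x_b = [0.0] * 8 + [1.0] * 8
t_a = [k % 2 for k in range(35)]
t_b = [(k // 5) % 2 for k in range(35)]
weights, biases = train_mapping_network([x_a, x_b], [t_a, t_b], epochs=200)
```

Each output neuron is trained independently here, so swapping the standard targets for the preset ones changes only the `targets` argument.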
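In one embodiment of step 260, the difference or similarity between the dot matrix image 250 and each preset dot matrix image is calculated. The difference is calculated by mathematical formula (2), and the similarity by mathematical formula (3).

[Math 2]

[Math 3]

The quantities in these formulas are: the binarized value at coordinate (x, y) in the dot matrix image output by the image mapping neural network (for example, the dot matrix image 250); the binarized value at coordinate (x, y) in the i-th preset dot matrix image; the difference between the dot matrix image 250 and the i-th preset dot matrix image; and the similarity between the dot matrix image 250 and the i-th preset dot matrix image. If the difference is used, the smallest difference is selected: the i-th character class corresponding to the smallest difference is the character class of the dot matrix image 250. If the similarity is used, the largest similarity is selected: the i-th character class corresponding to the largest similarity is the character class of the dot matrix image 250. Every dot matrix image output by the image mapping neural network can go through the above calculation to find its character class.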
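The selection rule above (smallest difference, or largest similarity) can be sketched as follows. Formulas (2) and (3) are not reproduced in this text, so the measures here are assumptions: difference is taken as the number of cells that disagree and similarity as the number of cells that agree. The 3x3 presets are toy data and `classify_dot_matrix` is a hypothetical name.

```python
def classify_dot_matrix(dot_matrix, presets, use_similarity=False):
    """Return the label of the preset that best matches dot_matrix.

    presets: dict mapping a character label to its preset dot matrix image.
    """
    def agreement(preset):
        return sum(a == b
                   for row_a, row_b in zip(dot_matrix, preset)
                   for a, b in zip(row_a, row_b))

    cells = len(dot_matrix) * len(dot_matrix[0])
    if use_similarity:
        # Largest similarity wins.
        return max(presets, key=lambda label: agreement(presets[label]))
    # Smallest difference wins.
    return min(presets, key=lambda label: cells - agreement(presets[label]))


presets = {
    "0": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}
noisy_one = [[0, 1, 0], [0, 1, 0], [1, 1, 0]]  # "1" with one wrong cell
print(classify_dot_matrix(noisy_one, presets))  # 1
```

With these particular measures the two criteria always agree, because the similarity is the cell count minus the difference; other choices of formula need not have that property.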
One advantage of the above embodiment is that the image mapping neural network can be trained and applied independently. A major difficulty in using machine learning algorithms is obtaining training samples. Conventional techniques use algorithms such as YOLO to recognize characters, and because these algorithms have a large number of parameters, they also need a large number of training samples. In this embodiment, the required image mapping neural network can be trained independently, and it reaches a good recognition rate without many training samples. When the font of the character images to be recognized changes, this architecture can still give good recognition results without retraining the image mapping neural network. Furthermore, the step of converting characters into dot matrix images gives good results without pre-processing, filling in of missing parts, noise filtering, or similar processing. Experimental results show that the overall recognition rate of the system remains high even when the image contains interference (for example, a steel strap crossing a character).
Each step in FIG. 2 can be implemented as program code or as circuitry; the invention is not limited in this respect. From another perspective, the present invention also provides a computer program product, which can be written in any programming language and/or for any platform. When the computer program product is loaded into a computer system and executed, it performs the character recognition method of FIG. 2.
In some embodiments, the above character recognition method can also be used to recognize English letters or any other suitable characters.
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. Anyone with ordinary knowledge in the technical field may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.
100: electronic device
110: processor
120: memory
210, 230: image
220, 240, 260: step
250, 310, 410: dot matrix image
320, 420: table
[FIG. 1] is a schematic diagram of an electronic device according to an embodiment.
[FIG. 2] is a flowchart of a character recognition method according to an embodiment.
[FIG. 3] is a schematic diagram of standard dot matrix images and their difference table according to an embodiment.
[FIG. 4] is a schematic diagram of preset dot matrix images and their difference table according to an embodiment.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109128285A TWI747450B (en) | 2020-08-19 | 2020-08-19 | Character recognition method, electric device and computer program product |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109128285A TWI747450B (en) | 2020-08-19 | 2020-08-19 | Character recognition method, electric device and computer program product |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI747450B true TWI747450B (en) | 2021-11-21 |
TW202209174A TW202209174A (en) | 2022-03-01 |
Family ID=79907783
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109128285A TWI747450B (en) | 2020-08-19 | 2020-08-19 | Character recognition method, electric device and computer program product |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI747450B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201001303A (en) * | 2008-06-27 | 2010-01-01 | Univ Nat Taiwan Science Tech | System and method for recognizing document immediately |
TWI607387B (en) * | 2016-11-25 | 2017-12-01 | 財團法人工業技術研究院 | Character recognition systems and character recognition methods thereof |
CN109977723A (en) * | 2017-12-22 | 2019-07-05 | 苏宁云商集团股份有限公司 | Big bill picture character recognition methods |
2020
- 2020-08-19 TW TW109128285A patent/TWI747450B/en active
Also Published As
Publication number | Publication date |
---|---|
TW202209174A (en) | 2022-03-01 |