TWI612479B

TWI612479B - Character image recognition system and method for recognizing character image

Info

Publication number: TWI612479B
Application number: TW105114275A
Authority: TW
Inventors: 蔡祈岩; 郭峻成
Original assignee: 有無科技股份有限公司
Priority date: 2016-05-09
Filing date: 2016-05-09
Publication date: 2018-01-21
Also published as: TW201740307A

Abstract

A text image recognition system includes a word database, an image capturing device, a display screen, and a processor. The word database stores a first word data set and a second word data set. The display screen displays a plurality of live preview images according to image data captured by the image capturing device. The processor causes the display screen to mark a target block in the live preview images, selects a target image from the live preview images, recognizes the text image inside the target block of the target image according to the first word data set, and recognizes the text image outside the target block of the target image according to the second word data set. The first word data set and the second word data set are different word data sets.

Description

Text image recognition system and method for recognizing text images

The present invention relates to a text image recognition system, and in particular to a text image recognition system that uses different word data sets to recognize the text images in different blocks of an image.

In general, converting documents into images and digitizing the text in those images is a common and important step in managing documents effectively. When recognizing text in an image, the prior art compares the image features obtained by analyzing the image content against the word features stored in a general-purpose word library, which may contain all kinds of commonly used words and even proper nouns from various fields. However, because the word data in a general-purpose word library has no particular correspondence with the image content, comparing the extracted image features directly against that library may consume unnecessary computing resources and may still fail to produce satisfactory results.

In addition, in many applications the image content is a standardized form, and different fields of the form usually record different types of information. For example, envelopes and packages often use separate fields to record postal delivery information: the recipient field records a name and the delivery address field records an address. The words commonly used in these two kinds of information differ considerably. Words that are common in addresses, such as "XX City" (市), "XX District" (區), and "XX Road" (路), rarely appear in personal names. If the words in a general-purpose word library are still used to recognize the text in every field of the form, computing resources are easily wasted on comparing unrelated words, and misrecognition may occur, so that the recognition result must afterwards be corrected manually, which is inconvenient. How to recognize the text in an image effectively is therefore a problem to be solved.

An embodiment of the present invention provides a text image recognition system. The text image recognition system includes a word database, an image capturing device, a display screen, and a processor.

The word database stores a first word data set and a second word data set, and the first word data set and the second word data set are different word data sets. The image capturing device can capture a target image. The display screen is coupled to the image capturing device and can display the target image. The processor is coupled to the image capturing device, the display screen, and the word database. The processor can cause the display screen to mark a target block in the target image, recognize the text image inside the target block of the target image according to the first word data set, and recognize the text image outside the target block of the target image according to the second word data set.

Another embodiment of the present invention provides a method for recognizing a text image. The method includes: an image capturing device capturing a target image; a display screen displaying the target image; a processor causing the display screen to mark a target block in the target image; the processor selecting the target image from a plurality of live preview images; the processor recognizing, according to a first word data set, the text image content inside the target block of the target image; and the processor recognizing, according to a second word data set, the text image content outside the target block of the target image. The first word data set and the second word data set are different word data sets.

FIG. 1 is a schematic diagram of a text image recognition system 100 according to an embodiment of the present invention. The text image recognition system 100 includes a word database 110, an image capturing device 120, a display screen 130, and a processor 140. The display screen 130 is coupled to the image capturing device 120. The processor 140 is coupled to the image capturing device 120, the display screen 130, and the word database 110.

The image capturing device 120 captures image data, and the display screen 130 displays a plurality of live preview images according to that image data. The processor 140 can cause the display screen 130 to mark a target block in the live preview images it displays. In other words, the display screen 130 shows the images captured by the image capturing device 120 in real time and marks the target block of the live preview image on the displayed picture.

FIG. 2 shows a picture displayed on the display screen 130 according to an embodiment of the present invention. In FIG. 2, the display screen 130 displays a live preview image IMG1 captured by the image capturing device 120 and, under the control of the processor 140, also displays the boundary of a target block T1 in the live preview image IMG1 so as to mark the target block T1. The target block T1 occupies a predetermined proportion of the display screen 130; the proportion may be 30% to 50%, with 40% preferred. The target block T1 is preferably rectangular, to match the way recipient addresses are customarily laid out, although its shape is not limited to a rectangle. Several shapes may be offered for the user to choose from, and the chosen shape then becomes the default.
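As a non-limiting illustration of the predetermined proportion described above, the following Python sketch computes a centered rectangular target block. It assumes that the 30% to 50% proportion is an area ratio and that the block keeps the aspect ratio of the screen; neither point is stated in the embodiment, and the names `Rect` and `default_target_block` are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Rect:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels


def default_target_block(screen_w: float, screen_h: float, ratio: float = 0.40) -> Rect:
    """Return a centered rectangle covering `ratio` of the screen area.

    Assumption: the 30%-50% proportion is interpreted as an area ratio.
    """
    if not 0.30 <= ratio <= 0.50:
        raise ValueError("ratio is expected to lie between 0.30 and 0.50")
    # Scale both sides by sqrt(ratio) so that w * h equals ratio * screen area.
    scale = sqrt(ratio)
    w, h = screen_w * scale, screen_h * scale
    return Rect((screen_w - w) / 2, (screen_h - h) / 2, w, h)
```

For a 1080 x 1920 display, `default_target_block(1080, 1920)` yields a block whose area is 40% of the screen and whose center coincides with the center of the screen.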

In some embodiments of the present invention, the display screen 130 may be a touch screen. The processor 140 can then cause the display screen 130 to mark the target block T1 in the live preview image IMG1 according to the user's touch operations on the display screen 130, and can adjust the size of the target block T1 according to those touch operations. For example, the user may tap to make the display screen 130 show the boundary of the marked target block T1, drag to adjust the position of the target block T1, or pinch the fingers together or spread them apart on the display screen 130 to adjust the size of the target block T1. In other embodiments of the present invention, the user may also control the processor 140 through different touch gestures, so that the processor 140 adjusts the position and/or the size of the target block T1 accordingly.
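The drag and pinch adjustments described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the block is an axis-aligned rectangle in screen pixels, a pinch scales the block about its own center, and the block is clamped so that it stays on the screen. The names `Block`, `drag`, and `pinch`, as well as the `min_size` parameter, are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Block:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels


def drag(block: Block, dx: float, dy: float, screen_w: float, screen_h: float) -> Block:
    """Move the block by (dx, dy), keeping it entirely on the screen."""
    nx = min(max(block.x + dx, 0), screen_w - block.w)
    ny = min(max(block.y + dy, 0), screen_h - block.h)
    return replace(block, x=nx, y=ny)


def pinch(block: Block, scale: float, screen_w: float, screen_h: float,
          min_size: float = 20) -> Block:
    """Scale the block about its center; scale > 1 spreads, scale < 1 pinches."""
    cx, cy = block.x + block.w / 2, block.y + block.h / 2
    # Clamp the new size between a minimum size and the screen size.
    nw = min(max(block.w * scale, min_size), screen_w)
    nh = min(max(block.h * scale, min_size), screen_h)
    # Re-center, then clamp the position so the block stays on the screen.
    nx = min(max(cx - nw / 2, 0), screen_w - nw)
    ny = min(max(cy - nh / 2, 0), screen_h - nh)
    return Block(nx, ny, nw, nh)
```

Returning a new `Block` from each gesture, rather than mutating one in place, keeps the last confirmed block available if a gesture is cancelled; this is a design choice of the sketch only.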

In this way, the user can align the target block T1 marked on the display screen 130 with a specific region of the image to be processed, so that the target block T1 contains that region. For example, in FIG. 2 the content of the live preview image IMG1 is a postal data form on a package, and the specific region of the live preview image IMG1 may be the recipient address field A1 of that form. In FIG. 2 the target block T1 does not yet completely contain the recipient address field A1, so the user may need to keep adjusting the angle at which the image capturing device 120 captures the live preview image, and may also need to adjust the size of the target block T1 as described above. Once the target block T1 completely contains the recipient address field A1 and no information outside that field (such as a trademark pattern), computing resources are not wasted on comparing unrelated words, and misrecognition becomes less likely.

After the user has adjusted the image capturing device 120 and the display screen 130 so that the target block T1 contains the recipient address field A1 of the postal data form being captured, the user can control the processor 140 to select the live preview image currently displayed on the display screen 130 as the target image for subsequent text image recognition. In other words, the user can dynamically adjust the position of the image capturing device 120 and the size of the target block T1 according to the live preview shown on the display screen 130 and, once the adjustment is complete, have the processor 140 take a picture; the image obtained at that moment is the target image. In some embodiments of the present invention, the processor 140 may also store the selected target image in a memory.

In addition, in other embodiments of the present invention, the user may further adjust the position and/or size of the target block T1 through the processor 140 after a suitable target image TIMG1 has been captured.

FIG. 3 shows a target image TIMG1 selected by the processor 140 according to an embodiment of the present invention. In the target image TIMG1, the target block T1 contains the specific region to be processed, for example the recipient address field A1 described above. The processor 140 can use different word data sets to recognize the image content of the target image TIMG1 inside the target block T1 and the image content of the target image TIMG1 outside the target block T1.

For example, after the user's adjustment, the content of the target image TIMG1 inside the target block T1 should already contain the data of the recipient address field A1. The processor 140 can then recognize the delivery address inside the target block T1 of the target image TIMG1 according to a plurality of address-related word data, and can recognize other mail-related data outside the target block T1, such as the recipient's name or a company name, according to a plurality of standard, commonly used word data.

In some embodiments of the present invention, the word database 110 may store a first word data set 112 and a second word data set 114, and the first word data set 112 and the second word data set 114 are different word data sets. In the embodiment above, the first word data set 112 may be a plurality of address-related word data and the second word data set 114 may be a plurality of standard, commonly used word data. The processor 140 recognizes the text image inside the target block T1 of the target image TIMG1 according to the first word data set 112, and recognizes the text image outside the target block T1 of the target image TIMG1 according to the second word data set 114.
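The choice between the two word data sets can be sketched in Python as follows. The sketch assumes that each piece of text to be recognized has a bounding box and that a box belongs to the target block T1 when its center point lies inside T1; this membership rule, and the names `Box` and `choose_word_set`, are assumptions of the illustration rather than features of the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels

    def contains(self, px: float, py: float) -> bool:
        """True if the point (px, py) lies inside or on the edge of this box."""
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


def choose_word_set(text_box: Box, target_block: Box,
                    first_set: frozenset, second_set: frozenset) -> frozenset:
    """Pick the word data set used to recognize the text in text_box.

    Assumed rule: text whose center falls inside the target block uses the
    first word data set; all other text uses the second word data set.
    """
    cx = text_box.x + text_box.w / 2
    cy = text_box.y + text_box.h / 2
    return first_set if target_block.contains(cx, cy) else second_set
```

With a target block placed over the address field, a text box inside that field is matched against the address-related set, while a text box over the recipient name is matched against the general set.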

In some embodiments of the present invention, the processor 140 may first apply image processing to the part of the target image TIMG1 inside the target block T1, for example using edge detection to separate the individual characters in the image, and then compute image features for each character. For example, the number of edges of each character, the number of intersections between edges, and the angles between edges may be computed and used as the image features of that character.

In this case, the first word data set 112 may include image feature data, obtained in the manner described above, for a plurality of address-related words such as 「台北」 (Taipei), 「高雄」 (Kaohsiung), 「永和」 (Yonghe), 「永康」 (Yongkang), 「忠孝東路」 (Zhongxiao East Road), and 「中華路」 (Zhonghua Road). When the processor 140 recognizes the text image inside the target block T1 of the target image TIMG1 according to the first word data set 112, it therefore does not compare the text image features inside the target block T1 with the image features of other kinds of words, which reduces the computing resources required. Furthermore, comparing the text image features inside the target block T1 with the word image features in the first word data set 112 may yield several candidate characters, for example 「合」 and 「台」. If the next character image is then recognized and yields the candidates 「北」 and 「兆」, the weights of 「台」 and 「北」 are raised because the first word data set 112 contains the word 「台北」, and the processor 140 may preferentially select the characters with the higher weights, namely 「台」 and 「北」. Because the words in the first word data set 112 are all related to addresses, the text image recognition system 100 can determine the text image inside the target block T1 accurately when selecting among candidate characters, and is less likely to misrecognize it as a word of another category.
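The candidate weighting described above can be illustrated with a short Python sketch. It assumes that each character position comes with per-candidate scores and that a candidate combination found in the word data set receives a fixed additive bonus. The scores, the bonus value, and the name `pick_word` are hypothetical; the embodiment states only that the weights of matching characters are raised.

```python
from itertools import product


def pick_word(candidates: list[dict[str, float]], word_set: set[str],
              bonus: float = 1.0) -> str:
    """Choose the best character sequence from per-position candidates.

    candidates: one dict per character position, mapping a candidate
                character to its image-matching score.
    word_set:   words of the applicable word data set; a combination found
                here has its total score raised by `bonus` (assumed rule).
    """
    best_word, best_score = "", float("-inf")
    # Enumerate every combination of one candidate per position.
    for combo in product(*(c.items() for c in candidates)):
        word = "".join(ch for ch, _ in combo)
        score = sum(s for _, s in combo)
        if word in word_set:
            score += bonus
        if score > best_score:
            best_word, best_score = word, score
    return best_word


# Hypothetical scores in which the wrong characters score slightly higher.
candidates = [{"合": 0.55, "台": 0.45}, {"北": 0.48, "兆": 0.52}]
address_words = {"台北", "高雄", "永和", "永康"}
print(pick_word(candidates, address_words))
```

With these hypothetical scores the address-related set tips the choice to 「台北」, whereas with an empty word set the same scores would select 「合兆」. Exhaustive enumeration is used only for brevity and is practical only for short words with few candidates per position.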

In other embodiments of the present invention, if a specific community wishes to use the text image recognition system 100 to process the mail and packages it receives, the first word data set 112 may include only a plurality of word data related to the household addresses of that community, or a plurality of serial number data related to those household addresses. For example, if the addresses of the community are Nos. 50 to 54, Dongyi Street (東一街) and Nos. 22 to 24, Bei'er Road (北二路), and the households occupy the 1st to the 10th floors, the first word data set 112 may include the image features of words such as 「東一街」, 「北二路」, "50", "52", "54", "22", "24", 「3樓」 (3rd floor), and 「五樓」 (5th floor), and exclude road names and numbers unrelated to the community, such as "60", "100", or 「忠孝東路」. When the addresses of the community are confined to a single road section, for example Nos. 50 to 54, Dongyi Street, the first word data set 112 may even include only the image data of the serial numbers related to the household addresses, such as "50", "52", and "54". As a result, when the processor 140 compares the text image inside the target block T1 with the words in the first word data set 112 to adjust the weights, it can determine the text image inside the target block T1 more accurately and is less likely to misrecognize it as a word of another category.
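A community-specific word data set of this kind could be assembled as in the following Python sketch. The sketch stores plain strings, whereas the embodiment stores image features of the words. It also assumes that house numbers advance in steps of 2, consistent with the listed "50", "52", and "54", and that floors are written with Arabic numerals. The name `community_word_set` and its parameters are hypothetical.

```python
def community_word_set(streets: dict[str, tuple[int, int, int]],
                       floors: range) -> set[str]:
    """Build the address-related words for one community.

    streets: street name -> (first house number, last house number, step).
    floors:  floor numbers occupied by households, e.g. range(1, 11).
    """
    words: set[str] = set()
    for street, (first, last, step) in streets.items():
        words.add(street)
        # House numbers on this street, inclusive of the last number.
        words.update(str(n) for n in range(first, last + 1, step))
    # Floor labels such as "3樓"; Arabic numerals are an assumption here.
    words.update(f"{floor}樓" for floor in floors)
    return words


word_set = community_word_set(
    {"東一街": (50, 54, 2), "北二路": (22, 24, 2)},
    range(1, 11),
)
```

The resulting set contains 「東一街」, 「北二路」, "50", "52", "54", "22", "24", and the floor labels 「1樓」 to 「10樓」, and omits unrelated entries such as "60", "100", or 「忠孝東路」.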

In some embodiments of the present invention, because the text image outside the target block T1 of the target image TIMG1 has no particular kind of content, the second word data set 114 may be a plurality of standard, commonly used words, and the processor 140 recognizes the text image outside the target block T1 of the target image TIMG1 according to the second word data set 114.

Moreover, the present invention is not limited to the first word data set 112 being address-related word data and the second word data set 114 being standard, commonly used word data. In other embodiments of the present invention, the specific region the user wishes to process may contain other content, in which case the first word data set 112 may be word data related to that content. For example, if the specific region is a patient's medical record, the first word data set 112 may be a plurality of disease-related word data. If the area outside the specific region holds the patient's general information, the second word data set 114 may still be a plurality of standard, commonly used word data, or may be a plurality of word data related to basic personal information such as name, blood type, and height. In this way the text in the target image can be recognized efficiently and accurately.

FIG. 4 is a flowchart of a method 200 for recognizing a text image according to an embodiment of the present invention. The method 200 may include, but is not limited to, steps S210 to S250, and may be applied to the text image recognition system 100.

S210: The image capturing device 120 captures the target image TIMG1;

S220: The display screen 130 displays the target image TIMG1;

S230: The processor 140 causes the display screen 130 to mark the target block T1 in the target image TIMG1;

S240: The processor 140 recognizes, according to the first word data set 112, the text image content inside the target block T1 of the target image TIMG1;

S250: The processor 140 recognizes, according to the second word data set 114, the text image content outside the target block T1 of the target image TIMG1.

In step S210, the image capturing device 120 captures the target image TIMG1, and in step S220 the display screen 130 displays the target image TIMG1 captured by the image capturing device 120.

Next, in step S230, the processor 140 causes the display screen 130 to mark the target block T1 in the target image TIMG1. In some embodiments of the present invention, the shape of the target block may default to a long, narrow shape to match a specific region of the live preview image, for example the recipient address field A1 of a postal data form. Furthermore, in some embodiments of the present invention the display screen 130 may be a touch screen, in which case the method 200 may further include a step of adjusting the size of the target block T1 according to the user's touch operations on the display screen 130, so that the user can adjust the target block T1 until it contains the specific region of the target image TIMG1.

In some embodiments of the present invention, the display screen 130 may also display a plurality of live preview images according to the image data captured by the image capturing device 120. The user can then adjust the capture angle of the image capturing device 120 and/or the size and/or position of the target block T1 according to the live preview images shown on the display screen 130 and, once the adjustment is complete, have the processor 140 take a picture; the image obtained at that moment is the target image. Then, in steps S240 and S250, the processor 140 recognizes the text image content of the target image TIMG1 inside and outside the target block T1 according to the first word data set 112 and the second word data set 114, respectively. Because the word data in the first word data set 112 is related to the text content inside the target block T1 of the target image TIMG1, the method 200 can determine the text image inside the target block T1 accurately and is less likely to misrecognize it as a word of another category.

In summary, the text image recognition system and the method for recognizing a text image provided by the embodiments of the present invention mark a target block while the user is capturing the target image to be recognized, so that the user can align the target block with a specific region of the live preview image. After the target image has been selected, different word data sets are used to recognize the image content of the target image inside and outside the target block. Because the words in a word data set can be restricted to those related to the content inside the target block, the system and the method can improve the accuracy of text image recognition and reduce the cases in which a text image is misrecognized as a word unrelated to the image content. The above describes only preferred embodiments of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.

100 text image recognition system
110 word database
112 first word data set
114 second word data set
120 image capturing device
130 display screen
140 processor
IMG1 live preview image
A1 recipient address field
T1 target block
TIMG1 target image
200 method
S210 to S250 steps

FIG. 1 is a schematic diagram of a text image recognition system according to an embodiment of the present invention.
FIG. 2 shows a picture displayed on the display screen according to an embodiment of the present invention.
FIG. 3 shows a target image selected by the processor according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for recognizing a text image according to an embodiment of the present invention.

Claims

1. A text image recognition system, comprising: a word database for storing a first word data set and a second word data set; an image capturing device for capturing a target image; a display screen coupled to the image capturing device for displaying the target image; and a processor coupled to the image capturing device, the display screen, and the word database, the processor being configured to: cause the display screen to mark a target block in the target image; recognize, according to the first word data set, a text image inside the target block of the target image; and recognize, according to the second word data set, a text image outside the target block of the target image; wherein the first word data set and the second word data set are different word data sets; and the first word data set includes a plurality of word data related to household addresses of a specific community, and the processor recognizes, according to the first word data set, a delivery address inside the target block of the target image.

2. The text image recognition system of claim 1, wherein the first word data set further comprises a plurality of word data related to addresses.

3. The text image recognition system of claim 1, wherein the first word data set further comprises a plurality of serial number data related to household addresses of a specific community.

4. The text image recognition system of claim 1, wherein the second word data set is a plurality of standard, commonly used word data, and the processor recognizes, according to the second word data set, mail-related data outside the target block of the target image.

5. The text image recognition system of any one of claims 1 to 4, wherein the display screen is a touch screen.

6. The text image recognition system of claim 5, wherein the processor adjusts the position of the target block according to a touch operation performed by a user on the display screen.

7. The text image recognition system of claim 5, wherein the processor is further configured to adjust the size of the target block according to a touch operation performed by a user on the display screen.

8. The text image recognition system of any one of claims 1 to 4, wherein the processor is further configured to store the target image in a memory.

9. A method for recognizing a text image, comprising: an image capturing device capturing a target image; a display screen displaying the target image; a processor causing the display screen to mark a target block in the target image; the processor recognizing, according to a first word data set, text image content inside the target block of the target image; and the processor recognizing, according to a second word data set, text image content outside the target block of the target image; wherein the first word data set and the second word data set are different word data sets; and the first word data set includes a plurality of word data related to household addresses of a specific community, and the processor recognizes, according to the first word data set, a delivery address inside the target block of the target image.

10. The method of claim 9, wherein the first word data set further comprises a plurality of word data related to addresses.

11. The method of claim 9, wherein the first word data set further comprises a plurality of serial number data related to household addresses of a specific community.

12. The method of claim 9, wherein the second word data set is a plurality of standard, commonly used word data, and the processor recognizes, according to the second word data set, mail-related data outside the target block of the target image.

13. The method of any one of claims 9 to 12, wherein the display screen is a touch screen, and the processor adjusts the position of the target block according to a touch operation performed by a user on the display screen.

14. The method of any one of claims 9 to 12, wherein the display screen is a touch screen, and the method further comprises the processor adjusting the size of the target block according to a touch operation performed by a user on the display screen.

15. The method of any one of claims 9 to 12, further comprising the processor storing the target image in a memory.