TW202319961A

TW202319961A - Building method for key-item detection model, identification system and method for business-oriented key-item especially for identifying a plurality of document image files according to business sub-models

Info

Publication number: TW202319961A
Application number: TW110140778A
Authority: TW
Inventors: 劉邦旭; 李藝鋒; 宋政隆; 王俊權
Original assignee: 中國信託商業銀行股份有限公司
Priority date: 2021-11-02
Filing date: 2021-11-02
Publication date: 2023-05-16
Also published as: TWI807467B

Abstract

A building method for a key-item detection model is disclosed, comprising the following steps: receiving a plurality of document image files for training; receiving, for each document image file, marks made for a plurality of business types, so as to form business mark files respectively corresponding to the document image files; and inputting the document image files and the business mark files into a neural network system for training, forming a business sub-model for each business type, and finally forming a key-item detection model including a plurality of business sub-models.

Description

Key-item detection model building method, business-oriented key-value identification system and method

The present invention relates to a document recognition method, and in particular to a method for recognizing document images according to business type.

Optical character recognition (OCR) technology analyzes and recognizes document images. It mainly comprises two steps: "text detection" and "text recognition". Text detection is performed on a specified page or on the entire document, while text recognition performs character segmentation, character recognition and the like on the detected text regions.

Referring to FIG. 1, when OCR is applied to recognize a form document, the text regions (for example, the parts in bold frames in the figure) are detected first, and the text in them is then recognized. In the prior art, some applications can determine items, such as "financial institution name", "bank code" and "financial institution deposit account number", according to preset rules (for example, the distance between frames), and obtain the corresponding data according to a predetermined relationship, for example by determining whether corresponding text or numbers exist below each item and, if so, recognizing them. Finally, the recognition result is output according to a preset rule (for example, "item" + ":" + "number"), such as "bank code: 456" and "financial institution deposit account number: 78910123456789".

Therefore, an object of the present invention is to provide a method for building a key-item detection model, comprising the following steps: receiving a plurality of document image files for training; for each document image file, receiving marks made for a plurality of business types, so as to form business mark files respectively corresponding to the document image file; and inputting the document image files and the business mark files into a neural network system for training, a business sub-model being trained for each business type, so that a key-item detection model including a plurality of business sub-models is finally formed.

In some embodiments of the key-item detection model building method of the present invention, the step of receiving marks creates a business mark file folder for each business type, and each business mark file folder stores the business mark files.

In some embodiments of the key-item detection model building method of the present invention, the step of receiving marks is achieved by performing the following operations on each document image file: recording a key-item key mark name and drawing a key-item key box to mark a key-item key; recording a key-item value mark name and drawing a key-item value box to mark a key-item value; and recording a bounding box name and drawing a bounding box that covers the key-item key box and the key-item value box.

In some embodiments of the key-item detection model building method of the present invention, each business mark file records at least one set of: a key-item key mark name and the coordinate data of the key-item key box; a key-item value mark name and the coordinate data of the key-item value box; and a bounding box name and the coordinate data of the bounding box covering the key-item key box and the key-item value box.

Another object of the present invention is to provide a business-oriented key-value identification system, comprising a processor, a computer-readable medium electrically connected to the processor, and a key-item detection model built by the aforementioned method. The key-item detection model is used to detect, for an input target document image file, at least one key-item key and its corresponding key-item value according to a business requirement.

In some embodiments of the business-oriented key-value identification system of the present invention, the system further comprises an optical character recognition model, which receives from the key-item detection model the image inside the key-item key box and the image inside the key-item value box that fall within the same bounding box, performs character recognition on them, and outputs the result.

In some embodiments of the business-oriented key-value identification system of the present invention, the computer-readable medium stores the key-item detection model.

A further object of the present invention is to provide a business-oriented key-value identification method. The method comprises: receiving a target document image file; receiving an option input of a business requirement; applying, according to the business requirement, the one sub-model among a plurality of sub-models of a key-item detection model that corresponds to the business requirement, the sub-model having been trained in advance according to the business requirement; detecting the target document image file with the sub-model, so as to detect from it at least one key-item key image and its corresponding key-item value image, and to give a detection bounding box; and determining whether the key-item key image and the key-item value image fall within the detection bounding box and, if so, feeding the key-item key image and the key-item value image into an optical character recognition module to obtain and output a recognition result.

In some embodiments of the business-oriented key-value identification method of the present invention, the step in which the sub-model detects the target document image file comprises: detecting a key-item key image in the target document image file and giving a detected key-item key box surrounding the key-item key image; detecting a key-item value image and giving a detected key-item value box surrounding the key-item value image; and giving the detection bounding box according to the training result.

In some embodiments of the business-oriented key-value identification method of the present invention, the step of determining whether the ranges of the key-item key and the key-item value fall within the bounding box determines whether the detected key-item key box and the detected key-item value box fall within the detection bounding box.

The effect of the present invention is that key-item keys and their corresponding key-item values can be identified according to business requirements without recognizing the entire document image, which greatly reduces the hardware load and the time cost.

Before the present invention is described in detail, it should be noted that in the following description, similar elements are denoted by the same reference numerals.

Referring to FIG. 2, an embodiment of the business-oriented key-value identification method of the present invention can identify key items and their corresponding key values according to business requirements. The embodiment can be executed by a business-oriented key-value identification system 100, which is implemented by a processor 91 and a computer-readable medium 92 that stores program instructions and is electrically connected to the processor 91. When the processor 91 executes the instructions, it is configured to perform the business-oriented key-value identification method and to output the identification result through an output device 93 electrically connected to the processor 91. In other embodiments, the system may instead be implemented in hardware or firmware such as a field-programmable gate array (FPGA), a microprocessor or a system on chip, and its functions may be performed by a single device or by distributed devices.

Referring to FIG. 3, the embodiment of the business-oriented key-value identification method of the present invention includes steps S21 to S28, and a key-item detection model 10 must be built in advance before the method is executed. The key-item detection model 10 can be built as shown in FIG. 4, by the following steps.

Step S11: receiving a plurality of document image files for training. The document image files are, for example, scanned files of the various business application forms, authorization forms and other documents used by banks and insurers, or document files to which digital handwriting input has been added with editing software.

Step S12: for each document image file, receiving marks (labels) made separately for a plurality of business types. Hereinafter, the business types are exemplified by a "first business" and a "second business", where the "first business" is, for example, "financial business" and the "second business" is, for example, "life insurance business", but the invention is not limited thereto.

Specifically, this step may create a "business mark file folder" for each business and store business mark files that correspond one-to-one to the document image files. In this embodiment, the operator doing the marking uses self-developed marking software: an image folder containing the document image files is created first, and a "financial business" business mark file folder and a "life insurance business" business mark file folder are set up in advance. Then, after the folder path has been set and the key-item key mark names have been entered in the interface of the marking application, the document image files in the image folder can be selected and marked one by one. The marking comprises step S121 of marking a key-item key, step S122 of marking a key-item value, and step S123 of forming a bounding box. In other embodiments, the operator may mark with an application such as LabelImg.

Referring also to FIG. 5 and taking "life insurance business" as an example, step S121 marks, for example, the key item "proposer's signature" in the image. A key-item key mark name "sig_applicant" is set first, and a key-item key box 51 is then drawn where "proposer's signature" appears in the image. The marking application records the coordinate data of the key-item key box 51 together with the key-item key mark name and stores them in a business mark file. If "proposer's signature" appears at several places in the image, the operator draws several rectangular boxes. The business mark file is stored in the "life insurance business" business mark file folder according to the configured folder path and corresponds one-to-one to the document image file; its format is, for example, an XML or TXT text file. The coordinate data of the key-item key box 51 may be the coordinates of its four corners, the coordinates of the center point of the rectangular box 51 together with its length and width, or some other form.
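
The two coordinate forms named in this step carry the same information for an axis-aligned rectangle and can be converted into one another. The following Python sketch illustrates this under stated assumptions: pixel coordinates, a corner form reduced to two opposite corners, and function names and example numbers that are purely illustrative and do not come from the patent.

```python
from typing import Tuple

Corners = Tuple[float, float, float, float]  # assumed: (x_min, y_min, x_max, y_max)
Center = Tuple[float, float, float, float]   # assumed: (center_x, center_y, width, height)


def corners_to_center(box: Corners) -> Center:
    """Convert corner form to center point plus width and height."""
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    return (x_min + width / 2, y_min + height / 2, width, height)


def center_to_corners(box: Center) -> Corners:
    """Convert center point plus width and height back to corner form."""
    cx, cy, width, height = box
    return (cx - width / 2, cy - height / 2, cx + width / 2, cy + height / 2)


# Hypothetical coordinates for key-item key box 51.
key_box_51 = (600, 40, 760, 70)
as_center = corners_to_center(key_box_51)
round_trip = center_to_corners(as_center)
```

Either form could be written to the business mark file; the choice mainly depends on what the training framework expects.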

Step S122 then marks the key-item value corresponding to the key-item key. It should be noted first that the term key-item "value" as defined in the present invention refers generally to content filled in on a form and is not limited to numerical values. Continuing with the example of FIG. 5, the blank area below "proposer's signature" in the upper right corner of the image is the space provided for the proposer's signature, that is, the position of the "key-item value". In this step, for example, a key-item value mark name "sig_applicant_val" is set, and the operator then draws a key-item value box 52 in the blank area below "proposer's signature". The marking application records the coordinate data of the key-item value box 52 together with the key-item value mark name and stores them in the same business mark file.

In step S123, a "proposer's signature" bounding box 53 is further obtained and recorded in the business mark file. For example, the coordinate data of the key-item key and the coordinate data of the key-item value are combined to compute the largest rectangular box, which serves as the "proposer's signature" bounding box 53. Alternatively, the operator sets a bounding box name "sig_applicant_bb" and draws a bounding box 53 covering the key-item key box 51 and the key-item value box 52, and the marking application stores the bounding box name and the bounding box coordinate data together in the same business mark file.
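
The computed variant of step S123 can be sketched as follows. This is only one reading of "largest rectangular box": it is assumed here to mean the axis-aligned rectangle spanning the outer extent of the key-item key box and the key-item value box. The function name and the coordinates are hypothetical.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)


def enclosing_box(key_box: Box, value_box: Box) -> Box:
    """Return the rectangle spanning the outer extent of both boxes."""
    return (
        min(key_box[0], value_box[0]),
        min(key_box[1], value_box[1]),
        max(key_box[2], value_box[2]),
        max(key_box[3], value_box[3]),
    )


key_box_51 = (600, 40, 760, 70)     # hypothetical key-item key box
value_box_52 = (600, 75, 760, 130)  # hypothetical key-item value box
bounding_box_53 = enclosing_box(key_box_51, value_box_52)
```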

Similarly, to mark "policy number" in the same image, the key-item key mark name is first set to "policy_no." and a box is drawn where "policy number" appears in the image (step S121); the key-item value mark name is then set to "policy_no._val" and a box is drawn at the table cell below the policy number in the image (step S122); finally a "policy number" bounding box is formed. Once this is done, the same business mark file additionally records the key-item key box coordinate data for the key-item key mark name "policy_no.", the key-item value box coordinate data for the key-item value mark name "policy_no._val", and the name and coordinate data of their bounding box.

Thus, if one hundred document image files are to be used for training, marking must be carried out separately for every business type: marking for "financial business" produces one hundred business mark files in the "financial business" mark folder, and marking for "life insurance business" produces one hundred business mark files in the "life insurance business" mark folder. In other words, every document image file has a corresponding business mark file, and each business mark file includes multiple sets of key-item key mark names and coordinate data, key-item value mark names and coordinate data, and bounding box names and coordinate data.
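
The relationship between images, business types and business mark files described here can be sketched as a small data structure. Everything below is illustrative: the class names, the folder names, the file naming scheme, the ".xml" extension and the coordinates are assumptions, not details taken from the patent.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)


@dataclass
class KeyItemMark:
    """One set of marks: key box, value box and the bounding box covering both."""
    key_name: str
    key_box: Box
    value_name: str
    value_box: Box
    bbox_name: str
    bbox: Box


@dataclass
class BusinessMarkFile:
    """Marks for one document image under one business type."""
    business_type: str
    image_file: str
    marks: List[KeyItemMark] = field(default_factory=list)


def mark_file_path(root: str, business_type: str, image_file: str) -> PurePosixPath:
    """Hypothetical layout: one folder per business type, one mark file per image."""
    stem = PurePosixPath(image_file).stem
    return PurePosixPath(root) / business_type / f"{stem}.xml"


images = [f"form_{i:03d}.png" for i in range(1, 101)]
business_types = ["financial_business", "life_insurance_business"]
paths = [mark_file_path("marks", b, img) for b in business_types for img in images]

example = BusinessMarkFile(
    business_type="life_insurance_business",
    image_file="form_001.png",
    marks=[
        KeyItemMark(
            "sig_applicant", (600, 40, 760, 70),
            "sig_applicant_val", (600, 75, 760, 130),
            "sig_applicant_bb", (600, 40, 760, 130),
        )
    ],
)
```

With one hundred images and two business types, this layout yields two hundred mark files, one hundred per folder.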

Step S13: inputting the document image files and the business mark files into a neural network system for training. The trained neural network is defined as the key-item detection model 10, which includes a plurality of business sub-models.

Specifically, a folder of configuration files (cfg files) may be created first. A configuration file may include the business type (for example "financial business" or "life insurance business"), a mark list (for example the key-item key mark names "sig_applicant" and "policy_no."), an image file list (file names), a preset weights folder and its path, the batch size, and so on. Then, referring also to FIG. 2 and taking the training of the sub-model for "financial business" as an example, a neural network (for example, Darknet) reads, according to the settings of the configuration file, all the document image files for training and the mark data in the corresponding "financial business" mark folder, and is trained on them. When training is complete, a financial business sub-model 101 is established. The trained financial business sub-model 101 is used to detect key-item keys such as "financial institution code" and their corresponding key-item values from an input document image file.
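
As a rough illustration of the configuration contents listed in this step, the sketch below builds one configuration object per business type. It does not reproduce the Darknet cfg format; the class, the field names, the default batch size, the weights path scheme and the mark name "bank_code" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical per-business training settings mirroring the listed fields."""
    business_type: str
    mark_names: List[str]
    image_files: List[str]
    weights_dir: str
    batch_size: int


def build_configs(
    mark_names_by_business: Dict[str, List[str]],
    image_files: List[str],
    weights_root: str = "weights",
    batch_size: int = 16,
) -> Dict[str, TrainingConfig]:
    """Create one configuration per business type; all share the same images."""
    return {
        business: TrainingConfig(
            business_type=business,
            mark_names=list(names),
            image_files=list(image_files),
            weights_dir=f"{weights_root}/{business}",
            batch_size=batch_size,
        )
        for business, names in mark_names_by_business.items()
    }


configs = build_configs(
    {
        "financial_business": ["bank_code"],  # hypothetical mark name
        "life_insurance_business": ["sig_applicant", "policy_no."],
    },
    image_files=["form_001.png", "form_002.png"],
)
```

Keeping one configuration per business type matches the idea that each sub-model is trained on the same images but with its own mark folder.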

Taking the training of the sub-model for "life insurance business" as an example, this step causes the neural network to read, according to the settings of the configuration file, all the document image files for training and the mark data in the corresponding "life insurance business" mark folder, and to be trained on them. When training is complete, a life insurance business sub-model 102 is established. The trained life insurance business sub-model 102 is used to detect key-item keys such as "proposer's signature" and their corresponding key-item values from an input document image file.

Once the key-item detection model 10 has been built, it can be used by the business-oriented key-value identification system 100 to perform the business-oriented key-value identification method. Referring to FIG. 2 and FIG. 3, first, in step S21, the key-item detection model 10 receives a target document image file (not shown; similar to FIG. 5).

In step S22, the key-item detection model 10 receives an option input of a business requirement, for example either "financial business" or "life insurance business".

In step S23, the key-item detection model 10 applies the corresponding sub-model according to the option received in step S22. Specifically, for example, when the input received in step S22 is "financial business", this step inputs the target document image file into the financial business sub-model 101; when the input received in step S22 is "life insurance business", this step inputs the target document image file into the life insurance business sub-model 102. The following description assumes that the system user is a life insurance staff member and that step S22 receives the "life insurance business" option entered by that user, so that in this step S23 the target document image file is input into the life insurance business sub-model 102.
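
Steps S22 and S23 amount to a lookup from business requirement to sub-model. A minimal Python sketch follows; the sub-models are placeholder functions rather than trained networks, and the class name, option strings and return values are assumptions made for illustration.

```python
from typing import Callable, Dict, List

SubModel = Callable[[str], List[str]]


def financial_sub_model(image_path: str) -> List[str]:
    # Placeholder standing in for the trained financial business sub-model 101.
    return [f"financial detections for {image_path}"]


def life_insurance_sub_model(image_path: str) -> List[str]:
    # Placeholder standing in for the trained life insurance business sub-model 102.
    return [f"life insurance detections for {image_path}"]


class KeyItemDetectionModel:
    """Holds one sub-model per business requirement and dispatches to it."""

    def __init__(self, sub_models: Dict[str, SubModel]) -> None:
        self._sub_models = dict(sub_models)

    def detect(self, image_path: str, business_requirement: str) -> List[str]:
        if business_requirement not in self._sub_models:
            raise ValueError(
                f"no sub-model for business requirement: {business_requirement!r}"
            )
        return self._sub_models[business_requirement](image_path)


model = KeyItemDetectionModel(
    {
        "financial_business": financial_sub_model,
        "life_insurance_business": life_insurance_sub_model,
    }
)
result = model.detect("target_form.png", "life_insurance_business")
```

Rejecting an unknown option explicitly is a design choice of this sketch; the source does not say how an unsupported business requirement is handled.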

In step S24, the life insurance business sub-model 102 detects the target document image file. According to the training result, it detects a key-item key image in the image and gives a detected key-item key box surrounding the key-item key image, detects a key-item value image and gives a detected key-item value box surrounding the key-item value image, and gives a detection bounding box.

In step S25, the life insurance business sub-model 102 determines whether the detected key-item key box and the detected key-item value box fall completely within the detection bounding box. If so, the detection result is judged to be as expected and the process proceeds to step S26; if not, the detected key-item key and key-item value are presumed to be unrelated and the process ends. It should be noted that the determination described in steps S24 and S25 is only one example: the life insurance business sub-model 102 may enlarge the detection bounding box it gives by 10% to allow for error, and the condition may instead be, for example, that at least 80% of the extent of the detected key-item key box and of the detected key-item value box falls within the detection bounding box.
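
The determination in step S25 and its two relaxed variants can be sketched as below. Several points are assumptions: boxes are axis-aligned, "enlarge by 10%" is read as scaling width and height by 10% about the center, and the "80%" condition is read as the fraction of each box's area lying inside the detection bounding box. All names and numbers are illustrative.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def enlarge(box: Box, ratio: float) -> Box:
    """Grow width and height by `ratio` about the center (assumed reading of '10%')."""
    x_min, y_min, x_max, y_max = box
    dx = (x_max - x_min) * ratio / 2
    dy = (y_max - y_min) * ratio / 2
    return (x_min - dx, y_min - dy, x_max + dx, y_max + dy)


def fraction_inside(inner: Box, outer: Box) -> float:
    """Fraction of the area of `inner` that lies inside `outer`."""
    overlap = (
        max(inner[0], outer[0]),
        max(inner[1], outer[1]),
        min(inner[2], outer[2]),
        min(inner[3], outer[3]),
    )
    inner_area = area(inner)
    return area(overlap) / inner_area if inner_area > 0 else 0.0


def pair_is_related(
    key_box: Box,
    value_box: Box,
    detection_bbox: Box,
    min_fraction: float = 1.0,
    enlarge_ratio: float = 0.0,
) -> bool:
    """True when both boxes lie sufficiently inside the detection bounding box."""
    outer = enlarge(detection_bbox, enlarge_ratio)
    return all(
        fraction_inside(box, outer) >= min_fraction for box in (key_box, value_box)
    )


key_box = (600, 40, 760, 70)
value_box = (600, 75, 760, 130)
tight_bbox = (600, 40, 760, 130)
shifted_bbox = (605, 40, 760, 130)  # cuts 5 pixels off the left edge of both boxes

strict_ok = pair_is_related(key_box, value_box, tight_bbox)
strict_shifted = pair_is_related(key_box, value_box, shifted_bbox)
relaxed_shifted = pair_is_related(key_box, value_box, shifted_bbox, min_fraction=0.8)
enlarged_shifted = pair_is_related(key_box, value_box, shifted_bbox, enlarge_ratio=0.10)
```

With the default arguments the function implements the strict test of step S25; the two keyword arguments correspond to the two relaxations mentioned in the text.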

In step S26, the key-item key image and the key-item value image that fall within the same detection bounding box are sent to an optical character recognition (OCR) model 20 for character recognition, yielding a paired recognition result, for example the Chinese text "保單號碼" (policy number) and a string of handwritten digits.

Finally, in step S27, the processor 91 outputs the recognition result of the OCR model 20 through the output device 93 according to the correspondence and a preset format. For example, when the OCR model 20 recognizes the paired text "policy number" and a string of digits (for example, 1234567890), the processor 91 outputs the result "policy number: 1234567890" through the output device 93.
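
The preset output format of step S27 can be illustrated with a short formatting helper. This is a sketch only: the function name, the template string and the whitespace trimming are assumptions, and the OCR results are supplied here as plain strings rather than produced by a real OCR model.

```python
from typing import Iterable, List, Tuple


def format_results(
    pairs: Iterable[Tuple[str, str]], template: str = "{key}: {value}"
) -> List[str]:
    """Render each recognized (key text, value text) pair with a preset template."""
    return [
        template.format(key=key.strip(), value=value.strip()) for key, value in pairs
    ]


# Hypothetical OCR output for one detection bounding box.
lines = format_results([("policy number", " 1234567890 ")])
```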

In summary, the present invention applies artificial intelligence technology: key-item keys and key-item values are marked in advance on various document images according to business category and the key-item detection model 10 is trained on them, so that the key-item detection model 10 can detect the key items needed in a document image according to business requirements before OCR is performed. Because only business-related key items are detected, the present invention can greatly reduce the business processing time consumed by conventional OCR document recognition. Taking the "Application Form for Shareholders to Receive Cash Dividends" shown in FIG. 1 as an example, suppose an "A business sub-model" is trained for department A to detect the two fields "bank code" and "financial institution deposit account number", and a "B business sub-model" is trained for department B to detect the single field "financial institution deposit account number". Measured results show that producing output with the trained "B business sub-model" takes 50% less time than with the trained "A business sub-model", and 90% less time than recognizing the entire document with conventional OCR. Taking the "Authorization Form for Insurance Premium Payment" of FIG. 5 as a further example, the image content of the whole document is complex, and the "life insurance business sub-model" trained for a life insurance company detects only the proposer's data; compared with recognizing the entire document with conventional OCR, the present invention is more than ten times as efficient. The object of the present invention is therefore indeed achieved.

The foregoing, however, is merely an embodiment of the present invention and shall not limit the scope in which the invention may be practiced. All simple equivalent changes and modifications made in accordance with the claims and the specification of the present invention remain within the scope covered by the patent.

100: business-oriented key-value identification system
10: key-item detection model
101: financial business sub-model
102: life insurance business sub-model
20: OCR model
91: processor
92: computer-readable medium
93: output device
S11~S13: steps of the key-item detection model building method
S121~S123: steps of marking a document image file
S21~S26: steps of the business-oriented key-value identification method
51: key-item key box
52: key-item value box
53: bounding box

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which: FIG. 1 is a schematic diagram of a document image file; FIG. 2 is a block diagram illustrating an embodiment of the business-oriented key-value identification system of the present invention; FIG. 3 is a flowchart illustrating an embodiment of the key-item detection model building method of the present invention; FIG. 4 is a flowchart illustrating an embodiment of the business-oriented key-value identification method of the present invention; and FIG. 5 is a schematic diagram of a document image file for training.

Claims

1. A method for building a key-item detection model, comprising the following steps: receiving a plurality of document image files for training; receiving marks: for each document image file, receiving marks made for a plurality of business types, so as to form business mark files respectively corresponding to the document image file; and inputting the document image files and the business mark files into a neural network system for training, a business sub-model being trained for each business type, so that a key-item detection model including a plurality of business sub-models is finally formed.

2. The method for building a key-item detection model as described in claim 1, wherein the step of receiving marks creates a business mark file folder for each business type, and each business mark file folder stores the business mark files.

3. The method for building a key-item detection model as described in claim 1, wherein the step of receiving marks is achieved by performing the following operations on each document image file: recording a key-item key mark name and drawing a key-item key box to mark a key-item key; recording a key-item value mark name and drawing a key-item value box to mark a key-item value; and recording a bounding box name and drawing a bounding box covering the key-item key box and the key-item value box.

4. The method for building a key-item detection model as described in claim 1, wherein each business mark file records at least one set of: a key-item key mark name and the coordinate data of a key-item key box; a key-item value mark name and the coordinate data of a key-item value box; and a bounding box name and the coordinate data of a bounding box covering the key-item key box and the key-item value box.

5. A business-oriented key-value identification system, comprising: a processor; a computer-readable medium electrically connected to the processor; and a key-item detection model, built by the method described in any one of claims 1 to 4 and used to detect, for an input target document image file, at least one key-item key and its corresponding key-item value according to a business requirement.

6. The business-oriented key-value identification system as described in claim 5, further comprising: an optical character recognition model, which receives from the key-item detection model the image inside the key-item key box and the image inside the key-item value box that fall within the same bounding box, performs character recognition on them, and outputs the result.

7. The business-oriented key-value identification system as described in claim 5, wherein the computer-readable medium stores the key-item detection model.

8. A business-oriented key-value identification method, comprising the following steps: receiving a target document image file; receiving an option input of a business requirement; applying, according to the business requirement, the one sub-model among a plurality of sub-models of a key-item detection model that corresponds to the business requirement, the sub-model having been trained in advance according to the business requirement; detecting the target document image file with the sub-model, so as to detect from the target document image file at least one key-item key image and its corresponding key-item value image, and to give a detection bounding box; and determining whether the key-item key image and the key-item value image fall within the detection bounding box and, if so, feeding the key-item key image and the key-item value image into an optical character recognition module to obtain and output a recognition result.

9. The business-oriented key-value identification method as described in claim 8, wherein the step in which the sub-model detects the target document image file comprises: detecting a key-item key image in the target document image file and giving a detected key-item key box surrounding the key-item key image; detecting a key-item value image and giving a detected key-item value box surrounding the key-item value image; and giving the detection bounding box according to the training result.

10. The business-oriented key-value identification method as described in claim 9, wherein the step of determining whether the ranges of the key-item key and the key-item value fall within the bounding box determines whether the detected key-item key box and the detected key-item value box fall within the detection bounding box.