TWI701620B - Document information extraction and archiving system - Google Patents
- Publication number
- TWI701620B (application TW108109781A)
- Authority
- TW
- Taiwan
- Prior art keywords
- neural network
- network model
- text
- module
- model
- Classifications
- Character Input (AREA)
- Character Discrimination (AREA)
Description
The present invention relates to a document information extraction and archiving system, and in particular to one that processes text images.
At present, to evaluate a potential customer's existing insurance policies, an insurance company's brokers must enter the information from those policies into the company's evaluation system before they can assess the policies and make further recommendations to the customer. However, an existing policy contains a great deal of information, and potential customers often hold only paper documents with no electronic copies, so brokers must key the policy data into the evaluation system by hand. This consumes considerable time and reduces the efficiency of acquiring new customers.
Therefore, how to automatically extract the information on an existing policy and enter it into the evaluation system of the broker's insurance company is a problem worth considering for those of ordinary skill in the art.
An object of the present invention is to provide a document information extraction and archiving system that automatically extracts textual information from document images. The system is electrically connected to a database and includes an input module, a text segmentation area detection module, a text recognition module, a semantic segmentation module, and a database docking module. The input module receives a document image containing multiple characters. The text segmentation area detection module uses a first-type neural network model to draw bounding boxes around the text in the document image, forming at least one text segmentation area. The text recognition module uses a second-type neural network model to recognize the text in the text segmentation area, producing at least one editable string that comprises at least one editable character. The semantic segmentation module segments the string into multiple tokens and assigns a part of speech to each token. Finally, the database docking module maps each token to the corresponding field of the database according to its part of speech.
In the document information extraction and archiving system described above, the first-type neural network model includes a first convolutional neural network model and an object detection neural network model. The first convolutional neural network model extracts features from the document image and outputs a feature vector; the object detection neural network model takes the feature vector as input and draws bounding boxes around serial-number-type text to form a serial-number segmentation area. The first convolutional neural network model may be a VGG, ResNet, or DenseNet model. In addition, the object detection neural network model may be a YOLO, CTPN, or EAST model.
In the document information extraction and archiving system described above, the second-type neural network model includes a second convolutional neural network model and a recurrent neural network model. The second convolutional neural network model processes the image in the text segmentation area and outputs a character sequence; the recurrent neural network model takes this sequence as input and outputs the editable string. The recurrent neural network model implements the Connectionist Temporal Classification algorithm.
In the document information extraction and archiving system described above, the second-type neural network model may alternatively be a Seq2Seq model.
In the document information extraction and archiving system described above, the semantic segmentation module further includes a lexicon and a rule module, where the lexicon stores multiple domain-specific proper nouns and the rule module assigns different parts of speech to the tokens.
In the document information extraction and archiving system described above, after vectorizing the tokens, the semantic segmentation module assigns a part of speech to each token using a conditional random field or a hidden Markov model.
In the document information extraction and archiving system described above, the semantic segmentation module includes a third-type neural network model. The semantic segmentation module converts each character of the string into a feature vector of fixed dimension and feeds the feature vectors into the third-type neural network model, which assigns a part of speech to each token. The third-type neural network model is a recurrent neural network and includes a conditional random field layer.
In the document information extraction and archiving system described above, the database docking module includes a text classifier that further subdivides tokens sharing the same part of speech.
With the document information extraction and archiving system of the present invention, the text in a document image is automatically entered into the corresponding database fields without manual human input, greatly improving administrative efficiency.
To make the above features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
10: Document image
12: Text segmentation area
12a: Picture sequence
30: Database
40: Image input device
100: Document information extraction and archiving system
110: Input module
115: Image pre-processing module
120: Text segmentation area detection module
122: First-type neural network model
1221: First convolutional neural network model
1223: Object detection neural network model
130: Text recognition module
132: Second-type neural network model
1321: Second convolutional neural network model
1323: Recurrent neural network model
140: Semantic segmentation module
141: Third-type neural network model
1411: Embedding layer
1413: Recurrent neural network layer
1415: Activation function layer
1417: Conditional random field layer
142: Lexicon
143: Rule module
150: Database docking module
151: Text classifier
Various embodiments are described below with reference to the accompanying drawings, which are intended for illustration and in no way limit the scope; like reference numerals denote like components. In the drawings:

FIG. 1 illustrates an embodiment of the document information extraction and archiving system of the present invention.
FIG. 2A shows a document image of an insurance policy.
FIG. 2B shows the document image of the insurance policy after image pre-processing.
FIG. 2C shows a document image with text segmentation areas.
FIG. 3 shows the architecture of the first-type neural network model.
FIG. 4 shows the architecture of the second-type neural network model.
FIG. 5 is a schematic diagram of decomposing a text segmentation area into multiple picture sequences.
FIG. 6 shows the architecture of the semantic segmentation module.
FIG. 7 shows the architecture of the third-type neural network model.
The present invention is best understood with reference to the detailed description set forth herein and the accompanying drawings. Various embodiments are discussed below with reference to the drawings. Those skilled in the art will readily appreciate, however, that the detailed description given here with respect to the drawings is for explanatory purposes only, as the methods and systems may extend beyond the described embodiments. For example, the teachings presented and the needs of a particular application may yield multiple alternative and suitable approaches to implementing any functionality described here. Therefore, any approach may extend beyond the particular implementation choices of the embodiments described and shown below.
Certain terms are used throughout the specification and the claims that follow to refer to particular components. Those of ordinary skill in the art will appreciate that different manufacturers may refer to the same component by different names. This specification and the claims that follow do not distinguish components by name but by function. The terms "comprise" and "include" used throughout the specification and the claims are open-ended and should be interpreted as "including but not limited to." In addition, the terms "couple" and "connect" encompass any direct or indirect means of electrical connection. Therefore, a statement that a first device is coupled to a second device means that the first device may be electrically connected to the second device directly, or indirectly through other devices or connection means.
Please refer to FIG. 1, which illustrates an embodiment of the document information extraction and archiving system of the present invention. The document information extraction and archiving system 100 includes an input module 110, an image pre-processing module 115, a text segmentation area detection module 120, a text recognition module 130, a semantic segmentation module 140, and a database docking module 150, where the database docking module 150 is connected to a database 30. In this embodiment, the database 30 is, for example, an insurance company's database comprising multiple fields such as name, national ID number, policy type, insured amount, and so on. The input module 110 is, for example, electrically connected to an image input device 40, which may be a scanner, a digital camera, or a smartphone with a camera. Through the image input device 40, a document image (for example, an insurance policy, as in the photograph of FIG. 2A) is imported into the image pre-processing module 115. The image pre-processing module 115 performs pre-processing such as orientation correction, curvature correction, denoising, and binarization, so that the document image gains high contrast (document image 10 shown in FIG. 2B) for convenient subsequent processing. In this embodiment, the input module 110, the image pre-processing module 115, the text segmentation area detection module 120, the text recognition module 130, the semantic segmentation module 140, and the database docking module 150 are deployed on the server side, which consists of, for example, one or more servers. Note that, to protect personal privacy, the names of the policyholder and the insured and the policy number in FIG. 2A are masked, while those in FIG. 2B are altered.
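As a purely illustrative sketch of the module chain just described, the following Python stub wires the five modules together. Every function body, field name, and return value here is a hypothetical stand-in, not the patented implementation:

```python
# Minimal sketch of the pipeline: input -> pre-processing -> region detection
# -> recognition -> semantic tagging -> database docking. All stubs below
# are illustrative placeholders.

def preprocess(image):
    """Image pre-processing module 115: deskew, denoise, binarize (stub)."""
    return {"pixels": image, "binarized": True}

def detect_text_regions(doc):
    """Text segmentation area detection module 120 (stub detector)."""
    return [{"box": (10, 20, 200, 40), "crop": "<sub-image>"}]

def recognize(region):
    """Text recognition module 130 (stub recognizer)."""
    return "孔乙己 要保人"

def segment_and_tag(text):
    """Semantic segmentation module 140 (stub tagger)."""
    return [(tok, "PERSON" if tok == "孔乙己" else "OTHER") for tok in text.split()]

def dock_to_database(tagged, db):
    """Database docking module 150: route tokens to fields by tag."""
    field_for = {"PERSON": "name"}  # hypothetical tag-to-field mapping
    for token, tag in tagged:
        if tag in field_for:
            db[field_for[tag]] = token
    return db

def run_pipeline(image):
    db = {}
    doc = preprocess(image)
    for region in detect_text_regions(doc):
        dock_to_database(segment_and_tag(recognize(region)), db)
    return db
```

Running `run_pipeline` on any input fills the database dictionary with the recognized person name, mirroring the end-to-end flow of FIG. 1.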
The pre-processed document image 10 is then passed to the text segmentation area detection module 120, which includes the first-type neural network model 122. This model draws bounding boxes around the text in the document image 10 to form at least one text segmentation area 12 (several are shown in FIG. 2C). Note that at this stage the text in the text segmentation areas 12 exists only as image data; it is not yet editable. Converting it into editable text is the task of the text recognition module 130. The operation of the text segmentation area detection module 120 and the text recognition module 130 is described in more detail below.
Referring also to FIG. 3, the first-type neural network model 122 includes a first convolutional neural network model 1221 and an object detection neural network model 1223. The first convolutional neural network model 1221 is a convolutional neural network comprising convolutional layers and pooling layers (neither is drawn in the figure); the convolutional layers mainly perform feature extraction, while the pooling layers reduce the number of parameters the model requires, preventing overfitting. The first convolutional neural network model 1221 produces a feature vector from the input document image 10, and the feature vector is then fed into the object detection neural network model 1223. In this embodiment, the first convolutional neural network model 1221 may be a VGG, ResNet, or DenseNet model, and the object detection neural network model 1223 may be a YOLO model, preferably a CTPN or EAST model. After the object detection neural network model 1223 runs, the text in the document image 10 is enclosed in bounding boxes, forming the text segmentation areas 12 described above (as shown in FIG. 2C).
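To illustrate the division of labor between the convolutional layers (feature extraction) and the pooling layers (parameter reduction) described above, here is a minimal NumPy sketch with a single hand-written kernel. It is a toy, not VGG/ResNet/DenseNet; the image and kernel values are invented for the example:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2-D convolution: the feature-extraction step."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Max pooling: shrinks the feature map, reducing downstream parameters."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)        # toy 6x6 "document"
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])      # toy vertical-edge filter
features = max_pool(conv2d(image, edge_kernel))         # 5x5 map -> 2x2 after pooling
feature_vector = features.ravel()                        # flattened vector for the detector
```

The flattened `feature_vector` plays the role of the feature vector handed from model 1221 to the object detection model 1223.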
After the text in the document image 10 has been boxed into text segmentation areas 12, the text recognition module 130 recognizes the text in each area using a second-type neural network model 132. Referring also to FIG. 4, the second-type neural network model 132 includes a second convolutional neural network model 1321 and a recurrent neural network model 1323. The second convolutional neural network model 1321, like the first convolutional neural network model 1221, is a convolutional neural network; it makes a preliminary prediction for the text in the text segmentation area 12. Although this preliminary prediction is useful, the recurrent neural network model 1323 is preferably added on top of the second convolutional neural network model 1321 to recognize the text in the text segmentation area 12 more accurately; the detailed mechanism is described below.
When recognizing the text in a text segmentation area 12, the second convolutional neural network model 1321 first decomposes the area into multiple picture sequences 12a (FIG. 5). For example, if the text segmentation area 12 contains the character "S", some picture sequences 12a may cover only the left part of the "S" and others only the right part, so the model may recognize the single character "S" as two "S" characters. Conversely, it may merge several characters into one: in the string "llc.", for instance, it may read the two letters "ll" as a single "1". The recurrent neural network model 1323 is a recurrent neural network (RNN); because an RNN takes its earlier inputs into account, i.e., it has a short-term memory, it can correct such likely output errors of the second convolutional neural network model 1321 and recognize the text in the text segmentation area 12 correctly.
In this embodiment, the recurrent neural network model 1323 uses, for example, the Connectionist Temporal Classification algorithm (hereafter, the CTC algorithm). The CTC algorithm is currently used mainly in speech recognition; its detailed operating principle is described in "Sequence Modeling With CTC" (https://distill.pub/2017/ctc/).
After investigation, the inventors found that the CTC algorithm also applies well to the text recognition of the present invention, mainly because the speech recognition setting partly resembles the present text recognition setting. Common situations in speech recognition are that some speakers talk fast, some talk slowly, and some stretch certain phonemes; the CTC algorithm was developed precisely for these situations. In the text recognition of the present invention, the spacing between characters is wider in some documents (corresponding to slow speakers) and tighter in others (corresponding to fast speakers). Moreover, because the document images here may be captured with a camera, the character spacing is even more likely to vary with the photographer's shooting angle and distance. The inventors therefore adopted the CTC algorithm to address this problem and obtained good results.
In addition, the second-type neural network model 132 may also be a Seq2Seq model. A Seq2Seq model generally includes an encoder and a decoder, where the encoder may be a convolutional neural network. It likewise first decomposes the text segmentation area 12 into multiple picture sequences 12a (FIG. 5) and converts them into a context vector, which is then fed into the decoder; the decoder converts the context vector into an editable string.
Notably, because capturing a document image (as in FIG. 2A) involves photography, different users shoot from different angles. The first-type neural network model 122 and the second-type neural network model 132 can therefore be trained on document images taken from various angles and under various lighting conditions; such images can be generated directly by computer simulation.
After the editable string is obtained from the text recognition module 130, the semantic segmentation module 140 segments it into multiple tokens and assigns a part of speech to each token. In this embodiment, the jieba word segmenter can be used to produce the tokens. Referring to FIG. 6, the semantic segmentation module 140 includes a lexicon 142 and a rule module 143. The lexicon 142 stores, for example, multiple domain-specific proper nouns, and the rule module 143 assigns parts of speech to the tokens, for example tagging "孔乙己" as a person name, "台北市大直" as a place name, "南山人壽" as a company name, and "102/12/31" as a date. Besides the characteristics of a token itself, the rule module 143 can also judge by the token's position in the string, for example using the CYK algorithm (Cocke-Younger-Kasami algorithm).
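The lexicon-plus-rules tagging just described can be sketched as a lexicon lookup followed by pattern rules. The entries and the date pattern below are illustrative assumptions; the actual contents of lexicon 142 and rule module 143 are not disclosed in this form:

```python
import re

# Hypothetical miniature lexicon (142): domain proper nouns mapped to tags.
LEXICON = {
    "孔乙己": "PERSON",
    "台北市大直": "PLACE",
    "南山人壽": "COMPANY",
}

# Hypothetical rule (143): ROC-calendar dates such as 102/12/31.
DATE_RULE = re.compile(r"^\d{2,3}/\d{1,2}/\d{1,2}$")

def tag_token(token):
    """Assign a part of speech: lexicon lookup first, then pattern rules."""
    if token in LEXICON:
        return LEXICON[token]
    if DATE_RULE.match(token):
        return "DATE"
    return "OTHER"

def tag_tokens(tokens):
    """Tag every token produced by the word segmenter (e.g. jieba)."""
    return [(t, tag_token(t)) for t in tokens]
```

In practice the lexicon lookup would draw on the domain vocabulary, and further positional rules (such as CYK-based ones) could refine the tags.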
In other embodiments, after vectorizing the tokens, the semantic segmentation module 140 assigns a part of speech to each token using a conditional random field (CRF) or a hidden Markov model (HMM).
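For illustration, here is a toy hidden Markov model tagger with Viterbi decoding, one of the two statistical options just mentioned. All states, probabilities, and vocabulary below are invented for the example:

```python
import math

STATES = ["PERSON", "DATE", "OTHER"]
START = {"PERSON": 0.3, "DATE": 0.2, "OTHER": 0.5}
# Sticky transitions: staying in the same tag is more likely than switching.
TRANS = {s: {t: (0.6 if s == t else 0.2) for t in STATES} for s in STATES}
EMIT = {
    "PERSON": {"孔乙己": 0.9, "102/12/31": 0.01, "保單": 0.09},
    "DATE":   {"孔乙己": 0.01, "102/12/31": 0.9, "保單": 0.09},
    "OTHER":  {"孔乙己": 0.1, "102/12/31": 0.1, "保單": 0.8},
}

def viterbi(tokens):
    """Return the most probable tag sequence under the toy HMM."""
    V = [{s: math.log(START[s]) + math.log(EMIT[s].get(tokens[0], 1e-6))
          for s in STATES}]
    back = []
    for tok in tokens[1:]:
        col, ptr = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: V[-1][p] + math.log(TRANS[p][s]))
            col[s] = (V[-1][best_prev] + math.log(TRANS[best_prev][s])
                      + math.log(EMIT[s].get(tok, 1e-6)))
            ptr[s] = best_prev
        V.append(col)
        back.append(ptr)
    tags = [max(STATES, key=lambda s: V[-1][s])]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    return list(reversed(tags))
```

A CRF replaces the generative transition/emission probabilities with discriminatively trained feature weights, but the decoding step is structurally the same dynamic program.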
In a preferred embodiment, as shown in FIG. 7, the semantic segmentation module 140 includes a third-type neural network model 141, which comprises an embedding layer 1411, a recurrent neural network layer (RNN layer) 1413, an activation function layer 1415, and a conditional random field layer (CRF layer) 1417. In this embodiment, the semantic segmentation module 140 converts each character of the string into a feature vector of fixed dimension; these feature vectors constitute the embedding layer 1411. The recurrent neural network layer 1413 may be a basic recurrent neural network or a bidirectional recurrent neural network (Bi-RNN, as shown in FIG. 7), and may also include a long short-term memory recurrent neural network (LSTM-RNN, where LSTM stands for Long Short-Term Memory), a bidirectional long short-term memory recurrent neural network (BLSTM-RNN, where BLSTM stands for Bidirectional Long Short-Term Memory), or a GRU recurrent neural network (GRU-RNN, where GRU stands for Gated Recurrent Unit). The activation function of the activation function layer 1415 is, for example, tanh, and the conditional random field layer 1417 is added to the third-type neural network model 141 because conditional random fields excel at sequence labeling. After the third-type neural network model 141 runs, each character of the string is assigned a part-of-speech label; in the labeling scheme of this embodiment, 1 denotes a person name, 2 a place name, 3 a company name, 4 a date, and 5 anything else.
Referring back to FIG. 1, after the part-of-speech tagging of each token in the string is complete, the database docking module 150 maps each token to the corresponding field of the database 30 according to its part of speech. For example, "孔乙己" in FIG. 2B is assigned to the person-name field of the database 30. In a preferred embodiment, the database docking module 150 further includes a text classifier 151, which can subdivide tokens sharing the same part of speech. For example, a document may contain several different person names; the text classifier 151 can decide which of them is the policyholder and which are not, for instance by the rule that the name closest to the word "要保人" (policyholder) is the policyholder.
After each piece of text in the document image of FIG. 2A has been classified into the corresponding field of the database 30 by the document information extraction and archiving system 100 of this embodiment, the insurance company's brokers can use their company's evaluation system 20 to evaluate a potential customer's existing policies and make further recommendations. Compared with the conventional practice, brokers no longer need to enter the data from existing policies into the evaluation system manually, which saves considerable time and increases the efficiency of acquiring new customers.
Notably, although the embodiments above use an existing insurance policy as the document image, those of ordinary skill in the art will appreciate that the document information extraction and archiving system of the present invention also applies to other kinds of documents, such as powers of attorney, contracts, and judgments, again without manual human input, greatly improving administrative efficiency.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone of ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the appended claims.
Claims (6)
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW108109781A (TWI701620B) | 2019-03-21 | 2019-03-21 | Document information extraction and archiving system |
Publications (2)

| Publication Number | Publication Date |
|---|---|
| TWI701620B | 2020-08-11 |
| TW202036399A | 2020-10-01 |

Family ID: 73003045
Cited By (1)

| Publication Number | Priority Date | Publication Date | Title |
|---|---|---|---|
| CN114419611A | 2020-10-12 | 2022-04-29 | Real-time short message robot system and method for automatically detecting character lines in digital image |
Patent Citations (5)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| CN1669029A | 2002-05-17 | 2005-09-14 | 威乐提公司 | System and method for automatically discovering a hierarchy of concepts from a corpus of documents |
| US8060513B2 | 2008-07-01 | 2011-11-15 | Dossierview Inc. | Information processing with integrated semantic contexts |
| US8346534B2 | 2008-11-06 | 2013-01-01 | University of North Texas System | Method, system and apparatus for automatic keyword extraction |
| TW201833793A | 2017-03-02 | 2018-09-16 | 大陸商騰訊科技(深圳)有限公司 | Semantic extraction method and device of natural language and computer storage medium |
| TWM583974U | 2019-03-21 | 2019-09-21 | 洽吧智能股份有限公司 | Document information retrieval and filing system |
Legal Events

| Date | Code | Title |
|---|---|---|
| | MM4A | Annulment or lapse of patent due to non-payment of fees |