TW201245981A - Method of automatically detecting error by using language model for print - Google Patents

Method of automatically detecting error by using language model for print

Info

Publication number
TW201245981A
Authority
TW
Taiwan
Prior art keywords
language model
error
printer
printed
content
Prior art date
Application number
TW100115891A
Other languages
Chinese (zh)
Other versions
TWI456411B (en)
Inventor
Ping-Cheng Lin
Jui-Feng Yeh
Original Assignee
Univ Far East
Priority date
Filing date
Publication date
Application filed by Univ Far East
Priority to TW100115891A
Publication of TW201245981A
Application granted
Publication of TWI456411B


Abstract

The present invention discloses a method of automatically detecting errors using a language model for a printer, comprising the following steps: performing text extraction on the print content of a print document; performing word segmentation on the print content of the print document; and performing error detection on the print content of the print document using the language model. If the error detection finds no error in the print content, the print document is printed directly. Conversely, if an error is detected in the print content, an alert is generated or the error is corrected automatically. After the alert has been generated or the error has been corrected, the user is asked whether to continue printing.

Description

VI. Description of the Invention

[Technical Field of the Invention]

[0001] The present invention relates to an error detection method, and more particularly to a method of automatically detecting errors using a language model for a printer.

[Prior Art]

[0002] Current environmentally friendly printing designs mostly save paper through methods such as reduced-size and double-sided printing, and use recycled paper and natural materials to reduce pollution of the natural environment. Even where the print content itself is considered, it is only divided into blank areas and data areas, or text and images are given a preliminary separation and then scaled down separately. No knowledge base or natural language method is used to analyze the print content further. As far as saving paper and reducing carbon are concerned, nothing is more wasteful than printing a document that contains a mistake and then printing it again. The present invention therefore performs error detection by inspecting the content, so that data containing errors is not printed and resources are saved.

[Summary of the Invention]

[0003] In view of the problems arising from the prior art described above, one object of the present invention is to provide a method of automatically detecting errors using a language model for a printer, so as to solve those problems.

[0004] Specifically, the method of automatically detecting errors using a language model for a printer comprises the following steps:
Step S1: performing a text extraction operation on the print content of a print document.
Step S2: performing a word segmentation operation on the print content of the print document.
Step S3: performing error detection on the print content of the print document using a language model.

[0005] As explained above, the method may have one or more of the following advantages:
(1) If the error detection finds no error in the print content of the print document, the print document is printed directly. Conversely, if an error is detected in the print content, an alert is generated or the error is corrected automatically.
(2) The method avoids wasting paper and time.
(3) The method brings natural language processing into the development of peripheral products.

[Embodiment]

[0006] The main embodiment of the method is described below with reference to the accompanying drawing. For ease of understanding, identical elements in the embodiment are denoted by identical reference symbols.

[0007] Please refer to FIG. 1, which is a flow chart of the steps of the method of automatically detecting errors using a language model for a printer. As shown in FIG. 1, the method is applied to a print document of a printer and comprises the following steps.

Step S1: performing a text extraction operation on the print content of a print document. In practice, print content often mixes text and images, and even content that consists purely of text is filled with many markup symbols. Therefore, before the error detection stage begins, the text extraction operation must separate the text from the images in the print content and extract the plain-text portion for subsequent comparison and analysis.

Step S2: performing a word segmentation operation on the print content of the print document. The biggest difference between Chinese and English is that Chinese has no spaces between words, so a word segmentation mechanism is required before errors in the print content can be detected. Words are therefore taken as the basic unit of analysis, and the text string of the print content is segmented according to that unit.
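As a rough illustration of steps S1 and S2, the following Python sketch strips non-text symbols from a string and then segments it into words. The patent does not name a segmentation algorithm; forward maximum matching, the toy `LEXICON`, and the function names `extract_text` and `segment` are all assumptions made for illustration, and image separation is assumed to have happened before this code runs.

```python
import re

# Hypothetical toy lexicon; a real system would load a full dictionary.
LEXICON = {"列印", "文件", "語言", "模型", "錯誤", "偵測"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def extract_text(print_content: str) -> str:
    """Step S1 (sketch): keep CJK ideographs and ASCII letters/digits only.

    Markup symbols, punctuation, and whitespace are dropped. Separating
    text from images is assumed to be done upstream of this function.
    """
    return "".join(re.findall(r"[\u4e00-\u9fffA-Za-z0-9]+", print_content))


def segment(text: str) -> list:
    """Step S2 (sketch): forward maximum matching against LEXICON."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in LEXICON:
                words.append(candidate)
                i += length
                break
    return words
```

Calling `segment(extract_text(raw))` yields the word list that a later error detection step would consume. Characters that match no lexicon entry come out as single-character tokens, which keeps the sketch simple but is cruder than a production segmenter.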

Step S3: performing error detection on the print content of the print document using a language model, where the language model is a language model containing natural language patterns. If the error detection finds no error in the print content of the print document, the print document is printed directly. Conversely, if an error is detected in the print content, an alert is generated or the error is corrected automatically. After the alert has been generated or the error has been corrected, the system asks the user whether to continue printing. In addition, the present invention uses Microsoft's WDK driver to develop the printing function of the printer, and issues a warning message to alert the user when the print content is found to contain typing errors, which avoids wasting paper and time. With this error detection mechanism, errors in the print content can be detected in advance, so that documents containing erroneous content are not printed. This also saves paper, toner, and printing time, improves the reliability of printing, and supports sustainable development.

[0008] The above is merely a disclosure of the preferred embodiment and is not intended to limit the main technical features of the present invention. Any equivalent modification or alteration that does not depart from the spirit and scope of the present invention shall be included in the scope of the appended claims.

[Brief Description of the Drawings]

[0009] FIG. 1 is a flow chart of the steps of the method of automatically detecting errors using a language model for a printer according to the present invention.

[Description of Main Reference Symbols]

[0010] S1 to S3: steps of the method of automatically detecting errors using a language model for a printer
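Step S3 and the print decision that follows it in paragraph [0007] could be sketched as below. The patent only states that the language model contains natural language patterns; the add-one smoothed bigram model, the toy `CORPUS`, the `threshold` value (tuned to this toy corpus), and the names `detect_errors` and `handle_print_job` are assumptions for illustration. Automatic correction is not modelled here; the sketch covers only the alert-and-confirm path.

```python
from collections import Counter
import math

# Hypothetical training corpus of already-segmented sentences.
CORPUS = [
    ["列印", "文件", "內容"],
    ["偵測", "列印", "文件"],
    ["語言", "模型", "偵測", "錯誤"],
]

unigrams = Counter(word for sent in CORPUS for word in sent)
bigrams = Counter(pair for sent in CORPUS for pair in zip(sent, sent[1:]))
VOCAB_SIZE = len(unigrams)


def bigram_log_prob(prev, word):
    """Add-one smoothed log P(word | prev)."""
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + VOCAB_SIZE))


def detect_errors(words, threshold=-1.7):
    """Return (index, word) pairs whose transition scores below threshold."""
    return [
        (i, word)
        for i, (prev, word) in enumerate(zip(words, words[1:]), start=1)
        if bigram_log_prob(prev, word) < threshold
    ]


def handle_print_job(words, confirm=lambda errors: False):
    """Print directly when clean; otherwise alert and ask the user.

    `confirm` stands in for the dialog asking whether to continue printing.
    """
    errors = detect_errors(words)
    if not errors:
        return "printed"
    return "printed" if confirm(errors) else "cancelled"
```

Passing the confirmation step in as a callable keeps the decision logic testable without a user interface; in a driver-level implementation the same hook would be where the warning message is shown.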

Claims (1)

VII. Claims:

1. A method of automatically detecting errors using a language model for a printer, comprising: performing error detection on the print content of a print document using a language model.

2. The method of claim 1, further comprising, before the error detection, performing a text extraction operation on the print content of the print document.

3. The method of claim 2, further comprising, after the text extraction operation, performing a word segmentation operation on the print content of the print document.

4. The method of claim 3, wherein the word segmentation operation takes the word as a basic unit and segments the text string of the print content according to the basic unit.

5. The method of claim 2, wherein the text extraction operation separates the text from the images in the print content and extracts the plain-text portion.

6. The method of claim 1, wherein, after the error detection is performed, if no error is detected in the print content of the print document, the print document is printed directly, and conversely, if an error is detected in the print content of the print document, an alert is generated or the error is corrected automatically.

7. The method of claim 1, wherein the language model is a language model containing natural language patterns.

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW100115891A TWI456411B (en) 2011-05-06 2011-05-06 Method of automatically detecting error by using language model for print

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW100115891A TWI456411B (en) 2011-05-06 2011-05-06 Method of automatically detecting error by using language model for print

Publications (2)

Publication Number Publication Date
TW201245981A (en) 2012-11-16
TWI456411B (en) 2014-10-11

Family

ID=48094443

Family Applications (1)

Application Number Title Priority Date Filing Date
TW100115891A TWI456411B (en) 2011-05-06 2011-05-06 Method of automatically detecting error by using language model for print

Country Status (1)

Country Link
TW (1) TWI456411B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR0171545B1 (en) * 1996-01-12 1999-05-01 김광호 Printing system by paper lenght automatic sensing and controlling method thereof
JP2006062266A (en) * 2004-08-27 2006-03-09 Seiko Epson Corp Printer and control method for printer
US7774193B2 (en) * 2006-12-05 2010-08-10 Microsoft Corporation Proofing of word collocation errors based on a comparison with collocations in a corpus
TWI341255B (en) * 2008-04-30 2011-05-01 System of ink box detection

Also Published As

Publication number Publication date
TWI456411B (en) 2014-10-11


Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees