ATE375561T1 - Method for identifying redundant text in electronic documents (Verfahren zur Identifizierung von redundantem Text in elektronischen Dokumenten)
- Publication number
- ATE375561T1
- Authority
- AT
- Austria
- Prior art keywords
- text
- redundant
- text fragments
- page
- candidates
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Document Processing Apparatus (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP05012452A EP1732012B1 (de) | 2005-06-09 | 2005-06-09 | Method for identifying redundant text in electronic documents |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| ATE375561T1 (de) | 2007-10-15 |
Family
ID=35149042
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AT05012452T ATE375561T1 (de) | 2005-06-09 | 2005-06-09 | Method for identifying redundant text in electronic documents |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US7643682B2 (de) |
| EP (1) | EP1732012B1 (de) |
| AT (1) | ATE375561T1 (de) |
| DE (1) | DE602005002835T2 (de) |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4345772B2 (ja) * | 2006-04-21 | 2009-10-14 | セイコーエプソン株式会社 | Document editing device, program, and storage medium |
| US20090199087A1 (en) * | 2008-02-04 | 2009-08-06 | Microsoft Corporation | Applying rich visual effects to arbitrary runs of text |
| US9063911B2 (en) * | 2009-01-02 | 2015-06-23 | Apple Inc. | Identification of layout and content flow of an unstructured document |
| CN101937312B (zh) * | 2010-09-15 | 2014-03-19 | 中兴通讯股份有限公司 | Method for marking an e-book and mobile terminal |
| CN101976232B (zh) * | 2010-09-19 | 2012-06-20 | 深圳市万兴软件有限公司 | Method and device for identifying data tables in a document |
| US9471550B2 (en) * | 2012-10-16 | 2016-10-18 | Linkedin Corporation | Method and apparatus for document conversion with font metrics adjustment for format compatibility |
| US9563635B2 (en) | 2013-10-28 | 2017-02-07 | International Business Machines Corporation | Automated recognition of patterns in a log file having unknown grammar |
| US10373343B1 (en) | 2015-05-28 | 2019-08-06 | Certainteed Corporation | System for visualization of a building material |
| JP6744571B2 (ja) * | 2016-06-22 | 2020-08-19 | 富士ゼロックス株式会社 | Information processing device and program |
| JP6797610B2 (ja) * | 2016-08-31 | 2020-12-09 | キヤノン株式会社 | Device, method, and program |
| US11195324B1 (en) | 2018-08-14 | 2021-12-07 | Certainteed Llc | Systems and methods for visualization of building structures |
| CN113298079B (zh) * | 2021-06-28 | 2023-10-27 | 北京奇艺世纪科技有限公司 | Image processing method and apparatus, electronic device, and storage medium |
Family Cites Families (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5168147A (en) * | 1990-07-31 | 1992-12-01 | Xerox Corporation | Binary image processing for decoding self-clocking glyph shape codes |
| US5321773A (en) * | 1991-12-10 | 1994-06-14 | Xerox Corporation | Image recognition method using finite state networks |
| US6336124B1 (en) * | 1998-10-01 | 2002-01-01 | Bcl Computers, Inc. | Conversion data representing a document to other formats for manipulation and display |
| US6641053B1 (en) * | 2002-10-16 | 2003-11-04 | Xerox Corp. | Foreground/background document processing with dataglyphs |
2005
- 2005-06-09 AT: application AT05012452T, patent ATE375561T1, status: active
- 2005-06-09 DE: application DE602005002835T, patent DE602005002835T2, status: not active (Expired - Lifetime)
- 2005-06-09 EP: application EP05012452A, patent EP1732012B1, status: not active (Expired - Lifetime)

2006
- 2006-04-18 US: application US11/405,771, patent US7643682B2, status: active
Also Published As
| Publication number | Publication date |
|---|---|
| DE602005002835T2 (de) | 2008-02-07 |
| DE602005002835D1 (de) | 2007-11-22 |
| EP1732012B1 (de) | 2007-10-10 |
| US20060282769A1 (en) | 2006-12-14 |
| US7643682B2 (en) | 2010-01-05 |
| EP1732012A1 (de) | 2006-12-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| ATE373274T1 (de) | | Method for identifying words in an electronic document |
| DE60336146D1 (de) | | Font system and method with scalable stroke |
| ATE375561T1 (de) | | Method for identifying redundant text in electronic documents |
| ATE392667T1 (de) | | Method and computer system for indexing structured documents |
| US20100199168A1 (en) | | Document Generation Method and System |
| Meunier | | Optimized XY-cut for determining a page reading order |
| CN108268884B (zh) | | Document comparison method and device |
| WO2007038389A3 (en) | | Method and apparatus for identifying and classifying network documents as spam |
| EP1079312A3 (de) | | Printed matter with text data, and method and apparatus for printing the printed matter |
| US9430451B1 (en) | | Parsing author name groups in non-standardized format |
| US10984168B1 (en) | | System and method for generating a multi-modal abstract |
| JP6976524B2 (ja) | | Method for generating print data and software for generating print data |
| CN120911402B (zh) | | Auxiliary method for assessing DTP job complexity based on AI and typesetting analysis |
| JP6204076B2 (ja) | | Text region reading order determination device, text region reading order determination method, and text region reading order determination program |
| Arrant | | Standard Tiberian Pronunciation in a Non-Standard Form: TS as 64.206 |
| Doboš | | The Tale of Two Empires |
| O'CONNOR | | Handwritten Text Recognition technology and MS Turin, BNU, L. II. 14 (T). The "Rescapé" case study. |
| Pournader | | Proposal to encode four combining Arabic characters for Koranic use |
| Kumar | | Publisher’s Information |
| CN100565513C (zh) | | File processing method and related pattern display method |
| Pandey | | Final proposal to encode Nandinagari in Unicode |
| Kumar | | Publisher’s Information |
| Irie et al. | | Authors’ Instructions for IVCNZ08 |
| de Normalisation | | Background information |
| Selamat et al. | | Export competitiveness of the Malaysia processed food in the middle east market |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | UEP | Publication of translation of European patent specification | Ref document number: 1732012; Country of ref document: EP |