DE602004006682D1 - Extraction of metadata from marked areas of a document

Extraction of metadata from marked areas of a document

Info

Publication number
DE602004006682D1
Authority
DE
Germany
Prior art keywords
metadata
image
pixels
document
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
DE602004006682T
Other languages
German (de)
Other versions
DE602004006682T2 (en)
Inventor
Jodocus F Jager
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Production Printing Netherlands BV
Original Assignee
Oce Technologies BV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-08-20
Filing date
2004-08-13
Publication date
2007-07-12
Application filed by Oce Technologies BV
Publication of DE602004006682D1
Application granted
Publication of DE602004006682T2
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/416 Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)
  • User Interface Of Digital Computer (AREA)
  • Character Input (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

A method and device are described for extracting metadata, such as the title or author of a document, from an image (13) of pixels. At least part of the image is shown to a user on a display (12). The user operates a pointing control element in a user interface, such as a mouse or a touch screen, to generate a selection command. The selection command includes a selection point in a metadata element (11) in the image. A region of foreground pixels is then determined, containing the pixels that are connected to the selection point. An extraction area (14) is constructed around the region. Finally, metadata is extracted by processing the pixels in the extraction area.
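The steps in the abstract (selection point, connected foreground region, extraction area) can be illustrated with a minimal Python sketch. It is not the patented implementation: the binary image representation (1 = foreground), the use of 8-connectivity, the fixed `margin` padding, and all function names (`connected_region`, `extraction_area`, `crop`, `extract_at`) are assumptions made for illustration. The final "processing" step (for example OCR) is left out; the sketch only returns the cropped pixels.

```python
from collections import deque


def connected_region(image, seed):
    """Return the set of (row, col) foreground pixels 8-connected to seed.

    Assumes image is a list of equal-length rows holding 0 (background)
    or 1 (foreground). Returns an empty set if seed is not foreground.
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    if not (0 <= r0 < rows and 0 <= c0 < cols) or image[r0][c0] == 0:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:  # breadth-first flood fill from the selection point
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and image[nr][nc] == 1
                        and (nr, nc) not in region):
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


def extraction_area(region, shape, margin=1):
    """Bounding box (top, left, bottom, right), inclusive, padded and clipped."""
    rows, cols = shape
    top = max(min(r for r, _ in region) - margin, 0)
    left = max(min(c for _, c in region) - margin, 0)
    bottom = min(max(r for r, _ in region) + margin, rows - 1)
    right = min(max(c for _, c in region) + margin, cols - 1)
    return top, left, bottom, right


def crop(image, area):
    """Return the pixels inside the extraction area as a new list of rows."""
    top, left, bottom, right = area
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def extract_at(image, seed, margin=1):
    """Run the three steps; return (area, pixels) or None for a background seed."""
    region = connected_region(image, seed)
    if not region:
        return None
    area = extraction_area(region, (len(image), len(image[0])), margin)
    return area, crop(image, area)


# Tiny hypothetical page with two separate foreground blobs.
IMAGE = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(extract_at(IMAGE, (1, 1)))
```

Selecting a point in the left blob yields an extraction area that encloses only that blob, so the right blob is excluded from the pixels handed to later processing. A selection point on background returns `None` here; how such a case should be handled is a design choice this sketch does not address.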

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP03077643 2003-08-20
EP03077643 2003-08-20

Publications (2)

Publication Number Publication Date
DE602004006682D1 (en) 2007-07-12
DE602004006682T2 (en) 2008-01-31

Family

ID=34178536

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
DE602004006682T Active DE602004006682T2 (en) 2003-08-20 2004-08-13 Extraction of metadata from marked areas of a document

Country Status (6)

Country Link
US (1) US7756332B2 (en)
EP (1) EP1510962B1 (en)
JP (2) JP4970714B2 (en)
CN (2) CN100382096C (en)
AT (1) ATE363700T1 (en)
DE (1) DE602004006682T2 (en)

Families Citing this family (51)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10342594B4 (en) 2003-09-15 2005-09-15 Océ Document Technologies GmbH Method and system for collecting data from a plurality of machine readable documents
US20060004745A1 (en) * 2004-06-04 2006-01-05 Agfa Corporation Structured reporting report data manager
US7475336B2 (en) * 2004-08-11 2009-01-06 Kabushiki Kaisha Toshiba Document information processing apparatus and document information processing program
JP4536461B2 (en) * 2004-09-06 2010-09-01 株式会社沖データ Image processing device
US8495061B1 (en) * 2004-09-29 2013-07-23 Google Inc. Automatic metadata identification
EP1729235A1 (en) * 2005-06-03 2006-12-06 Agfa Corporation Structured reporting report data manager
US20060290789A1 (en) * 2005-06-22 2006-12-28 Nokia Corporation File naming with optical character recognition
DE102005032046A1 (en) * 2005-07-08 2007-01-11 Océ Document Technologies GmbH A method, system, and computer program product for transferring data from a document application to a data application
US20070035780A1 (en) * 2005-08-02 2007-02-15 Kabushiki Kaisha Toshiba System and method for defining characteristic data of a scanned document
US7765184B2 (en) * 2005-09-22 2010-07-27 Nokia Corporation Metadata triggered notification for content searching
JP4856925B2 (en) 2005-10-07 2012-01-18 株式会社リコー Image processing apparatus, image processing method, and image processing program
JP2007249429A (en) * 2006-03-14 2007-09-27 Ricoh Co Ltd Email editing device, image forming device, email editing method, and program making computer execute the method
JP5078413B2 (en) * 2006-04-17 2012-11-21 株式会社リコー Image browsing system
US10380231B2 (en) * 2006-05-24 2019-08-13 International Business Machines Corporation System and method for dynamic organization of information sets
US8768983B2 (en) * 2006-10-04 2014-07-01 International Business Machines Corporation Dynamic configuration of multiple sources and source types in a business process
US20080162602A1 (en) * 2006-12-28 2008-07-03 Google Inc. Document archiving system
JP4501016B2 (en) * 2007-03-22 2010-07-14 村田機械株式会社 Document reader
EP2015554B1 (en) 2007-07-13 2012-05-16 Ricoh Company, Ltd. User interface generating method, image forming apparatus, and computer program product
US8144988B2 (en) * 2007-09-06 2012-03-27 Ricoh Company, Ltd. Document-image-data providing system, document-image-data providing device, information processing device, document-image-data providing method, information processing method, document-image-data providing program, and information processing program
US8194982B2 (en) * 2007-09-18 2012-06-05 Ricoh Company, Ltd. Document-image-data providing system, document-image-data providing device, information processing device, document-image-data providing method, information processing method, document-image-data providing program, and information processing program
US8510312B1 (en) * 2007-09-28 2013-08-13 Google Inc. Automatic metadata identification
US8009316B2 (en) * 2007-10-26 2011-08-30 Ricoh Production Print Solutions LLC Methods and apparatus for efficient sheetside bitmap processing using meta-data information
JP4604100B2 (en) * 2008-03-21 2010-12-22 シャープ株式会社 Image processing method, image processing apparatus, image forming apparatus, program, and storage medium
JP4909311B2 (en) 2008-03-31 2012-04-04 富士通フロンテック株式会社 Character recognition device
KR101023309B1 (en) 2008-03-31 2011-03-18 후지츠 프론테크 가부시키가이샤 Character recognizing apparatus
CN101577832B (en) * 2008-05-06 2012-03-21 联咏科技股份有限公司 Image processing circuit and image processing method for strengthening character display effect
US20090279127A1 (en) * 2008-05-08 2009-11-12 Infoprint Solutions Company Llc Mechanism for data extraction of variable positioned data
TWI423052B (en) * 2008-07-04 2014-01-11 Hon Hai Prec Ind Co Ltd System and method for scanning of the database
US8682072B2 (en) * 2008-12-30 2014-03-25 Yahoo! Inc. Image segmentation
JP2010252266A (en) * 2009-04-20 2010-11-04 Olympus Imaging Corp Image arrangement apparatus
JP5340847B2 (en) * 2009-07-27 2013-11-13 株式会社日立ソリューションズ Document data processing device
US8542198B2 (en) * 2009-08-24 2013-09-24 Xerox Corporation Multi-touch input actual-size display screen for scanned items
KR101164353B1 (en) * 2009-10-23 2012-07-09 삼성전자주식회사 Method and apparatus for browsing and executing media contents
KR20120033718A (en) * 2010-09-30 2012-04-09 삼성전자주식회사 Image forming apparatus and method for sending e-mail thereof
CN102147684B (en) * 2010-11-30 2014-04-23 广东威创视讯科技股份有限公司 Screen scanning method for touch screen and system thereof
CN102253746B (en) 2011-06-23 2017-05-03 中兴通讯股份有限公司 Information processing method and equipment for electronic equipment with touch screen
CN102855264B (en) * 2011-07-01 2015-11-25 富士通株式会社 Document processing method and device thereof
US9292537B1 (en) 2013-02-23 2016-03-22 Bryant Christopher Lee Autocompletion of filename based on text in a file to be saved
JP2014174923A (en) * 2013-03-12 2014-09-22 Ricoh Co Ltd Document processor, document processing method, and document processing program
JP6163839B2 (en) 2013-04-09 2017-07-19 富士通株式会社 Electronic equipment and copy control program
US10325511B2 (en) 2015-01-30 2019-06-18 Conduent Business Services, Llc Method and system to attribute metadata to preexisting documents
US20170039683A1 (en) * 2015-08-06 2017-02-09 Fuji Xerox Co., Ltd. Image processing apparatus, image processing method, image processing system, and non-transitory computer readable medium
US10810240B2 (en) 2015-11-06 2020-10-20 RedShred LLC Automatically assessing structured data for decision making
CN107229932B (en) * 2016-03-25 2021-05-28 阿里巴巴集团控股有限公司 Image text recognition method and device
JP6891073B2 (en) * 2017-08-22 2021-06-18 キヤノン株式会社 A device for setting a file name, etc. on a scanned image, its control method, and a program.
JP7043929B2 (en) * 2018-03-29 2022-03-30 株式会社リコー Information processing system and information processing method
US11210507B2 (en) 2019-12-11 2021-12-28 Optum Technology, Inc. Automated systems and methods for identifying fields and regions of interest within a document image
US11227153B2 (en) 2019-12-11 2022-01-18 Optum Technology, Inc. Automated systems and methods for identifying fields and regions of interest within a document image
JP7434001B2 (en) 2020-03-13 2024-02-20 キヤノン株式会社 Information processing device, program, information processing method
CN111949230B (en) * 2020-08-10 2022-07-05 智业软件股份有限公司 LibreOffice document-based overlay printing method, terminal device and storage medium
US11960816B2 (en) 2021-01-15 2024-04-16 RedShred LLC Automatic document generation and segmentation system

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04276885A (en) * 1991-03-04 1992-10-01 Sumitomo Electric Ind Ltd Character segmenting apparatus
JPH04276855A (en) 1991-03-04 1992-10-01 Nippon Telegr & Teleph Corp <Ntt> Document filing system
JP3285686B2 (en) * 1993-06-29 2002-05-27 株式会社リコー Area division method
JPH08166959A (en) * 1994-12-12 1996-06-25 Canon Inc Picture processing method
JPH09128479A (en) * 1995-11-01 1997-05-16 Ricoh Co Ltd Method and device for dividing area
US5761686A (en) * 1996-06-27 1998-06-02 Xerox Corporation Embedding encoded information in an iconic version of a text image
CN1266646C (en) * 1997-09-19 2006-07-26 王永民 Visiting card manager and operation system thereof
JP3773642B2 (en) * 1997-12-18 2006-05-10 株式会社東芝 Image processing apparatus and image forming apparatus
AUPP400998A0 (en) * 1998-06-10 1998-07-02 Canon Kabushiki Kaisha Face detection in digital images
US6353823B1 (en) * 1999-03-08 2002-03-05 Intel Corporation Method and system for using associative metadata
JP2001084332A (en) * 1999-09-10 2001-03-30 Toshiba Corp Reader and reading method
US6360951B1 (en) * 1999-12-16 2002-03-26 Xerox Corporation Hand-held scanning system for heuristically organizing scanned information
FR2806814B1 (en) * 2000-03-22 2006-02-03 Oce Ind Sa METHOD OF RECOGNIZING AND INDEXING DOCUMENTS
NL1015943C2 (en) 2000-08-16 2002-02-19 Océ Technologies B.V. Interpretation of colored documents.
KR100411894B1 (en) * 2000-12-28 2003-12-24 한국전자통신연구원 Method for Region Analysis of Documents
US6804684B2 (en) * 2001-05-07 2004-10-12 Eastman Kodak Company Method for associating semantic information with multiple images in an image database environment
EP1256900A1 (en) * 2001-05-09 2002-11-13 Requisite Technology Inc. Database entry system and method employing optical character recognition
US7432940B2 (en) * 2001-10-12 2008-10-07 Canon Kabushiki Kaisha Interactive animation of sprites in a video production
US7043474B2 (en) * 2002-04-15 2006-05-09 International Business Machines Corporation System and method for measuring image similarity based on semantic meaning
US7050629B2 (en) * 2002-05-31 2006-05-23 Intel Corporation Methods and systems to index and retrieve pixel data
GB2399245B (en) * 2003-03-03 2005-07-27 Motorola Inc Method for segmenting an image and an image transmission system and image transmission unit therefor
US7236632B2 (en) * 2003-04-11 2007-06-26 Ricoh Company, Ltd. Automated techniques for comparing contents of images

Also Published As

Publication number Publication date
EP1510962B1 (en) 2007-05-30
JP4970714B2 (en) 2012-07-11
DE602004006682T2 (en) 2008-01-31
JP2005071349A (en) 2005-03-17
US7756332B2 (en) 2010-07-13
CN1604120A (en) 2005-04-06
JP2012053911A (en) 2012-03-15
US20050041860A1 (en) 2005-02-24
CN100476859C (en) 2009-04-08
CN100382096C (en) 2008-04-16
CN1839396A (en) 2006-09-27
ATE363700T1 (en) 2007-06-15
EP1510962A1 (en) 2005-03-02

Similar Documents

Publication Publication Date Title
DE602004006682D1 (en) Extraction of metadata from marked areas of a document
ATE356389T1 (en) DOCUMENT SCANNER
CN1251056C (en) System and methods for manipulating and viewing user interface of digital data
Arai et al. PaperLink: a technique for hyperlinking from real paper to electronic content
EP2306270B1 (en) Character input method and system
US20140361083A1 (en) Two Dimensional-Code Scanning Method and Device
JP6010253B2 (en) Electronic device, method and program
TW200603007A (en) Apparatus and method for handwriting recognition
KR20100051648A (en) Method for manipulating regions of a digital image
JPH07141101A (en) Input system using picture
AU2003283447A1 (en) Method and user interface for entering characters
EP1416426A3 (en) Handwritten character input device, program and method
SE0104041L (en) Electronic pen and method for recording handwritten information
US6847386B2 (en) Visual cue for on-screen scrolling
CN109074223A (en) For carrying out the method and system of character insertion in character string
US8787670B2 (en) Software for text and image edit recognition for editing of images that contain text
CN101354789A (en) Method and device for implementing image face mask specific effect
CN102053949A (en) Method and device for processing uncommon words
KR19990045918A (en) Method and apparatus for providing pointer implemented with image function
CN103235836A (en) Method for inputting information through mobile phone
EP1701292A3 (en) Document layout analysis with control of non-character area
JP2015114955A (en) Information processing apparatus, information processing method, and program
WO2015107692A1 (en) Electronic device and method for handwriting
WO2014083878A1 (en) Information processing device and program
JP2009294848A (en) Information display and program

Legal Events

Date Code Title Description
8364 No opposition during term of opposition