WO2022221079A3 - Inferring structure information from table images - Google Patents

Inferring structure information from table images

Publication number
WO2022221079A3
WO2022221079A3 PCT/US2022/023185 US2022023185W WO2022221079A3 WO 2022221079 A3 WO2022221079 A3 WO 2022221079A3 US 2022023185 W US2022023185 W US 2022023185W WO 2022221079 A3 WO2022221079 A3 WO 2022221079A3
Authority
WO
WIPO (PCT)
Prior art keywords
objects
image
structure information
table images
structured representation
Prior art date
Application number
PCT/US2022/023185
Other languages
French (fr)
Other versions
WO2022221079A2 (en)
Inventor
J. Brandon SMOCK
Pramod Kumar Sharma
Natalia LARIOS DELGADO
Rohith Venkata PESALA
Robin Abraham
Original Assignee
Microsoft Technology Licensing, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/353,563 (US20220335240A1)
Application filed by Microsoft Technology Licensing, Llc
Priority to EP22719427.1A (EP4323976A2)
Publication of WO2022221079A2
Publication of WO2022221079A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

A computer-implemented method includes rendering a document page as an image; detecting tables, columns, and other associated table objects within the image via one or more table recognition models that model objects in the image as overlapping bounding boxes; transforming the set of objects into a structured representation of the table; extracting data from the objects into the structured representation; and exporting the table into a desired output format.
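
The transformation, extraction, and export steps named in the abstract can be illustrated with a minimal sketch. It is not the method claimed in the application: the detection model is left out entirely, and the row boxes, column boxes, and word boxes are hypothetical hand-written inputs standing in for detector and OCR output. The names `Box`, `build_grid`, and `to_csv` are assumptions made for this illustration. The sketch treats a cell as the overlap of a row box and a column box, assigns each word to the cell containing its center, and exports the grid as CSV.

```python
import csv
import io
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in image pixel coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def intersect(a: Box, b: Box) -> Box:
    """Overlap region of two boxes; empty (x1 <= x0 or y1 <= y0) if they do not overlap."""
    return Box(max(a.x0, b.x0), max(a.y0, b.y0), min(a.x1, b.x1), min(a.y1, b.y1))


def contains(box: Box, x: float, y: float) -> bool:
    """True if the point (x, y) lies inside the box (half-open on the far edges)."""
    return box.x0 <= x < box.x1 and box.y0 <= y < box.y1


def build_grid(rows, columns, words):
    """Turn row/column boxes plus (text, Box) words into a 2-D list of strings."""
    rows = sorted(rows, key=lambda b: b.y0)        # top to bottom
    columns = sorted(columns, key=lambda b: b.x0)  # left to right
    grid = [["" for _ in columns] for _ in rows]
    for text, wbox in sorted(words, key=lambda w: (w[1].y0, w[1].x0)):
        cx, cy = (wbox.x0 + wbox.x1) / 2, (wbox.y0 + wbox.y1) / 2
        for r, row in enumerate(rows):
            for c, col in enumerate(columns):
                # A cell is modeled here as the overlap of a row box and a column box.
                if contains(intersect(row, col), cx, cy):
                    grid[r][c] = (grid[r][c] + " " + text).strip()
    return grid


def to_csv(grid):
    """Export the structured representation as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# Hypothetical detections for a 2x2 table on a 200x40 pixel image.
rows = [Box(0, 0, 200, 20), Box(0, 20, 200, 40)]
columns = [Box(0, 0, 100, 40), Box(100, 0, 200, 40)]
words = [
    ("Name", Box(5, 2, 40, 18)),
    ("Qty", Box(105, 2, 130, 18)),
    ("Widget", Box(5, 22, 50, 38)),
    ("3", Box(105, 22, 112, 38)),
]
grid = build_grid(rows, columns, words)
print(to_csv(grid))
```

Assigning words by center point is one simple choice among several; a real system would also need to handle spanning cells, headers, and noisy or overlapping detections, none of which this sketch attempts.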

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP22719427.1A EP4323976A2 (en) 2021-04-15 2022-04-02 Inferring structure information from table images

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163175446P 2021-04-15 2021-04-15
US63/175,446 2021-04-15
US17/353,563 2021-06-21
US17/353,563 US20220335240A1 (en) 2021-04-15 2021-06-21 Inferring Structure Information from Table Images

Publications (2)

Publication Number Publication Date
WO2022221079A2 WO2022221079A2 (en) 2022-10-20
WO2022221079A3 WO2022221079A3 (en) 2022-11-24

Family

ID=81389036

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/023185 WO2022221079A2 (en) 2021-04-15 2022-04-02 Inferring structure information from table images

Country Status (1)

Country Link
WO (1) WO2022221079A2 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3570208A1 (en) * 2018-05-18 2019-11-20 Sap Se Two-dimensional document processing
WO2020264155A1 (en) * 2019-06-28 2020-12-30 Eygs Llp Apparatus and method for extracting data from lineless tables using delaunay triangulation and excess edge removal
JP6838209B1 (en) * 2019-10-31 2021-03-03 楽天株式会社 Document image analyzer, document image analysis method and program
US20210383106A1 (en) * 2019-10-31 2021-12-09 Rakuten Group, Inc. Document image analysis apparatus, document image analysis method and program thereof

Also Published As

Publication number Publication date
WO2022221079A2 (en) 2022-10-20

Similar Documents

Publication Publication Date Title
US9928225B2 (en) Formula detection engine
US8719029B2 (en) File format, server, viewer device for digital comic, digital comic generation device
US20200364451A1 (en) Representative document hierarchy generation
US20130283157A1 (en) Digital comic viewer device, digital comic viewing system, non-transitory recording medium having viewer program recorded thereon, and digital comic display method
US20130191732A1 (en) Fixed Format Document Conversion Engine
US10304439B2 (en) Image processing device, animation display method and computer readable medium
US20130191366A1 (en) Pattern Matching Engine
CN103902662A (en) Test question generating method based on browser
US20120017144A1 (en) Content analysis apparatus and method
EP2040383A3 (en) Information processing apparatus and encoding method
EP1793317A3 (en) Dynamic data presentation
EP3940621A3 (en) Method for automatically generating advertisement, electronic device, and computer-readable storage medium
US20140258852A1 (en) Detection and Reconstruction of Right-to-Left Text Direction, Ligatures and Diacritics in a Fixed Format Document
JP2014212476A (en) Comic image frame detection device, method and program
WO2022221079A3 (en) Inferring structure information from table images
CN104850819B (en) Information processing method and electronic equipment
CN110929479A (en) Method and device for converting PDF scanning piece, electronic equipment and storage medium
Joy et al. A prototype Malayalam to sign language automatic translator
GB2596452A (en) Systems and methods for generating documents from video content
Goudar et al. A effective communication solution for the hearing impaired persons: A novel approach using gesture and sentence formation
WO2022201236A1 (en) Server, system, image clipping method, and program
KR102283585B1 (en) Method and apparatus of updating a color code included in a image using the artificial intelligence
WO2004015588A3 (en) Electronic document processing
CN207051979U (en) A kind of audio-visual converting system of word
Belenkaia et al. Creation, presentation, capture, and replay of freehand writings in e-lecture scenarios

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22719427; Country of ref document: EP; Kind code of ref document: A2)
WWE WIPO information: entry into national phase (Ref document number: 2022719427; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2022719427; Country of ref document: EP; Effective date: 20231115)