CA3253894A1 - AUTOMATED KEY-VALUE PAIR EXTRACTION - Google Patents
Info
- Publication number
- CA3253894A1
- Authority
- CA
- Canada
- Prior art keywords
- document
- node
- key
- characters
- keys
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Creation or modification of classes or clusters
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Character Input (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/685,328 US12154356B2 (en) | 2022-03-02 | 2022-03-02 | Automated key-value pair extraction |
| US17/685,328 | 2022-03-02 | | |
| PCT/US2023/013970 WO2023167824A1 (en) | 2022-03-02 | 2023-02-27 | Automated key-value pair extraction |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CA3253894A1 (en) | 2023-09-07 |
Family
ID=87850837
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CA3253894A (Pending), CA3253894A1 (en) | AUTOMATED KEY-VALUE PAIR EXTRACTION | 2022-03-02 | 2023-02-27 |
Country Status (8)
| Country | Link |
|---|---|
| US (2) | US12154356B2 |
| EP (1) | EP4487220A4 |
| JP (1) | JP2025507838A |
| KR (1) | KR20240157071A |
| CN (1) | CN118786420A |
| AU (1) | AU2023227770A1 |
| CA (1) | CA3253894A1 |
| WO (1) | WO2023167824A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR102905894B1 (ko) * | 2025-05-13 | 2025-12-31 | 이지자산평가주식회사 | Method and apparatus for analyzing table data using an artificial intelligence model |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8443278B2 (en) | 2009-01-02 | 2013-05-14 | Apple Inc. | Identification of tables in an unstructured document |
| US9645999B1 (en) | 2016-08-02 | 2017-05-09 | Quid, Inc. | Adjustment of document relationship graphs |
| US11256760B1 (en) * | 2018-09-28 | 2022-02-22 | Automation Anywhere, Inc. | Region adjacent subgraph isomorphism for layout clustering in document images |
| US10713524B2 (en) * | 2018-10-10 | 2020-07-14 | Microsoft Technology Licensing, Llc | Key value extraction from documents |
| US10878234B1 (en) * | 2018-11-20 | 2020-12-29 | Amazon Technologies, Inc. | Automated form understanding via layout agnostic identification of keys and corresponding values |
| CN114005123B (zh) | 2021-10-11 | 2024-05-24 | 北京大学 | System and method for digital reconstruction of printed text layout |
| US12039798B2 (en) * | 2021-11-01 | 2024-07-16 | Salesforce, Inc. | Processing forms using artificial intelligence models |
2022
- 2022-03-02: US, application US17/685,328, US12154356B2 (en), Active

2023
- 2023-02-27: CA, application CA3253894A, CA3253894A1 (en), Pending
- 2023-02-27: KR, application KR1020247032812A, KR20240157071A (ko), Pending
- 2023-02-27: AU, application AU2023227770A, AU2023227770A1 (en), Pending
- 2023-02-27: JP, application JP2024551948A, JP2025507838A (ja), Pending
- 2023-02-27: CN, application CN202380024039.4A, CN118786420A (zh), Pending
- 2023-02-27: EP, application EP23763837.4A, EP4487220A4 (en), Pending
- 2023-02-27: WO, application PCT/US2023/013970, WO2023167824A1 (en), Ceased

2024
- 2024-10-24: US, application US18/925,781, US20250046107A1 (en), Pending
Also Published As
| Publication number | Publication date |
|---|---|
| EP4487220A4 (en) | 2025-12-24 |
| CN118786420A (zh) | 2024-10-15 |
| US12154356B2 (en) | 2024-11-26 |
| WO2023167824A1 (en) | 2023-09-07 |
| AU2023227770A1 (en) | 2024-09-12 |
| US20250046107A1 (en) | 2025-02-06 |
| EP4487220A1 (en) | 2025-01-08 |
| KR20240157071A (ko) | 2024-10-31 |
| JP2025507838A (ja) | 2025-03-21 |
| US20230282013A1 (en) | 2023-09-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11869263B2 (en) | | Automated classification and interpretation of life science documents |
| CN106104570B (zh) | | Detecting and extracting image document components to create flow documents |
| JP6827116B2 (ja) | | Web page clustering method and apparatus |
| US11256912B2 (en) | | Electronic form identification using spatial information |
| US12248794B2 (en) | | Self-supervised system for learning a user interface language |
| KR102682244B1 (ko) | | Method for training a machine learning model on structured ESG data using an ESG assistance tool, and service server for generating auto-completed ESG documents with the machine learning model |
| EP4302227A1 (en) | | System and method for automated document analysis |
| EP3175375A1 (en) | | Image based search to identify objects in documents |
| US20250046107A1 (en) | | Automated key-value pair extraction |
| US11687578B1 (en) | | Systems and methods for classification of data streams |
| CA3254041A1 (en) | | MODULAR VECTOR SYSTEM (MODVEC): PLATFORM FOR BUILDING NEXT-GENERATION EXPRESSION VECTORS |
| CN116560819A (zh) | | RPA-based batch automated operation method, system, device, and storage medium |
| CN113920509A (zh) | | Target page display method and apparatus, computer device, and storage medium |
| US12437008B1 (en) | | Resolving latent status from dense information using machine learning |
| US12536679B2 (en) | | Application matching method and application matching device |
| US20240233426A9 (en) | | Method of classifying a document for a straight-through processing |
| CN118968187A (zh) | | Image violation detection method, apparatus, system, and medium based on a multi-granularity model |
| CN117370817A (zh) | | Data processing method, apparatus, device, medium, and program product |
| CN117215947A (zh) | | Page white-screen detection method and apparatus, computer device, and storage medium |
| Vujic | | Quality Assurance Workflow, Release 2+ Release Report |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A00 | Application filed | ST27 status event code: A-0-1-A10-A00-A101 (as provided by the national office); event text: APPLICATION RECEIVED - PCT; effective date: 2024-08-30 |
| | A00 | Application filed | ST27 status event code: A-1-1-A10-A00-A102 (as provided by the national office); event text: COMPLIANCE REQUIREMENTS DETERMINED MET; effective date: 2025-01-21 |
| | A15 | PCT application entered into the national or regional phase | ST27 status event code: A-1-1-A10-A15-X000 (as provided by the national office); event text: NATIONAL ENTRY REQUIREMENTS DETERMINED COMPLIANT; effective date: 2025-01-21 |
| | P18 | Priority claim added or amended | ST27 status event code: A-1-1-P10-P18-P105 (as provided by the national office); event text: PRIORITY CLAIM REQUIREMENTS DETERMINED COMPLIANT; effective date: 2025-01-21 |
| | W00 | Other event occurred | ST27 status event code: A-1-1-W10-W00-W100 (as provided by the national office); event text: LETTER SENT; effective date: 2025-01-31 |
| | MFA | Maintenance fee for application paid | Fee description: MF (APPLICATION, 2ND ANNIV.) - STANDARD; year of fee payment: 2 |
| | U00 | Fee paid | ST27 status event code: A-1-1-U10-U00-U101 (as provided by the national office); event text: MAINTENANCE REQUEST RECEIVED; effective date: 2025-02-27 |
| | U11 | Full renewal or maintenance fee paid | ST27 status event code: A-1-1-U10-U11-U102 (as provided by the national office); event text: MAINTENANCE FEE PAYMENT DETERMINED COMPLIANT; effective date: 2025-02-27 |
| | U11 | Full renewal or maintenance fee paid | ST27 status event code: A-1-1-U10-U11-U102 (as provided by the national office); event text: MAINTENANCE FEE PAYMENT PAID IN FULL; effective date: 2025-02-27 |
| | R00 | Party data change recorded | ST27 status event code: A-1-1-R10-R00-R113 (as provided by the national office); event text: CHANGE OF ADDRESS OR METHOD OF CORRESPONDENCE REQUEST RECEIVED; effective date: 2025-04-06 |
| | R18 | Changes to party contact information recorded | ST27 status event code: A-1-1-R10-R18-R114 (as provided by the national office); event text: CHANGE OF METHOD OF CORRESPONDENCE REQUIREMENTS DETERMINED COMPLIANT; effective date: 2025-12-10 |
| | R18 | Changes to party contact information recorded | ST27 status event code: A-1-1-R10-R18-R143 (as provided by the national office); event text: CHANGE OF ADDRESS REQUIREMENTS DETERMINED COMPLIANT; effective date: 2025-12-10 |
| | W00 | Other event occurred | ST27 status event code: A-1-1-W10-W00-W111 (as provided by the national office); event text: CORRESPONDENT DETERMINED COMPLIANT; effective date: 2025-12-10 |
| | MFA | Maintenance fee for application paid | Fee description: MF (APPLICATION, 3RD ANNIV.) - STANDARD; year of fee payment: 3 |
| | U00 | Fee paid | ST27 status event code: A-1-1-U10-U00-U101 (as provided by the national office); event text: MAINTENANCE REQUEST RECEIVED; effective date: 2026-02-19 |
| | U11 | Full renewal or maintenance fee paid | ST27 status event code: A-1-1-U10-U11-U102 (as provided by the national office); event text: MAINTENANCE FEE PAYMENT PAID IN FULL; effective date: 2026-02-19 |