JP2022543052A - Document processing method, document processing apparatus, document processing device, computer-readable storage medium, and computer program - Google Patents
- Publication number
- JP2022543052A (application JP2022506431A)
- Authority
- JP
- Japan
- Prior art keywords
- document
- features
- processed
- type
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
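CN202010610080.8 | 2020-06-29 | | |
CN202010610080.8A, CN111782808A (zh) | 2020-06-29 | 2020-06-29 | Document processing method, apparatus, device, and computer-readable storage medium |
PCT/CN2021/099799, WO2022001637A1 (zh) | | 2021-06-11 | Document processing method, apparatus, device, and computer-readable storage medium |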
Publications (1)
Publication Number | Publication Date |
---|---|
JP2022543052A (ja) | 2022-10-07 |
Family
ID=72760274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2022506431A (Pending), JP2022543052A (ja) | Document processing method, document processing apparatus, document processing device, computer-readable storage medium, and computer program | 2020-06-29 | 2021-06-11 |
Country Status (4)
Country | Link |
---|---|
JP (1) | JP2022543052A (ja) |
KR (1) | KR20220031097A (ko) |
CN (1) | CN111782808A (zh) |
WO (1) | WO2022001637A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
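CN111782808A (zh) * | 2020-06-29 | 2020-10-16 | 北京市商汤科技开发有限公司 | Document processing method, apparatus, device, and computer-readable storage medium |
CN112861757B (zh) * | 2021-02-23 | 2022-11-22 | 天津汇智星源信息技术有限公司 | Intelligent review method for written records based on text semantic understanding, and electronic device |
CN113051396B (zh) * | 2021-03-08 | 2023-11-17 | 北京百度网讯科技有限公司 | Document classification and recognition method, apparatus, and electronic device |
CN113297951A (zh) * | 2021-05-20 | 2021-08-24 | 北京市商汤科技开发有限公司 | Document processing method, apparatus, device, and computer-readable storage medium |
CN113742483A (zh) * | 2021-08-27 | 2021-12-03 | 北京百度网讯科技有限公司 | Document classification method, apparatus, electronic device, and storage medium |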
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
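JP2000285190A (ja) * | 1999-03-31 | 2000-10-13 | Toshiba Corp | Form identification method, form identification apparatus, and storage medium |
JP2015111467A (ja) * | 2015-03-12 | 2015-06-18 | 株式会社東芝 | Handwritten character retrieval apparatus, method, and program |
WO2019052403A1 (zh) * | 2017-09-12 | 2019-03-21 | 腾讯科技(深圳)有限公司 | Training method for image-text matching model, bidirectional search method, and related apparatus |
CN110298338A (zh) * | 2019-06-20 | 2019-10-01 | 北京易道博识科技有限公司 | Document image classification method and apparatus |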
WO2020113468A1 (en) * | 2018-12-05 | 2020-06-11 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for grounding a target video clip in a video |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10354009B2 (en) * | 2016-08-24 | 2019-07-16 | Microsoft Technology Licensing, Llc | Characteristic-pattern analysis of text |
US10936970B2 (en) * | 2017-08-31 | 2021-03-02 | Accenture Global Solutions Limited | Machine learning document processing |
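CN110390094B (zh) * | 2018-04-20 | 2023-05-23 | 伊姆西Ip控股有限责任公司 | Method for classifying documents, electronic device, and computer program product |
CN109033478B (zh) * | 2018-09-12 | 2022-08-19 | 重庆工业职业技术学院 | Text information pattern analysis method and system for search engines |
CN109344815B (zh) * | 2018-12-13 | 2021-08-13 | 深源恒际科技有限公司 | Document image classification method |
CN110008944B (zh) * | 2019-02-20 | 2024-02-13 | 平安科技(深圳)有限公司 | OCR recognition method and apparatus based on template matching, and storage medium |
CN110866116A (zh) * | 2019-10-25 | 2020-03-06 | 远光软件股份有限公司 | Policy document processing method, apparatus, storage medium, and electronic device |
CN111782808A (zh) * | 2020-06-29 | 2020-10-16 | 北京市商汤科技开发有限公司 | Document processing method, apparatus, device, and computer-readable storage medium |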
2020
- 2020-06-29: CN application CN202010610080.8A (CN111782808A, zh), active, Pending
2021
- 2021-06-11: JP application JP2022506431A (JP2022543052A, ja), active, Pending
- 2021-06-11: WO application PCT/CN2021/099799 (WO2022001637A1, zh), active, Application Filing
- 2021-06-11: KR application KR1020227004409A (KR20220031097A, ko), not active, Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
KR20220031097A (ko) | 2022-03-11 |
WO2022001637A1 (zh) | 2022-01-06 |
CN111782808A (zh) | 2020-10-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
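JP2022543052A (ja) | Document processing method, document processing apparatus, document processing device, computer-readable storage medium, and computer program | |
CN107209861B (zh) | Optimizing multi-class multimedia data classification using negative data | |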
US10558885B2 (en) | Determination method and recording medium | |
Kalsum et al. | Emotion recognition from facial expressions using hybrid feature descriptors | |
US9864928B2 (en) | Compact and robust signature for large scale visual search, retrieval and classification | |
Kouw et al. | Feature-level domain adaptation | |
US10013637B2 (en) | Optimizing multi-class image classification using patch features | |
Oliveira et al. | Automatic graphic logo detection via fast region-based convolutional networks | |
US8606022B2 (en) | Information processing apparatus, method and program | |
US20200065573A1 (en) | Generating variations of a known shred | |
Gao et al. | The labeled multiple canonical correlation analysis for information fusion | |
CN105631466B (zh) | Image classification method and apparatus | |
US20170076152A1 (en) | Determining a text string based on visual features of a shred | |
CN111324874B (zh) | Method and apparatus for identifying the authenticity of a certificate | |
CN111340057B (zh) | Method and apparatus for training a classification model | |
Sharma et al. | Multimodal classification using feature level fusion and SVM | |
JP2004178569A (ja) | Data classification apparatus, object recognition apparatus, data classification method, and object recognition method | |
Duan | Characters recognition of binary image using KNN | |
CN112380369B (zh) | Training method, apparatus, device, and storage medium for an image retrieval model | |
Barbosa et al. | Automatic voice recognition system based on multiple Support Vector Machines and mel-frequency cepstral coefficients | |
Kim et al. | An improved license plate recognition technique in outdoor image | |
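CN113297951A (zh) | Document processing method, apparatus, device, and computer-readable storage medium | |
CN110852206A (zh) | Scene recognition method and apparatus combining global and local features | |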
US20140119641A1 (en) | Character recognition apparatus, character recognition method, and computer-readable medium | |
JP2007188190A (ja) | Pattern recognition apparatus, pattern recognition method, pattern recognition program, and recording medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
2022-01-31 | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20220131 |
2022-01-31 | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20220131 |
2022-11-15 | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20221115 |
2023-06-13 | A02 | Decision of refusal | Free format text: JAPANESE INTERMEDIATE CODE: A02; Effective date: 20230613 |