CN107730511B - A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation - Google Patents
- Publication number
- CN107730511B (application CN201710849135.9A; also published as CN107730511A)
- Authority
- CN
- China
- Prior art keywords
- image
- tibetan
- line
- baseline
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30176—Document
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Input (AREA)
- Processing Or Creating Images (AREA)
Abstract
Description
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710849135.9A (CN107730511B) | 2017-09-20 | 2017-09-20 | A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710849135.9A (CN107730511B) | 2017-09-20 | 2017-09-20 | A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107730511A | 2018-02-23 |
CN107730511B | 2020-10-27 |
Family
ID=61206549
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710849135.9A (Active; CN107730511B) | A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation | 2017-09-20 | 2017-09-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107730511B (zh) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
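CN108710601B * | 2018-05-14 | 2022-04-01 | Guangzhou Tencent Technology Co., Ltd. | Text display method and device, storage medium, and electronic device |
CN110032938B * | 2019-03-12 | 2021-02-19 | Beijing Hanwang Digital Technology Co., Ltd. | Tibetan recognition method and apparatus, and electronic device |
CN113269181A * | 2020-02-14 | 2021-08-17 | Fujitsu Limited | Information processing apparatus, information processing method, and computer-readable recording medium |
CN114842485B * | 2022-04-26 | 2023-06-27 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Subtitle removal method and apparatus, and electronic device |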
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
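CN1570958A * | 2004-04-23 | 2005-01-26 | Tsinghua University | Method for recognizing printed Tibetan characters in multiple fonts and sizes |
CN1741035A * | 2005-09-23 | 2006-03-01 | Tsinghua University | Text segmentation method for printed Arabic character sets |
US7471826B1 (en) * | 2008-03-31 | 2008-12-30 | International Business Machines Corporation | Character segmentation by slices |
CN102930277A * | 2012-09-19 | 2013-02-13 | Shanghai Zhendao Information Technology Co., Ltd. | Character-image CAPTCHA recognition method based on recognition feedback |
US8542926B2 (en) * | 2010-11-19 | 2013-09-24 | Microsoft Corporation | Script-agnostic text reflow for document images |
CN105354571A * | 2015-10-23 | 2016-02-24 | Institute of Automation, Chinese Academy of Sciences | Baseline estimation method for distorted text images based on curve projection |
CN106056055A * | 2016-05-24 | 2016-10-26 | Northwest Minzu University | Method for generating online handwritten samples of Sanskrit-transliteration Tibetan based on component combination |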
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
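CN1570958A * | 2004-04-23 | 2005-01-26 | Tsinghua University | Method for recognizing printed Tibetan characters in multiple fonts and sizes |
CN1741035A * | 2005-09-23 | 2006-03-01 | Tsinghua University | Text segmentation method for printed Arabic character sets |
US7471826B1 (en) * | 2008-03-31 | 2008-12-30 | International Business Machines Corporation | Character segmentation by slices |
US8542926B2 (en) * | 2010-11-19 | 2013-09-24 | Microsoft Corporation | Script-agnostic text reflow for document images |
CN102930277A * | 2012-09-19 | 2013-02-13 | Shanghai Zhendao Information Technology Co., Ltd. | Character-image CAPTCHA recognition method based on recognition feedback |
CN105354571A * | 2015-10-23 | 2016-02-24 | Institute of Automation, Chinese Academy of Sciences | Baseline estimation method for distorted text images based on curve projection |
CN106056055A * | 2016-05-24 | 2016-10-26 | Northwest Minzu University | Method for generating online handwritten samples of Sanskrit-transliteration Tibetan based on component combination |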
Non-Patent Citations (2)
Title |
---|
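"Research on printed Tibetan character recognition technology"; Ou Zhu (欧珠) et al.; Computer Engineering and Applications (《计算机工程与应用》); 2009-08-21; Vol. 45, No. 24; full text * |
"Tibetan character recognition based on geometric shape analysis"; Zhou Wei (周纬) et al.; 5th National Conference on Geometric Design and Computing (《第五届全国几何设计与计算学术会》); 2011-11-11; full text * |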
Also Published As
Publication number | Publication date |
---|---|
CN107730511A (zh) | 2018-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107730511B | A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation | |
Antonacopoulos et al. | ICDAR 2009 page segmentation competition | |
US8908961B2 (en) | System and methods for arabic text recognition based on effective arabic text feature extraction | |
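CN104298982B | Character recognition method and apparatus | |
CN102663378B | Method for recognizing cursive (joined-stroke) handwritten characters | |
CN106503711A | Character recognition method | |
CN101515325A | Method for extracting characters from digital video based on character segmentation and color clustering | |
CN108830270B | Method for locating the central axis of each recognized Manchu word for correct segmentation of Manchu words | |
CN112818952B | Method and apparatus for identifying the coal-rock boundary line, and electronic device | |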
Van Phan et al. | Development of Nom character segmentation for collecting patterns from historical document pages | |
Valy et al. | Line segmentation approach for ancient palm leaf manuscripts using competitive learning algorithm | |
CN102136074B | MMI-based method for wood image texture analysis and recognition | |
Kaundilya et al. | Automated text extraction from images using OCR system | |
CN116824608A | Answer sheet layout analysis method based on object detection technology | |
CN106778752A | Character recognition method | |
Verma et al. | Removal of obstacles in Devanagari script for efficient optical character recognition | |
Zhan et al. | A robust split-and-merge text segmentation approach for images | |
Xue | Optical character recognition | |
Aravinda et al. | Template matching method for Kannada handwritten recognition based on correlation analysis | |
Modi et al. | Text line detection and segmentation in Handwritten Gurumukhi Scripts | |
Ahmed et al. | Enhancing the character segmentation accuracy of bangla ocr using bpnn | |
CN110298350B | Efficient word segmentation algorithm for printed Uyghur | |
Refaey | Ruled lines detection and removal in grey level handwritten image documents | |
Hashrin et al. | Segmenting Characters from Malayalam Handwritten Documents | |
KR102064974B1 | Blob-based character recognition method and apparatus therefor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
EE01 | Entry into force of recordation of patent licensing contract | ||
Application publication date: 2018-02-23; Assignee: Luoyang Wuhuang Peony Culture Development Co.,Ltd.; Assignor: Beijing University of Technology; Contract record no.: X2024980000224; Denomination of invention: A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation; Granted publication date: 2020-10-27; License type: Common License; Record date: 2024-01-05
Application publication date: 2018-02-23; Assignee: LUOYANG PEONY HARMONY TECHNOLOGY CO.,LTD.; Assignor: Beijing University of Technology; Contract record no.: X2024980000181; Denomination of invention: A Line Segmentation Method for Tibetan Historical Literature Text Based on Baseline Estimation; Granted publication date: 2020-10-27; License type: Common License; Record date: 2024-01-05 |