CN109670502A - Training data generation system and method based on Uyghur text recognition - Google Patents
Training data generation system and method based on Uyghur text recognition
- Publication number
- CN109670502A (application number CN201811549818.3A)
- Authority
- CN
- China
- Prior art keywords
- module
- training data
- ocr
- engine
- generates
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/109—Font handling; Temporal or kinetic typography
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/243—Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811549818.3A CN109670502A (zh) | 2018-12-18 | 2018-12-18 | Training data generation system and method based on Uyghur text recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811549818.3A CN109670502A (zh) | 2018-12-18 | 2018-12-18 | Training data generation system and method based on Uyghur text recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109670502A (zh) | 2019-04-23 |
Family
ID=66143955
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811549818.3A Pending CN109670502A (zh) | Training data generation system and method based on Uyghur text recognition | 2018-12-18 | 2018-12-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109670502A (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
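CN111783881A (zh) * | 2020-07-01 | 2020-10-16 | 上海天壤智能科技有限公司 | Scene adaptation learning method and system based on a pre-trained model |
CN112418224A (zh) * | 2021-01-22 | 2021-02-26 | 成都无糖信息技术有限公司 | Machine-learning-based training data generation system and method for general-purpose OCR |
CN112488114A (zh) * | 2020-11-13 | 2021-03-12 | 宁波多牛大数据网络技术有限公司 | Picture synthesis method and device, and character recognition system |
CN114998909A (zh) * | 2022-06-08 | 2022-09-02 | 北京云上曲率科技有限公司 | Method and system for identifying the language of text in images |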
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180101726A1 (en) * | 2016-10-10 | 2018-04-12 | Insurance Services Office Inc. | Systems and Methods for Optical Character Recognition for Low-Resolution Documents |
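CN108090400A (zh) * | 2016-11-23 | 2018-05-29 | 中移(杭州)信息技术有限公司 | Method and device for image text recognition |
CN108154148A (zh) * | 2018-01-22 | 2018-06-12 | 厦门美亚商鼎信息科技有限公司 | Artificial synthesis method for training samples and CAPTCHA recognition method based on the samples |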
Non-Patent Citations (2)
Title |
---|
Ryosuke Odate et al.: "Fast and Accurate Candidate Reduction Using the Multiclass LDA for Japanese/Chinese Character Recognition", 2015 IEEE International Conference on Image Processing (ICIP) * |
Ding Mingyu (丁明宇) et al.: "Deep-learning-based method for recognizing product parameters in images", Journal of Software (软件学报) * |
Similar Documents
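Publication | Title |
---|---|
CN109670502A | Training data generation system and method based on Uyghur text recognition |
KR20200014842A | Image lighting method, apparatus, electronic device, and storage medium |
US20090016617A1 | Sender dependent messaging viewer |
CA2174258A1 | Method and System for Automatic Transcription Correction |
CN112434690A | Method, system, and storage medium for automatic element capture and understanding by dynamically analyzing text image feature phenomena |
CN107358184A | Method and device for extracting text from documents |
CN110554991A | Method for correcting and managing text pictures |
KR20090089793A | Electronic document generating apparatus, electronic document generating method, and storage medium |
CN103854019A | Method and device for extracting fields from images |
CN113592735A | Text page image restoration method and system, electronic device, and computer-readable medium |
CN112036406A | Text extraction method and device for image documents, and electronic device |
CN109657619A | Drawing translation method, device, and storage medium |
CN113239707A | Text translation method, text translation device, and storage medium |
CN110309517B | Emoticon caption processing method, device, system, and storage medium |
CN111881900A | Corpus generation, translation model training, and translation methods, apparatus, device, and medium |
US20110205430A1 | Caption movement processing apparatus and method |
CN114612912A | Image character recognition method, system, and device based on an intelligent corpus |
CN112836467B | Image processing method and device |
CN111241845B | Method and device for automatic identification of financial accounts based on semantic matching |
CN113435426B | Data augmentation method, apparatus, device, and storage medium for OCR recognition |
CN114120334A | Braille processing method and device, storage medium, and electronic device |
JP5604276B2 | Document image generating apparatus and document image generating method |
CN107609195A | Question search method and device |
CN115830612A | Method, apparatus, device, and storage medium for generating OCR training data |
CN110543238A | Artificial-intelligence-based desktop interaction method |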
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 2022-05-17. Address after: 518000 22nd floor, building C, Shenzhen International Innovation Center (Futian science and Technology Plaza), No. 1006, Shennan Avenue, Xintian community, Huafu street, Futian District, Shenzhen, Guangdong Province. Applicant after: Shenzhen wanglian Anrui Network Technology Co.,Ltd. Address before: Floor 4-8, unit 5, building 1, 333 Yunhua Road, high tech Zone, Chengdu, Sichuan 610041. Applicant before: CHENGDU 30KAITIAN COMMUNICATION INDUSTRY Co.,Ltd. |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2019-04-23 |