KR20220122566A - Training method for text recognition model, text recognition method and apparatus - Google Patents
- Publication number
- KR20220122566A (application KR1020220101802A)
- Authority
- KR
- South Korea
- Prior art keywords
- image
- text
- training
- target
- recognition
- Prior art date
Links
Concept ID | Name | Type | Score | Sections | Count |
---|---|---|---|---|---|
238000012549 | training | Methods | 0.000 | title, claims, abstract, description | 197 |
238000000034 | method | Methods | 0.000 | title, claims, abstract, description | 104 |
230000015654 | memory | Effects | 0.000 | claims, description | 25 |
238000004590 | computer program | Methods | 0.000 | claims, description | 24 |
238000000605 | extraction | Methods | 0.000 | claims, description | 5 |
108010001267 | Protein Subunits | Proteins | 0.000 | claims | 2 |
238000005516 | engineering process | Methods | 0.000 | abstract, description | 14 |
238000012015 | optical character recognition | Methods | 0.000 | abstract, description | 11 |
238000013135 | deep learning | Methods | 0.000 | abstract, description | 5 |
238000013473 | artificial intelligence | Methods | 0.000 | abstract, description | 4 |
238000012545 | processing | Methods | 0.000 | description | 16 |
238000001514 | detection method | Methods | 0.000 | description | 12 |
238000004458 | analytical method | Methods | 0.000 | description | 11 |
238000004891 | communication | Methods | 0.000 | description | 9 |
238000013528 | artificial neural network | Methods | 0.000 | description | 7 |
230000000306 | recurrent effect | Effects | 0.000 | description | 7 |
238000002372 | labelling | Methods | 0.000 | description | 6 |
230000008569 | process | Effects | 0.000 | description | 6 |
238000013527 | convolutional neural network | Methods | 0.000 | description | 4 |
230000006870 | function | Effects | 0.000 | description | 4 |
238000010586 | diagram | Methods | 0.000 | description | 3 |
238000012986 | modification | Methods | 0.000 | description | 3 |
230000004048 | modification | Effects | 0.000 | description | 3 |
230000003287 | optical effect | Effects | 0.000 | description | 3 |
238000000926 | separation method | Methods | 0.000 | description | 3 |
230000003993 | interaction | Effects | 0.000 | description | 2 |
238000006467 | substitution reaction | Methods | 0.000 | description | 2 |
230000001502 | supplementing effect | Effects | 0.000 | description | 2 |
230000001360 | synchronised effect | Effects | 0.000 | description | 2 |
230000000007 | visual effect | Effects | 0.000 | description | 2 |
101000822695 | Clostridium perfringens (strain 13 / Type A) Small, acid-soluble spore protein C1 | Proteins | 0.000 | description | 1 |
101000655262 | Clostridium perfringens (strain 13 / Type A) Small, acid-soluble spore protein C2 | Proteins | 0.000 | description | 1 |
101000655256 | Paraclostridium bifermentans Small, acid-soluble spore protein alpha | Proteins | 0.000 | description | 1 |
101000655264 | Paraclostridium bifermentans Small, acid-soluble spore protein beta | Proteins | 0.000 | description | 1 |
238000013459 | approach | Methods | 0.000 | description | 1 |
238000003491 | array | Methods | 0.000 | description | 1 |
230000002457 | bidirectional effect | Effects | 0.000 | description | 1 |
230000005540 | biological transmission | Effects | 0.000 | description | 1 |
230000001413 | cellular effect | Effects | 0.000 | description | 1 |
238000013461 | design | Methods | 0.000 | description | 1 |
239000004973 | liquid crystal related substance | Substances | 0.000 | description | 1 |
238000010801 | machine learning | Methods | 0.000 | description | 1 |
238000003062 | neural network model | Methods | 0.000 | description | 1 |
239000013307 | optical fiber | Substances | 0.000 | description | 1 |
230000003252 | repetitive effect | Effects | 0.000 | description | 1 |
230000011218 | segmentation | Effects | 0.000 | description | 1 |
239000004065 | semiconductor | Substances | 0.000 | description | 1 |
230000003068 | static effect | Effects | 0.000 | description | 1 |
230000002123 | temporal effect | Effects | 0.000 | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G06T5/004—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
- G06T5/75—Unsharp masking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/16—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/1801—Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- Mathematical Physics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Image Analysis (AREA)
- Character Discrimination (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210279539.XA CN114399769B (zh) | 2022-03-22 | 2022-03-22 | Training method for text recognition model, text recognition method and apparatus |
CN202210279539.X | 2022-03-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20220122566A (ko) | 2022-09-02 |
Family
ID=81234744
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020220101802A KR20220122566A (ko) | 2022-03-22 | 2022-08-16 | Training method for text recognition model, text recognition method and apparatus |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP2022177242A (zh) |
KR (1) | KR20220122566A (zh) |
CN (2) | CN114399769B (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
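WO2024063437A1 (ko) * | 2022-09-22 | 2024-03-28 | 쿠팡 주식회사 | Method and apparatus for managing artificial intelligence model |
CN117786417A (zh) * | 2024-02-28 | 2024-03-29 | 之江实验室 | Model training method, transient source identification method, apparatus and electronic device |
CN118366011A (zh) * | 2024-06-19 | 2024-07-19 | 温州电力建设有限公司 | Model training and underground cable duct defect identification methods, product and device |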
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
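CN114399769B (zh) * | 2022-03-22 | 2022-08-02 | 北京百度网讯科技有限公司 | Training method for text recognition model, text recognition method and apparatus |
CN114863450B (zh) * | 2022-05-19 | 2023-05-16 | 北京百度网讯科技有限公司 | Image processing method, apparatus, electronic device and storage medium |
CN114972910B (zh) * | 2022-05-20 | 2023-05-23 | 北京百度网讯科技有限公司 | Training method for image-text recognition model, apparatus, electronic device and storage medium |
CN115661829A (zh) * | 2022-10-26 | 2023-01-31 | 阿里巴巴(中国)有限公司 | Image-text recognition method and data processing method for image-text recognition model |
WO2024108472A1 (zh) * | 2022-11-24 | 2024-05-30 | 北京京东方技术开发有限公司 | Model training method and apparatus, text image processing method, device and medium |
CN116012650B (zh) * | 2023-01-03 | 2024-04-23 | 北京百度网讯科技有限公司 | Character recognition model training and recognition method, apparatus, device and medium |
CN116189198B (zh) * | 2023-01-06 | 2024-06-28 | 北京百度网讯科技有限公司 | Text recognition model training method, text recognition method, apparatus and storage medium |
CN116229480B (zh) * | 2023-01-10 | 2024-05-28 | 北京百度网讯科技有限公司 | Text recognition model training method, text recognition method, apparatus and storage medium |
CN116363663A (zh) * | 2023-04-03 | 2023-06-30 | 北京百度网讯科技有限公司 | Image processing method, image recognition method and apparatus |
CN116884003B (zh) * | 2023-07-18 | 2024-03-22 | 南京领行科技股份有限公司 | Automatic picture annotation method, apparatus, electronic device and storage medium |
CN117115306B (zh) * | 2023-08-30 | 2024-07-12 | 苏州畅行智驾汽车科技有限公司 | Image generation method, apparatus, electronic device and storage medium |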
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
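CN108549893B (zh) * | 2018-04-04 | 2020-03-31 | 华中科技大学 | End-to-end recognition method for scene text of arbitrary shape |
US11093560B2 (en) * | 2018-09-21 | 2021-08-17 | Microsoft Technology Licensing, Llc | Stacked cross-modal matching |
EP3754549B1 (en) * | 2019-06-17 | 2021-09-22 | Sap Se | A computer vision method for recognizing an object category in a digital image |
CN111461203A (zh) * | 2020-03-30 | 2020-07-28 | 北京百度网讯科技有限公司 | Cross-modal processing method, apparatus, electronic device and computer storage medium |
CN112016543B (zh) * | 2020-07-24 | 2024-09-20 | 华为技术有限公司 | Text recognition network, neural network training method and related device |
CN113792113A (zh) * | 2020-07-31 | 2021-12-14 | 北京京东尚科信息技术有限公司 | Vision-language model acquisition and task processing method, apparatus, device and medium |
CN113537186A (zh) * | 2020-12-04 | 2021-10-22 | 腾讯科技(深圳)有限公司 | Text image recognition method, apparatus, electronic device and storage medium |
CN112541501B (zh) * | 2020-12-18 | 2021-09-07 | 北京中科研究院 | Scene text recognition method based on visual language modeling network |
CN112801085A (zh) * | 2021-02-09 | 2021-05-14 | 沈阳麟龙科技股份有限公司 | Method, apparatus, medium and electronic device for recognizing text in an image |
CN112883953B (zh) * | 2021-02-22 | 2022-10-28 | 中国工商银行股份有限公司 | Card recognition apparatus and method based on joint learning |
CN113378833B (zh) * | 2021-06-25 | 2023-09-01 | 北京百度网讯科技有限公司 | Image recognition model training method, image recognition method, apparatus and electronic device |
CN113435529B (zh) * | 2021-07-06 | 2023-11-07 | 北京百度网讯科技有限公司 | Model pre-training method, model training method and image processing method |
CN113657390B (zh) * | 2021-08-13 | 2022-08-12 | 北京百度网讯科技有限公司 | Training method for text detection model, text detection method, apparatus and device |
CN113657399B (zh) * | 2021-08-18 | 2022-09-27 | 北京百度网讯科技有限公司 | Training method for character recognition model, character recognition method and apparatus |
CN114120305B (zh) * | 2021-11-26 | 2023-07-07 | 北京百度网讯科技有限公司 | Training method for text classification model, text content recognition method and apparatus |
CN114155543B (zh) * | 2021-12-08 | 2022-11-29 | 北京百度网讯科技有限公司 | Neural network training method, document image understanding method, apparatus and device |
CN114399769B (zh) * | 2022-03-22 | 2022-08-02 | 北京百度网讯科技有限公司 | Training method for text recognition model, text recognition method and apparatus |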
-
2022
- 2022-03-22 CN CN202210279539.XA patent/CN114399769B/zh active Active
- 2022-03-22 CN CN202210685043.2A patent/CN115035538B/zh active Active
- 2022-08-16 KR KR1020220101802A patent/KR20220122566A/ko unknown
- 2022-09-27 JP JP2022153452A patent/JP2022177242A/ja not_active Ceased
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
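WO2024063437A1 (ko) * | 2022-09-22 | 2024-03-28 | 쿠팡 주식회사 | Method and apparatus for managing artificial intelligence model |
CN117786417A (zh) * | 2024-02-28 | 2024-03-29 | 之江实验室 | Model training method, transient source identification method, apparatus and electronic device |
CN117786417B (zh) * | 2024-02-28 | 2024-05-10 | 之江实验室 | Model training method, transient source identification method, apparatus and electronic device |
CN118366011A (zh) * | 2024-06-19 | 2024-07-19 | 温州电力建设有限公司 | Model training and underground cable duct defect identification methods, product and device |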
Also Published As
Publication number | Publication date |
---|---|
JP2022177242A (ja) | 2022-11-30 |
CN115035538B (zh) | 2023-04-07 |
CN114399769B (zh) | 2022-08-02 |
CN114399769A (zh) | 2022-04-26 |
CN115035538A (zh) | 2022-09-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
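KR20220122566A (ko) | | Training method for text recognition model, text recognition method and apparatus |
JP7406606B2 (ja) | | Training method for text recognition model, text recognition method and apparatus |
US11610384B2 (en) | | Zero-shot object detection |
US20240265718A1 (en) | | Method of training text detection model, method of detecting text, and device |
EP3913542A2 (en) | | Method and apparatus of training model, device, medium, and program product |
WO2022227769A1 (zh) | | Training method for lane line detection model, apparatus, electronic device and storage medium |
US20220415072A1 (en) | | Image processing method, text recognition method and apparatus |
US20230009547A1 (en) | | Method and apparatus for detecting object based on video, electronic device and storage medium |
CN111291882A (zh) | | Model conversion method, apparatus, device and computer storage medium |
CN116152833B (zh) | | Training method for image-based table restoration model and table restoration method |
CN114863437B (zh) | | Text recognition method, apparatus, electronic device and storage medium |
EP4191544A1 (en) | | Method and apparatus for recognizing token, electronic device and storage medium |
US20230245429A1 (en) | | Method and apparatus for training lane line detection model, electronic device and storage medium |
CN117830580A (zh) | | Image generation method, apparatus, electronic device and storage medium |
CN112507705B (zh) | | Position encoding generation method, apparatus and electronic device |
CN112948584A (zh) | | Short text classification method, apparatus, device and storage medium |
US20230027813A1 (en) | | Object detecting method, electronic device and storage medium |
CN114863450B (zh) | | Image processing method, apparatus, electronic device and storage medium |
US20220188163A1 (en) | | Method for processing data, electronic device and storage medium |
CN113343979B (zh) | | Method, apparatus, device, medium and program product for training a model |
CN116434000A (zh) | | Model training and item classification method, apparatus, storage medium and electronic device |
CN115565186A (zh) | | Training method for character recognition model, apparatus, electronic device and storage medium |
CN113139463B (zh) | | Method, apparatus, device, medium and program product for training a model |
CN114973333A (zh) | | Human-object interaction detection method, apparatus, device and storage medium |
CN114707017A (zh) | | Visual question answering method, apparatus, electronic device and storage medium |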