KR20220122566A - Method for training text recognition model, text recognition method, and apparatus - Google Patents

Method for training text recognition model, text recognition method, and apparatus

Info

Publication number
KR20220122566A
Authority
KR
South Korea
Prior art keywords
image
text
training
target
recognition
Prior art date
Application number
KR1020220101802A
Other languages
English (en)
Korean (ko)
Inventor
청콴 장
위에천 위
위린 리
지안지안 차오
샤멍 친
쿤 야오
준위 한
징투오 리우
얼루이 딩
징동 왕
Original Assignee
Beijing Baidu Netcom Science and Technology Co., Ltd.
Priority date
2022-03-22
Filing date
2022-08-16
Publication date
2022-09-02
Application filed by Beijing Baidu Netcom Science and Technology Co., Ltd.
Publication of KR20220122566A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/19: Recognition using electronic means
    • G06V30/191: Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06T5/004
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/73: Deblurring; Sharpening
    • G06T5/75: Unsharp masking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/16: Image preprocessing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/18: Extraction of features or characteristics of the image
    • G06V30/1801: Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)
KR1020220101802A 2022-03-22 2022-08-16 Method for training text recognition model, text recognition method, and apparatus KR20220122566A (ko)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210279539.XA CN114399769B (zh) 2022-03-22 2022-03-22 Training method for text recognition model, text recognition method, and apparatus
CN202210279539.X 2022-03-22

Publications (1)

Publication Number Publication Date
KR20220122566A (ko) 2022-09-02

Family

ID=81234744

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020220101802A KR20220122566A (ko) Method for training text recognition model, text recognition method, and apparatus 2022-03-22 2022-08-16

Country Status (3)

Country Link
JP (1) JP2022177242A (zh)
KR (1) KR20220122566A (zh)
CN (2) CN114399769B (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
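WO2024063437A1 (ko) * 2022-09-22 2024-03-28 Coupang Corp. Method and device for managing artificial intelligence model
CN117786417A (zh) * 2024-02-28 2024-03-29 Zhejiang Lab Model training method, transient source identification method, apparatus, and electronic device
CN118366011A (zh) * 2024-06-19 2024-07-19 Wenzhou Electric Power Construction Co., Ltd. Model training and underground cable duct defect identification method, product, and device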

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
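CN114399769B (zh) * 2022-03-22 2022-08-02 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for text recognition model, text recognition method, and apparatus
CN114863450B (zh) * 2022-05-19 2023-05-16 Beijing Baidu Netcom Science and Technology Co., Ltd. Image processing method, apparatus, electronic device, and storage medium
CN114972910B (zh) * 2022-05-20 2023-05-23 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for image-text recognition model, apparatus, electronic device, and storage medium
CN115661829A (zh) * 2022-10-26 2023-01-31 Alibaba (China) Co., Ltd. Image-text recognition method and data processing method for image-text recognition model
WO2024108472A1 (zh) * 2022-11-24 2024-05-30 Beijing BOE Technology Development Co., Ltd. Model training method and apparatus, text image processing method, device, and medium
CN116012650B (zh) * 2023-01-03 2024-04-23 Beijing Baidu Netcom Science and Technology Co., Ltd. Character recognition model training and recognition method, apparatus, device, and medium
CN116189198B (zh) * 2023-01-06 2024-06-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Text recognition model training method, text recognition method, apparatus, and storage medium
CN116229480B (zh) * 2023-01-10 2024-05-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Text recognition model training method, text recognition method, apparatus, and storage medium
CN116363663A (zh) * 2023-04-03 2023-06-30 Beijing Baidu Netcom Science and Technology Co., Ltd. Image processing method, image recognition method, and apparatus
CN116884003B (zh) * 2023-07-18 2024-03-22 Nanjing Lingxing Technology Co., Ltd. Automatic picture annotation method, apparatus, electronic device, and storage medium
CN117115306B (zh) * 2023-08-30 2024-07-12 Suzhou Changxing Zhijia Automotive Technology Co., Ltd. Image generation method, apparatus, electronic device, and storage medium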

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549893B (zh) * 2018-04-04 2020-03-31 Huazhong University of Science and Technology End-to-end recognition method for arbitrarily shaped scene text
US11093560B2 (en) * 2018-09-21 2021-08-17 Microsoft Technology Licensing, Llc Stacked cross-modal matching
EP3754549B1 (en) * 2019-06-17 2021-09-22 Sap Se A computer vision method for recognizing an object category in a digital image
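CN111461203A (zh) * 2020-03-30 2020-07-28 Beijing Baidu Netcom Science and Technology Co., Ltd. Cross-modal processing method, apparatus, electronic device, and computer storage medium
CN112016543B (zh) * 2020-07-24 2024-09-20 Huawei Technologies Co., Ltd. Text recognition network, neural network training method, and related device
CN113792113A (zh) * 2020-07-31 2021-12-14 Beijing Jingdong Shangke Information Technology Co., Ltd. Visual language model acquisition and task processing method, apparatus, device, and medium
CN113537186A (zh) * 2020-12-04 2021-10-22 Tencent Technology (Shenzhen) Co., Ltd. Text image recognition method, apparatus, electronic device, and storage medium
CN112541501B (zh) * 2020-12-18 2021-09-07 Beijing Zhongke Research Institute Scene text recognition method based on visual language modeling network
CN112801085A (zh) * 2021-02-09 2021-05-14 Shenyang Linlong Technology Co., Ltd. Method, apparatus, medium, and electronic device for recognizing characters in an image
CN112883953B (zh) * 2021-02-22 2022-10-28 Industrial and Commercial Bank of China Ltd. Card recognition apparatus and method based on joint learning
CN113378833B (zh) * 2021-06-25 2023-09-01 Beijing Baidu Netcom Science and Technology Co., Ltd. Image recognition model training method, image recognition method, apparatus, and electronic device
CN113435529B (zh) * 2021-07-06 2023-11-07 Beijing Baidu Netcom Science and Technology Co., Ltd. Model pre-training method, model training method, and image processing method
CN113657390B (zh) * 2021-08-13 2022-08-12 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for text detection model, text detection method, apparatus, and device
CN113657399B (zh) * 2021-08-18 2022-09-27 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for character recognition model, character recognition method, and apparatus
CN114120305B (zh) * 2021-11-26 2023-07-07 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for text classification model, text content recognition method, and apparatus
CN114155543B (zh) * 2021-12-08 2022-11-29 Beijing Baidu Netcom Science and Technology Co., Ltd. Neural network training method, document image understanding method, apparatus, and device
CN114399769B (zh) * 2022-03-22 2022-08-02 Beijing Baidu Netcom Science and Technology Co., Ltd. Training method for text recognition model, text recognition method, and apparatus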

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
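WO2024063437A1 (ko) * 2022-09-22 2024-03-28 Coupang Corp. Method and device for managing artificial intelligence model
CN117786417A (zh) * 2024-02-28 2024-03-29 Zhejiang Lab Model training method, transient source identification method, apparatus, and electronic device
CN117786417B (zh) * 2024-02-28 2024-05-10 Zhejiang Lab Model training method, transient source identification method, apparatus, and electronic device
CN118366011A (zh) * 2024-06-19 2024-07-19 Wenzhou Electric Power Construction Co., Ltd. Model training and underground cable duct defect identification method, product, and device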

Also Published As

Publication number Publication date
JP2022177242A (ja) 2022-11-30
CN115035538B (zh) 2023-04-07
CN114399769B (zh) 2022-08-02
CN114399769A (zh) 2022-04-26
CN115035538A (zh) 2022-09-09

Similar Documents

Publication Publication Date Title
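KR20220122566A (ko) Method for training text recognition model, text recognition method, and apparatus
JP7406606B2 (ja) Training method for text recognition model, text recognition method, and apparatus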
US11610384B2 (en) Zero-shot object detection
US20240265718A1 (en) Method of training text detection model, method of detecting text, and device
EP3913542A2 (en) Method and apparatus of training model, device, medium, and program product
WO2022227769A1 (zh) Training method for lane line detection model, apparatus, electronic device, and storage medium
US20220415072A1 (en) Image processing method, text recognition method and apparatus
US20230009547A1 (en) Method and apparatus for detecting object based on video, electronic device and storage medium
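CN111291882A (zh) Model conversion method, apparatus, device, and computer storage medium
CN116152833B (zh) Training method for image-based table restoration model and table restoration method
CN114863437B (zh) Text recognition method, apparatus, electronic device, and storage medium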
EP4191544A1 (en) Method and apparatus for recognizing token, electronic device and storage medium
US20230245429A1 (en) Method and apparatus for training lane line detection model, electronic device and storage medium
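CN117830580A (zh) Image generation method, apparatus, electronic device, and storage medium
CN112507705B (zh) Position encoding generation method, apparatus, and electronic device
CN112948584A (zh) Short text classification method, apparatus, device, and storage medium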
US20230027813A1 (en) Object detecting method, electronic device and storage medium
CN114863450B (zh) Image processing method, apparatus, electronic device, and storage medium
US20220188163A1 (en) Method for processing data, electronic device and storage medium
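CN113343979B (zh) Method, apparatus, device, medium, and program product for training a model
CN116434000A (zh) Model training and item classification method, apparatus, storage medium, and electronic device
CN115565186A (zh) Training method for character recognition model, apparatus, electronic device, and storage medium
CN113139463B (zh) Method, apparatus, device, medium, and program product for training a model
CN114973333A (zh) Human-object interaction detection method, apparatus, device, and storage medium
CN114707017A (zh) Visual question answering method, apparatus, electronic device, and storage medium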