CN113255330B - A Chinese spelling check method based on character feature classifier and soft output - Google Patents
- Publication number
- CN113255330B
- Authority
- CN
- China
- Prior art keywords
- character
- characters
- probability
- soft output
- pronunciation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
Abstract
Description
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110599111.9A CN113255330B (zh) | 2021-05-31 | 2021-05-31 | A Chinese spelling check method based on character feature classifier and soft output |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110599111.9A CN113255330B (zh) | 2021-05-31 | 2021-05-31 | A Chinese spelling check method based on character feature classifier and soft output |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113255330A (zh) | 2021-08-13 |
CN113255330B (zh) | 2021-09-24 |
Family
ID=77183823
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110599111.9A Active CN113255330B (zh) | A Chinese spelling check method based on character feature classifier and soft output | 2021-05-31 | 2021-05-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113255330B (zh) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
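CN111626049A (zh) * | 2020-05-27 | 2020-09-04 | Tencent Technology (Shenzhen) Co., Ltd. | Title correction method and apparatus for multimedia information, electronic device, and storage medium |
CN112597753A (zh) * | 2020-12-22 | 2021-04-02 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Text error correction processing method and apparatus, electronic device, and storage medium |
CN112784582A (zh) * | 2021-02-09 | 2021-05-11 | Industrial and Commercial Bank of China Ltd. | Error correction method, apparatus, and computing device |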
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8077983B2 (en) * | 2007-10-04 | 2011-12-13 | Zi Corporation Of Canada, Inc. | Systems and methods for character correction in communication devices |
Also Published As
Publication number | Publication date |
---|---|
CN113255330A (zh) | 2021-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
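CN110717031B (zh) | | Intelligent meeting minutes generation method and system |
CN109918666B (zh) | | Neural-network-based method for adding Chinese punctuation marks |
CN111209401A (zh) | | System and method for sentiment polarity classification of online public opinion text |
CN105404621B (zh) | | Method and system for blind people to read Chinese characters |
CN110232439B (zh) | | Intent recognition method based on a deep learning network |
CN111709242B (zh) | | Method for adding Chinese punctuation marks based on named entity recognition |
CN106847288A (zh) | | Error correction method and apparatus for speech recognition text |
US20070219777A1 | | Identifying language origin of words |
CN112199945A (zh) | | Text error correction method and apparatus |
CN103035241A (zh) | | System and method for Chinese prosodic break recognition with complementary models |
CN112990353B (zh) | | Method for constructing a Chinese character confusion set based on a multimodal model |
CN109992775A (zh) | | Text summary generation method based on high-level semantics |
CN112905736B (zh) | | Unsupervised text sentiment analysis method based on quantum theory |
CN110276069A (zh) | | Method, system, and storage medium for automatic detection of Chinese Braille errors |
CN113268576B (zh) | | Deep-learning-based method and apparatus for extracting department semantic information |
CN110222338B (zh) | | Organization name entity recognition method |
CN112818698A (zh) | | Fine-grained sentiment analysis method for user reviews based on a dual-channel model |
CN114153971A (zh) | | Device for error correction, recognition, and classification of erroneous Chinese text |
CN111339772B (zh) | | Russian text sentiment analysis method, electronic device, and storage medium |
CN114153973A (zh) | | Mongolian multimodal sentiment analysis method based on the T-M BERT pre-trained model |
CN114757184B (zh) | | Method and system for knowledge question answering in the aviation domain |
CN115759119A (zh) | | Financial text sentiment analysis method, system, medium, and device |
CN115238693A (zh) | | Chinese named entity recognition method based on multiple word segmentation and multi-layer bidirectional long short-term memory |
CN111241820A (zh) | | Inappropriate language recognition method and apparatus, electronic apparatus, and storage medium |
CN113255330B (zh) | | A Chinese spelling check method based on character feature classifier and soft output |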
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2022-07-14; Address after: Room 301ab, No. 10, Lane 198, zhangheng Road, China (Shanghai) pilot Free Trade Zone, Shanghai, 201203; Patentee after: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.; Address before: Yuelu District City, Hunan province 410000 Changsha Lushan Road No. 932; Patentee before: CENTRAL SOUTH University |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A Chinese spelling check method based on character feature classifier and soft output; Effective date of registration: 2023-02-15; Granted publication date: 2021-09-24; Pledgee: Shanghai Rural Commercial Bank Co.,Ltd. Pudong branch; Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.; Registration number: Y2023310000031 |
CP03 | Change of name, title or address | Address after: Room 301ab, No.10, Lane 198, zhangheng Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai 201204; Patentee after: Shanghai Mido Technology Co.,Ltd.; Address before: Room 301ab, No. 10, Lane 198, zhangheng Road, China (Shanghai) pilot Free Trade Zone, Shanghai, 201203; Patentee before: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd. |
PC01 | Cancellation of the registration of the contract for pledge of patent right | Granted publication date: 2021-09-24; Pledgee: Shanghai Rural Commercial Bank Co.,Ltd. Pudong branch; Pledgor: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.; Registration number: Y2023310000031 |