CN107229611A - A word segmentation method for historical classics based on word alignment - Google Patents
A word segmentation method for historical classics based on word alignment
- Publication number
- CN107229611A
- Authority
- CN
- China
- Prior art keywords
- word
- chinese
- alignment
- ancient
- records
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710351463.6A CN107229611B (zh) | 2017-05-18 | 2017-05-18 | A word segmentation method for historical classics based on word alignment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710351463.6A CN107229611B (zh) | 2017-05-18 | 2017-05-18 | A word segmentation method for historical classics based on word alignment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107229611A (zh) | 2017-10-03 |
CN107229611B (zh) | 2020-06-30 |
Family
ID=59934537
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710351463.6A Expired - Fee Related CN107229611B (zh) | A word segmentation method for historical classics based on word alignment | 2017-05-18 | 2017-05-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107229611B (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
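CN109684648A (zh) * | 2019-01-14 | 2019-04-26 | Zhejiang University | A multi-feature fusion method for automatic translation between ancient and modern Chinese |
CN109829159A (zh) * | 2019-01-29 | 2019-05-31 | Nanjing Normal University | An integrated automatic lexical analysis method and system for ancient Chinese text |
CN116070643A (zh) * | 2023-04-03 | 2023-05-05 | Wuchang University of Technology | A fixed-style translation method and system from ancient Chinese to English |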
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1335301A2 (en) * | 2002-02-07 | 2003-08-13 | Matsushita Electric Industrial Co., Ltd. | Context-aware linear time tokenizer |
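CN1567297A (zh) * | 2003-07-03 | 2005-01-19 | Institute of Acoustics, Chinese Academy of Sciences | A method for automatically extracting multi-word translation equivalent units from a bilingual corpus |
US20090089047A1 (en) * | 2007-08-31 | 2009-04-02 | Powerset, Inc. | Natural Language Hypernym Weighting For Word Sense Disambiguation |
CN102693222A (zh) * | 2012-05-25 | 2012-09-26 | Xiong Jing (熊晶) | Example-based machine translation method for oracle bone inscription interpretations |
CN105446962A (zh) * | 2015-12-30 | 2016-03-30 | Wuhan Transn Information Technology Co., Ltd. | Method and device for aligning source text and translation |
CN106649289A (zh) * | 2016-12-16 | 2017-05-10 | Institute of Automation, Chinese Academy of Sciences | Method and system for simultaneously identifying bilingual terms and word alignment |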
Non-Patent Citations (1)
Title |
---|
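Li Xiuying (李秀英): "Research on term alignment based on a bilingual parallel corpus of historical classics", China Doctoral Dissertations Full-text Database * |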
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
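CN109684648A (zh) * | 2019-01-14 | 2019-04-26 | Zhejiang University | A multi-feature fusion method for automatic translation between ancient and modern Chinese |
CN109829159A (zh) * | 2019-01-29 | 2019-05-31 | Nanjing Normal University | An integrated automatic lexical analysis method and system for ancient Chinese text |
CN109829159B (zh) * | 2019-01-29 | 2020-02-18 | Nanjing Normal University | An integrated automatic lexical analysis method and system for ancient Chinese text |
CN116070643A (zh) * | 2023-04-03 | 2023-05-05 | Wuchang University of Technology | A fixed-style translation method and system from ancient Chinese to English |
CN116070643B (zh) * | 2023-04-03 | 2023-08-15 | Wuchang University of Technology | A fixed-style translation method and system from ancient Chinese to English |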
Also Published As
Publication number | Publication date |
---|---|
CN107229611B (zh) | 2020-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
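CN102033879B (zh) | A method and device for Chinese personal name recognition | |
US9069753B2 (en) | Determining proximity measurements indicating respective intended inputs | |
US8670975B2 (en) | Adaptive pattern learning for bilingual data mining | |
CN108268668B (zh) | A topic-diversity-based method for mining opinion summaries from text data | |
CN107729321A (zh) | A method for correcting errors in speech recognition results | |
CN106096664B (zh) | A sentiment analysis method based on social network data | |
CN108874771A (zh) | An information extraction method for bidding documents | |
Huang et al. | Automatic extraction of named entity translingual equivalence based on multi-feature cost minimization | |
CN105868176A (zh) | Text-based video synthesis method and system | |
CN1910573A (zh) | System for identifying and classifying named entities | |
CN110276071A (zh) | A text matching method, apparatus, computer device, and storage medium | |
CN110046351A (zh) | Rule-driven, feature-based text relation extraction method | |
CN107818082B (zh) | Semantic role recognition method incorporating phrase structure trees | |
CN107229611A (zh) | A word segmentation method for historical classics based on word alignment | |
Liu et al. | Phrasal substitution of idiomatic expressions | |
CN103049458A (zh) | A method and system for correcting a user lexicon | |
JP2020098594A (ja) | Information processing method, natural language processing method, and information processing apparatus | |
CN104050255A (zh) | Error correction method and system based on a joint graph model | |
CN106156013A (zh) | A two-stage machine translation method that prioritizes fixed-collocation phrases | |
Pinter et al. | Will it Unblend? | |
CN107861937B (zh) | Update method, update device, and recording medium for a parallel translation corpus | |
WO2014189400A1 (en) | A method for diacritisation of texts written in latin- or cyrillic-derived alphabets | |
CN105975487B (zh) | A method for judging the relevance of app user reviews | |
CN106484660A (zh) | Title processing method and device | |
CN109657244A (zh) | A method and system for automatic segmentation of long English sentences |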
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CB03 | Change of inventor or designer information | Inventor after: Che Chao; Inventor after: Wu Xiaoting; Inventor before: Che Chao; Inventor before: Wu Xiaoting |
TR01 | Transfer of patent right | Effective date of registration: 2023-03-15; Address after: No. 17, Huixian Street, Qixianling, Lingshui Town, Ganjingzi District, Dalian City, Liaoning Province, 116024; Patentee after: DALIAN TONGDIAN TECHNOLOGY CO.,LTD.; Address before: No.10 Xuefu street, Dalian Development Zone, Liaoning Province, 116622; Patentee before: DALIAN University |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2020-06-30 |