CN106066870A - A Bilingual Parallel Corpus Construction System Based on Context Annotation
- Publication number
- CN106066870A (application CN201610368937.3A)
- Authority
- CN
- China
- Prior art keywords
- context
- word
- corpus
- annotation
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/2228: Indexing structures
- G06F16/21: Design, administration or maintenance of databases
- G06F16/2462: Approximate or statistical queries
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610368937.3A CN106066870B (zh) | 2016-05-27 | 2016-05-27 | A Bilingual Parallel Corpus Construction System Based on Context Annotation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610368937.3A CN106066870B (zh) | 2016-05-27 | 2016-05-27 | A Bilingual Parallel Corpus Construction System Based on Context Annotation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106066870A (zh) | 2016-11-02 |
CN106066870B (zh) | 2019-03-15 |
Family
ID=57421012
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610368937.3A Active CN106066870B (zh) | A Bilingual Parallel Corpus Construction System Based on Context Annotation | 2016-05-27 | 2016-05-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106066870B (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
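CN109683773A (zh) * | 2017-10-19 | 2019-04-26 | Beijing Gridsum Technology Co., Ltd. | Corpus annotation method and device |
CN110046261A (zh) * | 2019-04-22 | 2019-07-23 | Shandong Jianzhu University | A method for constructing a multimodal bilingual parallel corpus for construction engineering |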
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
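CN101908042A (zh) * | 2010-08-09 | 2010-12-08 | Institute of Automation, Chinese Academy of Sciences | A bilingual joint semantic role labeling method |
CN102591862A (zh) * | 2011-01-05 | 2012-07-18 | East China Normal University | Control method and device for Chinese entity relation extraction based on word co-occurrence |
US20150019951A1 (en) * | 2012-01-05 | 2015-01-15 | Tencent Technology (Shenzhen) Company Limited | Method, apparatus, and computer storage medium for automatically adding tags to document |
CN104699766A (zh) * | 2015-02-15 | 2015-06-10 | Zhejiang Sci-Tech University | An implicit attribute mining method combining word association relations and contextual inference |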
Non-Patent Citations (2)
Title |
---|
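Cheng Xingguo et al.: "MapReduce-based parallel generation method for word-class co-occurrence probabilities", Journal of Chongqing University of Technology (Natural Science) * |
Yuan Xinhua: "Co-occurrence forms and calculation methods of English lexical collocations based on corpora", Science & Technology Information * |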
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
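CN109683773A (zh) * | 2017-10-19 | 2019-04-26 | Beijing Gridsum Technology Co., Ltd. | Corpus annotation method and device |
CN109683773B (zh) * | 2017-10-19 | 2021-01-22 | Beijing Gridsum Technology Co., Ltd. | Corpus annotation method and device |
CN110046261A (zh) * | 2019-04-22 | 2019-07-23 | Shandong Jianzhu University | A method for constructing a multimodal bilingual parallel corpus for construction engineering |
CN110046261B (zh) * | 2019-04-22 | 2022-01-21 | Shandong Jianzhu University | A method for constructing a multimodal bilingual parallel corpus for construction engineering |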
Also Published As
Publication number | Publication date |
---|---|
CN106066870B (zh) | 2019-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
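CN105718586B (zh) | | Word segmentation method and device |
CN105468900B (zh) | | An intelligent medical record entry platform based on a knowledge base |
CN106776562A (zh) | | A keyword extraction method and extraction system |
CN111460787A (zh) | | A topic extraction method, device, terminal equipment, and storage medium |
CN101937430B (zh) | | A method for extracting event sentence patterns from Chinese sentences |
CN104504001B (zh) | | Cursor construction method for massive distributed relational databases |
CN1661593B (zh) | | A computer language translation method and translation system |
CN105608218A (zh) | | Method, device, and system for building an intelligent question-answering knowledge base |
CN103116578A (zh) | | A translation method and device combining syntax trees and statistical machine translation |
CN102214166A (zh) | | Machine translation system and method based on syntactic analysis and a hierarchical model |
CN104361127A (zh) | | Method for rapidly constructing a multilingual question-answering interface based on domain ontology and template logic |
JP2022522020A (ja) | | Semantic image search |
CN110263154A (zh) | | A method, system, and storage medium for quantifying the sentiment trend of online public opinion |
CN101751430A (zh) | | Fuzzy search method for electronic dictionaries |
CN101894160B (zh) | | An intelligent retrieval method |
CN105630770A (zh) | | A method and device for word segmentation, phonetic annotation, and joined writing based on SC grammar |
CN109670190A (zh) | | Translation model construction method and device |
CN101739395A (zh) | | Machine translation method and system |
Shiwen et al. | | Rule-based machine translation |
CN107291858A (zh) | | A data indexing method based on string suffixes |
CN101464856A (zh) | | Method and device for aligning parallel spoken-language corpora |
CN110390099B (zh) | | A template-library-based object relation extraction system and extraction method |
CN104317882A (zh) | | A decision-level fusion method for Chinese word segmentation |
CN103927179A (zh) | | A WordNet-based program readability analysis method |
CN106066870A (zh) | | A Bilingual Parallel Corpus Construction System Based on Context Annotation |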
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP02 | Change in the address of a patent holder | Address after: No. 219, Ningliu Road, Jiangbei New District, Nanjing, Jiangsu Province, 210000; Patentee after: Nanjing University of Information Science and Technology; Address before: 69 Olympic Sports Street, Jianye District, Nanjing, Jiangsu, 210000; Patentee before: Nanjing University of Information Science and Technology |
| | TR01 | Transfer of patent right | Effective date of registration: 2021-11-24; Address after: Room 502, building 1, No. a, Beibinhe Road, Guang'anmenwai, Xicheng District, Beijing 100032; Patentee after: Jiaguyi (Beijing) Language Technology Co., Ltd.; Address before: No. 219, Ningliu Road, Jiangbei New District, Nanjing, Jiangsu Province, 210000; Patentee before: Nanjing University of Information Science and Technology |
| | CP02 | Change in the address of a patent holder | Address after: 12-113, No. 2, CAIDA Second Street, Nancai Town, Shunyi District, Beijing 101399; Patentee after: Jiaguyi (Beijing) Language Technology Co., Ltd.; Address before: Room 502, building 1, No. a, Beibinhe Road, Guang'anmenwai, Xicheng District, Beijing 100032; Patentee before: Jiaguyi (Beijing) Language Technology Co., Ltd. |
| | PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: A Bilingual Parallel Corpus Construction System Based on Context Annotation; Effective date of registration: 2023-09-21; Granted publication date: 2019-03-15; Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee; Pledgor: Jiaguyi (Beijing) Language Technology Co., Ltd.; Registration number: Y2023990000471 |