TWI452475B - A dictionary generating device, a dictionary generating method, a dictionary generating program product, and a computer readable memory medium storing the program

Info

Publication number
TWI452475B
Authority
TW
Taiwan
Prior art keywords
word
dictionary
unit
information
text
Application number
TW101133547A
Other languages
English (en)
Chinese (zh)
Other versions
TW201335776A (zh)
Inventor
Masato Hagiwara
Original Assignee
Rakuten Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2012-02-28
Filing date
2012-09-13
Publication date
2014-09-11
Application filed by Rakuten Inc
Publication of TW201335776A
Application granted
Publication of TWI452475B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
TW101133547A 2012-02-28 2012-09-13 A dictionary generating device, a dictionary generating method, a dictionary generating program product, and a computer readable memory medium storing the program TWI452475B (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US201261604266P 2012-02-28 2012-02-28

Publications (2)

Publication Number Publication Date
TW201335776A (zh) 2013-09-01
TWI452475B (zh) 2014-09-11

Family

ID=49081915

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
TW101133547A TWI452475B (zh) 2012-02-28 2012-09-13 A dictionary generating device, a dictionary generating method, a dictionary generating program product, and a computer readable memory medium storing the program

Country Status (5)

Country Link
JP (1) JP5373998B1 (ja)
KR (1) KR101379128B1 (ko)
CN (1) CN103608805B (zh)
TW (1) TWI452475B (zh)
WO (1) WO2013128684A1 (ja)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701133B (zh) * 2014-11-28 2021-03-30 方正国际软件(北京)有限公司 Method and device for address input
JP6813776B2 (ja) * 2016-10-27 2021-01-13 キヤノンマーケティングジャパン株式会社 Information processing device, control method therefor, and program
JP6707483B2 (ja) * 2017-03-09 2020-06-10 株式会社東芝 Information processing device, information processing method, and information processing program
EP3446241A4 (en) 2017-06-20 2019-11-06 Accenture Global Solutions Limited AUTOMATIC EXTRACTION OF A LEARNING CORPUS FOR A DATA CLASSIFIER BASED ON AUTOMATIC LEARNING ALGORITHMS
JP2019049873A (ja) * 2017-09-11 2019-03-28 株式会社Screenホールディングス Synonym dictionary creation device, synonym dictionary creation program, and synonym dictionary creation method
CN109033183B (zh) * 2018-06-27 2021-06-25 清远墨墨教育科技有限公司 Parsing method for an editable cloud lexicon

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09288673A (ja) * 1996-04-23 1997-11-04 Nippon Telegr & Teleph Corp <Ntt> Japanese morphological analysis method and device, and method and device for collecting words not registered in a dictionary
JP2002351870A (ja) * 2001-05-29 2002-12-06 Communication Research Laboratory Morpheme analysis method
TW200729001A (en) * 2005-01-31 2007-08-01 Nec China Co Ltd Dictionary learning method and device using the same, input method and user terminal device using the same
JP2008257511A (ja) * 2007-04-05 2008-10-23 Yahoo Japan Corp Technical term extraction device, method, and program

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1086821C (zh) * 1998-08-13 2002-06-26 英业达股份有限公司 Method and system for segmenting Chinese sentences

Also Published As

Publication number Publication date
JP5373998B1 (ja) 2013-12-18
WO2013128684A1 (ja) 2013-09-06
KR101379128B1 (ko) 2014-03-27
CN103608805A (zh) 2014-02-26
KR20130137048A (ko) 2013-12-13
CN103608805B (zh) 2016-09-07
JPWO2013128684A1 (ja) 2015-07-30
TW201335776A (zh) 2013-09-01

Similar Documents

Publication Publication Date Title
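KR102431549B1 (ko) Causal relationship recognition device and computer program therefor
CN108363790B (zh) Method, apparatus, device, and storage medium for evaluating comments
CN108287858B (zh) Semantic extraction method and apparatus for natural language
CN111444320B (zh) Text retrieval method, apparatus, computer device, and storage medium
CN109416705B (zh) Utilizing information available in a corpus for data parsing and prediction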
TWI452475B (zh) A dictionary generating device, a dictionary generating method, a dictionary generating program product, and a computer readable memory medium storing the program
US20160189057A1 (en) Computer implemented system and method for categorizing data
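CN110851590A (zh) Method for text classification through sensitive word detection and illegal content recognition
CN109614620B (zh) HowNet-based graph model word sense disambiguation method and system
JP5809381B1 (ja) Natural language processing system, natural language processing method, and natural language processing program
CN101308512B (zh) Web page-based method and apparatus for extracting mutual translation pairs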
US20200311345A1 (en) System and method for language-independent contextual embedding
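JP6186198B2 (ja) Learning model creation device, translation device, learning model creation method, and program
CN113076748A (zh) Method, apparatus, device, and storage medium for processing sensitive words in bullet-screen comments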
US20140358522A1 (en) Information search apparatus and information search method
JP2011238159A (ja) Computer system
US20150019382A1 (en) Corpus creation device, corpus creation method and corpus creation program
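JP5169456B2 (ja) Document search system, document search method, and document search program
JP6689466B1 (ja) Sentence structure vectorization device, sentence structure vectorization method, and sentence structure vectorization program
CN113420127A (zh) Threat intelligence processing method, apparatus, computing device, and storage medium
CN115495636A (zh) Web page search method, apparatus, and storage medium
JP4088171B2 (ja) Text analysis device, method, program, and recording medium on which the program is recorded
CN112364666A (zh) Text representation method, apparatus, and computer device
JP5506482B2 (ja) Named entity extraction device, string-named entity class pair database creation device, named entity extraction method, string-named entity class pair database creation method, and program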
Chaonithi et al. A hybrid approach for Thai word segmentation with crowdsourcing feedback system