JP2010537286A5 - Google Patents

Publication number
JP2010537286A5
Authority
JP
Japan
Prior art keywords
word
topic
candidate
corpus
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2010521289A
Other languages
English (en)
Japanese (ja)
Other versions
JP2010537286A (ja)
JP5379138B2 (ja)
Filing date
2008-08-25
Publication date
2011-10-13
Priority claimed from US11/844,067 (US7983902B2)
Priority claimed from US11/844,153 (US7917355B2)
Application filed
Priority claimed from PCT/CN2008/072128 (WO2009026850A1)
Publication of JP2010537286A
Publication of JP2010537286A5
Application granted
Publication of JP5379138B2
Current legal status: Active
Anticipated expiration

JP2010521289A 2007-08-23 2008-08-25 Domain dictionary creation Active JP5379138B2 (ja)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US11/844,067 US7983902B2 (en) 2007-08-23 2007-08-23 Domain dictionary creation by detection of new topic words using divergence value comparison
US11/844,153 US7917355B2 (en) 2007-08-23 2007-08-23 Word detection
PCT/CN2008/072128 WO2009026850A1 (en) 2007-08-23 2008-08-25 Domain dictionary creation

Publications (3)

Publication Number Publication Date
JP2010537286A JP2010537286A (ja) 2010-12-02
JP2010537286A5 JP2010537286A5 (ja) 2011-10-13
JP5379138B2 JP5379138B2 (ja) 2013-12-25

Family

ID=40386710

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2010521289A Active JP5379138B2 (ja) 2007-08-23 2008-08-25 Domain dictionary creation

Country Status (3)

Country Link
JP (1) JP5379138B2 (ja)
CN (1) CN101836205A (zh)
WO (1) WO2009026850A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011120211A1 (en) 2010-03-29 2011-10-06 Nokia Corporation Method and apparatus for seeded user interest modeling
CN102236639B (zh) * 2010-04-28 2016-08-10 三星电子株式会社 System and method for updating a language model
CN102411563B (zh) * 2010-09-26 2015-06-17 阿里巴巴集团控股有限公司 Method, device, and system for identifying target words
US9069798B2 (en) * 2012-05-24 2015-06-30 Mitsubishi Electric Research Laboratories, Inc. Method of text classification using discriminative topic transformation
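CN104239285A (zh) * 2013-06-06 2014-12-24 腾讯科技(深圳)有限公司 Method and device for detecting new chapters of an article
CN108170294B (zh) * 2013-08-08 2021-04-16 阿里巴巴集团控股有限公司 Vocabulary display and field conversion methods, client, electronic device, and computer storage medium
CN103970730A (zh) * 2014-04-29 2014-08-06 河海大学 Method for extracting multiple topic words from a single Chinese text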
WO2016172288A1 (en) * 2015-04-21 2016-10-27 Lexisnexis, A Division Of Reed Elsevier Inc. Systems and methods for generating concepts from a document corpus
US20170229124A1 (en) * 2016-02-05 2017-08-10 Google Inc. Re-recognizing speech with external data sources
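CN105956359B (zh) * 2016-04-15 2018-06-05 陈杰 Method for cross-referencing and translating drug item names for heterogeneous systems
CN106682128A (zh) * 2016-12-13 2017-05-17 成都数联铭品科技有限公司 Method for automatically constructing multi-domain dictionaries
CN107704102B (zh) * 2017-10-09 2021-08-03 北京新美互通科技有限公司 Text input method and device
CN113780007A (zh) * 2021-10-22 2021-12-10 平安科技(深圳)有限公司 Corpus screening method, intent recognition model optimization method, device, and storage medium
CN115858787B (zh) * 2022-12-12 2023-08-01 交通运输部公路科学研究所 Method for extracting and mining hot topics based on problem and appeal information in highway transportation
CN116911321B (zh) * 2023-06-21 2024-05-14 三峡高科信息技术有限责任公司 Method and component for automatically translating dictionary values at the front end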

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2883153B2 (ja) * 1990-04-02 1999-04-19 株式会社リコー Keyword extraction device
US6167368A (en) * 1998-08-14 2000-12-26 The Trustees Of Columbia University In The City Of New York Method and system for indentifying significant topics of a document
US6651058B1 (en) * 1999-11-15 2003-11-18 International Business Machines Corporation System and method of automatic discovery of terms in a document that are relevant to a given target topic
GB2399427A (en) * 2003-03-12 2004-09-15 Canon Kk Apparatus for and method of summarising text
JP4254623B2 (ja) * 2004-06-09 2009-04-15 日本電気株式会社 Topic analysis method, apparatus therefor, and program
JP5259919B2 (ja) * 2005-07-21 2013-08-07 ダイキン工業株式会社 Axial fan
US7813919B2 (en) * 2005-12-20 2010-10-12 Xerox Corporation Class description generation for clustering and categorization

Similar Documents

Publication Publication Date Title
JP2010537286A5 (ja)
CN109564575B (zh) Classifying images using machine learning models
CN110704621B (zh) Text processing method and apparatus, storage medium, and electronic device
US9471644B2 (en) Method and system for scoring texts
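CN107562824B (zh) Text similarity detection method
CN111079412A (zh) Text error correction method and device
JP5379138B2 (ja) Domain dictionary creation
CN107608953B (zh) Word vector generation method based on variable-length context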
WO2012148950A2 (en) Representing information from documents
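CN109492217B (zh) Machine-learning-based word segmentation method and terminal device
CN111859932B (zh) Text summary generation method and apparatus, electronic device, and storage medium
CN104536979B (zh) Method and apparatus for generating a topic model, and method and apparatus for obtaining a topic distribution
JP5809381B1 (ja) Natural language processing system, natural language processing method, and natural language processing program
CN109165529B (zh) Hidden-link tampering detection method and apparatus, and computer-readable storage medium
CN113887930B (zh) Method, apparatus, device, and storage medium for evaluating the health of a question-answering robot
CN109993216B (zh) Text classification method based on k-nearest neighbors (KNN) and device thereof
CN113011164A (zh) Data quality detection method and apparatus, electronic device, and medium
TWI465949B (zh) Data clustering apparatus and method
CN110287302B (zh) Method and system for determining the confidence of open-source information in the defense science and technology field
CN107622129B (zh) Method and apparatus for organizing a knowledge base, and computer storage medium
CN108154382B (zh) Evaluation device, evaluation method, and storage medium
CN110717029A (zh) Information processing method and system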
AU2021312671B2 (en) Value over replacement feature (VORF) based determination of feature importance in machine learning
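CN111914536B (zh) Opinion analysis method, apparatus, device, and storage medium
CN111125350B (zh) Method and apparatus for generating an LDA topic model based on a bilingual parallel corpus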