JP7539201B2 - Rare topic detection using hierarchical clustering - Google Patents

Rare topic detection using hierarchical clustering

Info

Publication number
JP7539201B2
Authority
JP
Japan
Prior art keywords
cluster
clusters
hierarchical
topic model
hierarchical topic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2022520298A
Other languages
English (en)
Japanese (ja)
Other versions
JP2022552140A5
JP2022552140A (ja)
Inventor
Ganti, Raghu Kiran
Srivatsa, Mudhakar
Srirangamsridharan, Shreeranjani
Lim, Yeon-sup
Agrawal, Dakshi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of JP2022552140A
Publication of JP2022552140A5
Application granted
Publication of JP7539201B2
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)
JP2022520298A 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering Active JP7539201B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/596,399 2019-10-08
US16/596,399 US12259919B2 (en) 2019-10-08 2019-10-08 Rare topic detection using hierarchical clustering
PCT/IB2020/059112 WO2021070005A1 (en) 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering

Publications (3)

Publication Number Publication Date
JP2022552140A (ja) 2022-12-15
JP2022552140A5 2022-12-22
JP7539201B2 (ja) 2024-08-23

Family

ID=75273583

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2022520298A Active JP7539201B2 (ja) 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering

Country Status (7)

Country Link
US (1) US12259919B2
JP (1) JP7539201B2
KR (1) KR102862150B1
CN (1) CN114424197B
AU (1) AU2020364386B2
GB (1) GB2604276A
WO (1) WO2021070005A1

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12259919B2 (en) 2019-10-08 2025-03-25 International Business Machines Corporation Rare topic detection using hierarchical clustering
US11354345B2 (en) * 2020-06-22 2022-06-07 Jpmorgan Chase Bank, N.A. Clustering topics for data visualization
US20230050622A1 (en) * 2021-08-11 2023-02-16 Yanran Wei Evolution of topics in a messaging system
US11941038B2 (en) 2022-05-19 2024-03-26 International Business Machines Corporation Transparent and controllable topic modeling
US12505144B2 (en) 2022-09-21 2025-12-23 International Business Machines Corporation Caching of text analytics based on topic demand and memory constraints
WO2024173841A1 (en) * 2023-02-16 2024-08-22 Jpmorgan Chase Bank, N.A. Systems and methods for seeded neural topic modeling
US20240354375A1 (en) * 2023-04-21 2024-10-24 Gong.Io Ltd. Techniques for aggregating insights of textual data using hierarchical clustering
US12549499B2 (en) 2023-04-24 2026-02-10 Gong.Io Ltd. System and method for generating a chat response on sales deals using a large language model
CN119046457B (zh) * 2024-10-30 2025-03-21 杭州正义先铎网络科技有限公司 Automated content management method, system, and medium based on intelligent text parsing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030212679A1 (en) 2002-05-10 2003-11-13 Sunil Venkayala Multi-category support for apply output
US20080222140A1 (en) 2007-02-20 2008-09-11 Wright State University Comparative web search system and method
US20110270830A1 (en) 2010-04-30 2011-11-03 Palo Alto Research Center Incorporated System And Method For Providing Multi-Core And Multi-Level Topical Organization In Social Indexes
US20130212110A1 (en) 2012-02-09 2013-08-15 Zofia Stankiewicz System and Method for Association Extraction for Surf-Shopping
US20180032606A1 (en) 2016-07-26 2018-02-01 Qualtrics, Llc Recommending topic clusters for unstructured text documents

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3791879B2 (ja) 1999-07-19 2006-06-28 富士通株式会社 Document summarization apparatus and method
US7644102B2 (en) 2001-10-19 2010-01-05 Xerox Corporation Methods, systems, and articles of manufacture for soft hierarchical clustering of co-occurring objects
US7451395B2 (en) 2002-12-16 2008-11-11 Palo Alto Research Center Incorporated Systems and methods for interactive topic-based text summarization
US20070078889A1 (en) 2005-10-04 2007-04-05 Hoskinson Ronald A Method and system for automated knowledge extraction and organization
US7809704B2 (en) * 2006-06-15 2010-10-05 Microsoft Corporation Combining spectral and probabilistic clustering
US7783640B2 (en) * 2006-11-03 2010-08-24 Oracle International Corp. Document summarization
US20100153318A1 (en) * 2008-11-19 2010-06-17 Massachusetts Institute Of Technology Methods and systems for automatically summarizing semantic properties from documents with freeform textual annotations
US8645298B2 (en) 2010-10-26 2014-02-04 Microsoft Corporation Topic models
US9430563B2 (en) 2012-02-02 2016-08-30 Xerox Corporation Document processing employing probabilistic topic modeling of documents represented as text words transformed to a continuous space
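CN103927176B (zh) 2014-04-18 2017-02-22 扬州大学 Method for generating a program feature tree based on a hierarchical topic model
CN103970865B (zh) 2014-05-08 2017-04-19 清华大学 Seed-word-based hierarchical topic discovery method and system for microblog text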
US9959364B2 (en) * 2014-05-22 2018-05-01 Oath Inc. Content recommendations
US20160034757A1 (en) 2014-07-31 2016-02-04 Chegg, Inc. Generating an Academic Topic Graph from Digital Documents
US11989662B2 (en) * 2014-10-10 2024-05-21 San Diego State University Research Foundation Methods and systems for base map and inference mapping
US9575952B2 (en) 2014-10-21 2017-02-21 At&T Intellectual Property I, L.P. Unsupervised topic modeling for short texts
US9697245B1 (en) * 2015-12-30 2017-07-04 International Business Machines Corporation Data-dependent clustering of geospatial words
US10275444B2 (en) * 2016-07-15 2019-04-30 At&T Intellectual Property I, L.P. Data analytics system and methods for text data
US10997509B2 (en) * 2017-02-14 2021-05-04 Cognitive Scale, Inc. Hierarchical topic machine learning operation
CN108808322A (zh) 2017-05-04 2018-11-13 富士康(昆山)电脑接插件有限公司 Electrical connector
CN109544632B (zh) 2018-11-05 2021-08-03 浙江工业大学 Semantic SLAM object association method based on a hierarchical topic model
US12259919B2 (en) 2019-10-08 2025-03-25 International Business Machines Corporation Rare topic detection using hierarchical clustering

Also Published As

Publication number Publication date
AU2020364386A1 (en) 2022-03-24
KR102862150B1 (ko) 2025-09-18
KR20220050915A (ko) 2022-04-25
CN114424197B (zh) 2025-05-13
WO2021070005A1 (en) 2021-04-15
US12259919B2 (en) 2025-03-25
AU2020364386B2 (en) 2024-01-04
GB2604276A (en) 2022-08-31
GB202206094D0 (en) 2022-06-08
CN114424197A (zh) 2022-04-29
US20210103608A1 (en) 2021-04-08
JP2022552140A (ja) 2022-12-15

Similar Documents

Publication Publication Date Title
JP7539201B2 (ja) Rare topic detection using hierarchical clustering
US11269965B2 (en) Extractive query-focused multi-document summarization
US10621284B2 (en) Training data update
US10956684B2 (en) Topic kernelization for real-time conversation data
US10191946B2 (en) Answering natural language table queries through semantic table representation
US10558756B2 (en) Unsupervised information extraction dictionary creation
JP7481074B2 (ja) Context-aware data mining
US10558747B2 (en) Unsupervised information extraction dictionary creation
US11475211B1 (en) Elucidated natural language artifact recombination with contextual awareness
US12242796B2 (en) Permutation invariance for representing linearized tabular data
US20220067539A1 (en) Knowledge induction using corpus expansion
JP7595654B2 (ja) Generation of natural language expression variants
US11989513B2 (en) Quantitative comment summarization
US20170116629A1 (en) System for searching existing customer experience information through cross-industries from text descriptions on a customer experience

Legal Events

Date Code Title Description
RD04 Notification of resignation of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7424

Effective date: 20220518

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20221209

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20230224

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20240416

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20240513

RD12 Notification of acceptance of power of sub attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7432

Effective date: 20240513

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20240617

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20240723

RD14 Notification of resignation of power of sub attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7434

Effective date: 20240724

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20240806

R150 Certificate of patent or registration of utility model

Ref document number: 7539201

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150