KR102862150B1 - Rare topic detection using hierarchical clustering - Google Patents

Rare topic detection using hierarchical clustering

Info

Publication number
KR102862150B1
Authority
KR
South Korea
Prior art keywords
cluster
hierarchical topic
clusters
topic model
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
KR1020227008090A
Other languages
English (en)
Korean (ko)
Other versions
KR20220050915A (ko)
Inventor
Raghu Kiran Ganti
Mudhakar Srivatsa
Shreeranjani Srirangamsridharan
Yeon-sup Lim
Dakshi Agrawal
Original Assignee
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-10-08
Filing date
2020-09-29
Publication date
2025-09-18
Application filed by International Business Machines Corporation
Publication of KR20220050915A
Application granted
Publication of KR102862150B1
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)
KR1020227008090A 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering Active KR102862150B1 (ko)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/596,399 2019-10-08
US16/596,399 US12259919B2 (en) 2019-10-08 2019-10-08 Rare topic detection using hierarchical clustering
PCT/IB2020/059112 WO2021070005A1 (en) 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering

Publications (2)

Publication Number Publication Date
KR20220050915A (ko) 2022-04-25
KR102862150B1 (ko) 2025-09-18

Family

ID=75273583

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020227008090A Active KR102862150B1 (ko) 2019-10-08 2020-09-29 Rare topic detection using hierarchical clustering

Country Status (7)

Country Link
US (1) US12259919B2
JP (1) JP7539201B2
KR (1) KR102862150B1
CN (1) CN114424197B
AU (1) AU2020364386B2
GB (1) GB2604276A
WO (1) WO2021070005A1

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12259919B2 (en) 2019-10-08 2025-03-25 International Business Machines Corporation Rare topic detection using hierarchical clustering
US11354345B2 (en) * 2020-06-22 2022-06-07 Jpmorgan Chase Bank, N.A. Clustering topics for data visualization
US20230050622A1 (en) * 2021-08-11 2023-02-16 Yanran Wei Evolution of topics in a messaging system
US11941038B2 (en) 2022-05-19 2024-03-26 International Business Machines Corporation Transparent and controllable topic modeling
US12505144B2 (en) 2022-09-21 2025-12-23 International Business Machines Corporation Caching of text analytics based on topic demand and memory constraints
WO2024173841A1 (en) * 2023-02-16 2024-08-22 Jpmorgan Chase Bank, N.A. Systems and methods for seeded neural topic modeling
US20240354375A1 (en) * 2023-04-21 2024-10-24 Gong.Io Ltd. Techniques for aggregating insights of textual data using hierarchical clustering
US12549499B2 (en) 2023-04-24 2026-02-10 Gong.Io Ltd. System and method for generating a chat response on sales deals using a large language model
CN119046457B (zh) * 2024-10-30 2025-03-21 杭州正义先铎网络科技有限公司 Automated content management method, system, and medium based on intelligent text parsing

Citations (2)

Publication number Priority date Publication date Assignee Title
US20110270830A1 (en) * 2010-04-30 2011-11-03 Palo Alto Research Center Incorporated System And Method For Providing Multi-Core And Multi-Level Topical Organization In Social Indexes
CN103970865A (zh) * 2014-05-08 2014-08-06 清华大学 Seed-word-based hierarchical topic discovery method and system for microblog text

Family Cites Families (24)

Publication number Priority date Publication date Assignee Title
JP3791879B2 (ja) 1999-07-19 2006-06-28 富士通株式会社 Document summarization device and method
US7644102B2 (en) 2001-10-19 2010-01-05 Xerox Corporation Methods, systems, and articles of manufacture for soft hierarchical clustering of co-occurring objects
US7882127B2 (en) * 2002-05-10 2011-02-01 Oracle International Corporation Multi-category support for apply output
US7451395B2 (en) 2002-12-16 2008-11-11 Palo Alto Research Center Incorporated Systems and methods for interactive topic-based text summarization
US20070078889A1 (en) 2005-10-04 2007-04-05 Hoskinson Ronald A Method and system for automated knowledge extraction and organization
US7809704B2 (en) * 2006-06-15 2010-10-05 Microsoft Corporation Combining spectral and probabilistic clustering
US7783640B2 (en) * 2006-11-03 2010-08-24 Oracle International Corp. Document summarization
US7912847B2 (en) * 2007-02-20 2011-03-22 Wright State University Comparative web search system and method
US20100153318A1 (en) * 2008-11-19 2010-06-17 Massachusetts Institute Of Technology Methods and systems for automatically summarizing semantic properties from documents with freeform textual annotations
US8645298B2 (en) 2010-10-26 2014-02-04 Microsoft Corporation Topic models
US9430563B2 (en) 2012-02-02 2016-08-30 Xerox Corporation Document processing employing probabilistic topic modeling of documents represented as text words transformed to a continuous space
US8843497B2 (en) * 2012-02-09 2014-09-23 Linkshare Corporation System and method for association extraction for surf-shopping
CN103927176B (zh) 2014-04-18 2017-02-22 扬州大学 Method for generating a program feature tree based on a hierarchical topic model
US9959364B2 (en) * 2014-05-22 2018-05-01 Oath Inc. Content recommendations
US20160034757A1 (en) 2014-07-31 2016-02-04 Chegg, Inc. Generating an Academic Topic Graph from Digital Documents
US11989662B2 (en) * 2014-10-10 2024-05-21 San Diego State University Research Foundation Methods and systems for base map and inference mapping
US9575952B2 (en) 2014-10-21 2017-02-21 At&T Intellectual Property I, L.P. Unsupervised topic modeling for short texts
US9697245B1 (en) * 2015-12-30 2017-07-04 International Business Machines Corporation Data-dependent clustering of geospatial words
US10275444B2 (en) * 2016-07-15 2019-04-30 At&T Intellectual Property I, L.P. Data analytics system and methods for text data
US11645317B2 (en) * 2016-07-26 2023-05-09 Qualtrics, Llc Recommending topic clusters for unstructured text documents
US10997509B2 (en) * 2017-02-14 2021-05-04 Cognitive Scale, Inc. Hierarchical topic machine learning operation
CN108808322A (zh) 2017-05-04 2018-11-13 富士康(昆山)电脑接插件有限公司 Electrical connector
CN109544632B (zh) 2018-11-05 2021-08-03 浙江工业大学 Semantic SLAM object association method based on a hierarchical topic model
US12259919B2 (en) 2019-10-08 2025-03-25 International Business Machines Corporation Rare topic detection using hierarchical clustering

Also Published As

Publication number Publication date
AU2020364386A1 (en) 2022-03-24
KR20220050915A (ko) 2022-04-25
CN114424197B (zh) 2025-05-13
WO2021070005A1 (en) 2021-04-15
US12259919B2 (en) 2025-03-25
AU2020364386B2 (en) 2024-01-04
GB2604276A (en) 2022-08-31
GB202206094D0 (en) 2022-06-08
CN114424197A (zh) 2022-04-29
JP7539201B2 (ja) 2024-08-23
US20210103608A1 (en) 2021-04-08
JP2022552140A (ja) 2022-12-15

Similar Documents

Publication Publication Date Title
KR102862150B1 (ko) Rare topic detection using hierarchical clustering
US11093707B2 (en) Adversarial training data augmentation data for text classifiers
US11269965B2 (en) Extractive query-focused multi-document summarization
US11189269B2 (en) Adversarial training data augmentation for generating related responses
US10621284B2 (en) Training data update
US11182557B2 (en) Driving intent expansion via anomaly detection in a modular conversational system
US10956684B2 (en) Topic kernelization for real-time conversation data
US10929383B2 (en) Method and system for improving training data understanding in natural language processing
US11645513B2 (en) Unary relation extraction using distant supervision
US12566983B2 (en) Machine learning classifiers prediction confidence and explanation
US12093645B2 (en) Inter-training of pre-trained transformer-based language models using partitioning and classification
US11481442B2 (en) Leveraging intent resolvers to determine multiple intents
US11227127B2 (en) Natural language artificial intelligence topology mapping for chatbot communication flow
US20230092274A1 (en) Training example generation to create new intents for chatbots
US11803374B2 (en) Monolithic computer application refactoring
US20230186107A1 (en) Boosting classification and regression tree performance with dimension reduction
US20230161948A1 (en) Iteratively updating a document structure to resolve disconnected text in element blocks
US11449677B2 (en) Cognitive hierarchical content distribution
US12596923B2 (en) Machine learning of keywords
US11270075B2 (en) Generation of natural language expression variants
US12619884B2 (en) Artificial intelligence operations adaptive multi-granularity event grouping
US20230222358A1 (en) Artificial intelligence operations adaptive multi-granularity event grouping

Legal Events

Date Code Title Description
PA0105 International application

St.27 status event code: A-0-1-A10-A15-nap-PA0105

A201 Request for examination
PA0201 Request for examination

St.27 status event code: A-1-2-D10-D11-exm-PA0201

PG1501 Laying open of application

St.27 status event code: A-1-1-Q10-Q12-nap-PG1501

P22-X000 Classification modified

St.27 status event code: A-2-2-P10-P22-nap-X000

E902 Notification of reason for refusal
PE0902 Notice of grounds for rejection

St.27 status event code: A-1-2-D10-D21-exm-PE0902

P11-X000 Amendment of application requested

St.27 status event code: A-2-2-P10-P11-nap-X000

P13-X000 Application amended

St.27 status event code: A-2-2-P10-P13-nap-X000

D22 Grant of IP right intended

Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D22-EXM-PE0701 (AS PROVIDED BY THE NATIONAL OFFICE)

PE0701 Decision of registration

St.27 status event code: A-1-2-D10-D22-exm-PE0701

F11 IP right granted following substantive examination

Free format text: ST27 STATUS EVENT CODE: A-2-4-F10-F11-EXM-PR0701 (AS PROVIDED BY THE NATIONAL OFFICE)

PR0701 Registration of establishment

St.27 status event code: A-2-4-F10-F11-exm-PR0701

PR1002 Payment of registration fee

St.27 status event code: A-2-2-U10-U12-oth-PR1002

Fee payment year number: 1

U12 Designation fee paid

Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U12-OTH-PR1002 (AS PROVIDED BY THE NATIONAL OFFICE)

Year of fee payment: 1

PG1601 Publication of registration

St.27 status event code: A-4-4-Q10-Q13-nap-PG1601

Q13 IP right document published

Free format text: ST27 STATUS EVENT CODE: A-4-4-Q10-Q13-NAP-PG1601 (AS PROVIDED BY THE NATIONAL OFFICE)