KR102862150B1 - Rare topic detection using hierarchical clustering - Google Patents
Rare topic detection using hierarchical clustering
Info
- Publication number
- KR102862150B1 (application KR1020227008090A)
- Authority
- KR
- South Korea
- Prior art keywords
- cluster
- hierarchical topic
- clusters
- topic model
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3347—Query execution using vector based model
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/596,399 | 2019-10-08 | | |
| US16/596,399 (US12259919B2, en) | 2019-10-08 | 2019-10-08 | Rare topic detection using hierarchical clustering |
| PCT/IB2020/059112 (WO2021070005A1, en) | 2019-10-08 | 2020-09-29 | Rare topic detection using hierarchical clustering |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| KR20220050915A (ko) | 2022-04-25 |
| KR102862150B1 (ko) | 2025-09-18 |
Family
ID=75273583
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| KR1020227008090A (Active; KR102862150B1, ko) | Rare topic detection using hierarchical clustering | 2019-10-08 | 2020-09-29 |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US12259919B2 |
| JP (1) | JP7539201B2 |
| KR (1) | KR102862150B1 |
| CN (1) | CN114424197B |
| AU (1) | AU2020364386B2 |
| GB (1) | GB2604276A |
| WO (1) | WO2021070005A1 |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12259919B2 (en) | 2019-10-08 | 2025-03-25 | International Business Machines Corporation | Rare topic detection using hierarchical clustering |
| US11354345B2 (en) * | 2020-06-22 | 2022-06-07 | Jpmorgan Chase Bank, N.A. | Clustering topics for data visualization |
| US20230050622A1 (en) * | 2021-08-11 | 2023-02-16 | Yanran Wei | Evolution of topics in a messaging system |
| US11941038B2 (en) | 2022-05-19 | 2024-03-26 | International Business Machines Corporation | Transparent and controllable topic modeling |
| US12505144B2 (en) | 2022-09-21 | 2025-12-23 | International Business Machines Corporation | Caching of text analytics based on topic demand and memory constraints |
| WO2024173841A1 (en) * | 2023-02-16 | 2024-08-22 | Jpmorgan Chase Bank, N.A. | Systems and methods for seeded neural topic modeling |
| US20240354375A1 (en) * | 2023-04-21 | 2024-10-24 | Gong.Io Ltd. | Techniques for aggregating insights of textual data using hierarchical clustering |
| US12549499B2 (en) | 2023-04-24 | 2026-02-10 | Gong.Io Ltd. | System and method for generating a chat response on sales deals using a large language model |
| CN119046457B (zh) * | 2024-10-30 | 2025-03-21 | 杭州正义先铎网络科技有限公司 | Automated content management method, system, and medium based on intelligent text parsing |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110270830A1 (en) * | 2010-04-30 | 2011-11-03 | Palo Alto Research Center Incorporated | System And Method For Providing Multi-Core And Multi-Level Topical Organization In Social Indexes |
| CN103970865A (zh) * | 2014-05-08 | 2014-08-06 | 清华大学 | Seed-word-based hierarchical topic discovery method and system for microblog text |
Family Cites Families (24)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3791879B2 (ja) | 1999-07-19 | 2006-06-28 | 富士通株式会社 | Document summarization apparatus and method therefor |
| US7644102B2 (en) | 2001-10-19 | 2010-01-05 | Xerox Corporation | Methods, systems, and articles of manufacture for soft hierarchical clustering of co-occurring objects |
| US7882127B2 (en) * | 2002-05-10 | 2011-02-01 | Oracle International Corporation | Multi-category support for apply output |
| US7451395B2 (en) | 2002-12-16 | 2008-11-11 | Palo Alto Research Center Incorporated | Systems and methods for interactive topic-based text summarization |
| US20070078889A1 (en) | 2005-10-04 | 2007-04-05 | Hoskinson Ronald A | Method and system for automated knowledge extraction and organization |
| US7809704B2 (en) * | 2006-06-15 | 2010-10-05 | Microsoft Corporation | Combining spectral and probabilistic clustering |
| US7783640B2 (en) * | 2006-11-03 | 2010-08-24 | Oracle International Corp. | Document summarization |
| US7912847B2 (en) * | 2007-02-20 | 2011-03-22 | Wright State University | Comparative web search system and method |
| US20100153318A1 (en) * | 2008-11-19 | 2010-06-17 | Massachusetts Institute Of Technology | Methods and systems for automatically summarizing semantic properties from documents with freeform textual annotations |
| US8645298B2 (en) | 2010-10-26 | 2014-02-04 | Microsoft Corporation | Topic models |
| US9430563B2 (en) | 2012-02-02 | 2016-08-30 | Xerox Corporation | Document processing employing probabilistic topic modeling of documents represented as text words transformed to a continuous space |
| US8843497B2 (en) * | 2012-02-09 | 2014-09-23 | Linkshare Corporation | System and method for association extraction for surf-shopping |
| CN103927176B (zh) | 2014-04-18 | 2017-02-22 | 扬州大学 | Method for generating a program feature tree based on a hierarchical topic model |
| US9959364B2 (en) * | 2014-05-22 | 2018-05-01 | Oath Inc. | Content recommendations |
| US20160034757A1 (en) | 2014-07-31 | 2016-02-04 | Chegg, Inc. | Generating an Academic Topic Graph from Digital Documents |
| US11989662B2 (en) * | 2014-10-10 | 2024-05-21 | San Diego State University Research Foundation | Methods and systems for base map and inference mapping |
| US9575952B2 (en) | 2014-10-21 | 2017-02-21 | At&T Intellectual Property I, L.P. | Unsupervised topic modeling for short texts |
| US9697245B1 (en) * | 2015-12-30 | 2017-07-04 | International Business Machines Corporation | Data-dependent clustering of geospatial words |
| US10275444B2 (en) * | 2016-07-15 | 2019-04-30 | At&T Intellectual Property I, L.P. | Data analytics system and methods for text data |
| US11645317B2 (en) * | 2016-07-26 | 2023-05-09 | Qualtrics, Llc | Recommending topic clusters for unstructured text documents |
| US10997509B2 (en) * | 2017-02-14 | 2021-05-04 | Cognitive Scale, Inc. | Hierarchical topic machine learning operation |
| CN108808322A (zh) | 2017-05-04 | 2018-11-13 | 富士康(昆山)电脑接插件有限公司 | Electrical connector |
| CN109544632B (zh) | 2018-11-05 | 2021-08-03 | 浙江工业大学 | Semantic SLAM object association method based on a hierarchical topic model |
| US12259919B2 (en) | 2019-10-08 | 2025-03-25 | International Business Machines Corporation | Rare topic detection using hierarchical clustering |
2019
- 2019-10-08: US application US16/596,399, publication US12259919B2 (en), status Active
2020
- 2020-09-29: AU application AU2020364386A, publication AU2020364386B2 (en), status Active
- 2020-09-29: GB application GB2206094.1A, publication GB2604276A (en), status Withdrawn
- 2020-09-29: KR application KR1020227008090A, publication KR102862150B1 (ko), status Active
- 2020-09-29: JP application JP2022520298A, publication JP7539201B2 (ja), status Active
- 2020-09-29: WO application PCT/IB2020/059112, publication WO2021070005A1 (en), status Ceased
- 2020-09-29: CN application CN202080066389.3A, publication CN114424197B (zh), status Active
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110270830A1 (en) * | 2010-04-30 | 2011-11-03 | Palo Alto Research Center Incorporated | System And Method For Providing Multi-Core And Multi-Level Topical Organization In Social Indexes |
| CN103970865A (zh) * | 2014-05-08 | 2014-08-06 | 清华大学 | Seed-word-based hierarchical topic discovery method and system for microblog text |
Also Published As
| Publication number | Publication date |
|---|---|
| AU2020364386A1 (en) | 2022-03-24 |
| KR20220050915A (ko) | 2022-04-25 |
| CN114424197B (zh) | 2025-05-13 |
| WO2021070005A1 (en) | 2021-04-15 |
| US12259919B2 (en) | 2025-03-25 |
| AU2020364386B2 (en) | 2024-01-04 |
| GB2604276A (en) | 2022-08-31 |
| GB202206094D0 (en) | 2022-06-08 |
| CN114424197A (zh) | 2022-04-29 |
| JP7539201B2 (ja) | 2024-08-23 |
| US20210103608A1 (en) | 2021-04-08 |
| JP2022552140A (ja) | 2022-12-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR102862150B1 (ko) | | Rare topic detection using hierarchical clustering |
| US11093707B2 (en) | | Adversarial training data augmentation data for text classifiers |
| US11269965B2 (en) | | Extractive query-focused multi-document summarization |
| US11189269B2 (en) | | Adversarial training data augmentation for generating related responses |
| US10621284B2 (en) | | Training data update |
| US11182557B2 (en) | | Driving intent expansion via anomaly detection in a modular conversational system |
| US10956684B2 (en) | | Topic kernelization for real-time conversation data |
| US10929383B2 (en) | | Method and system for improving training data understanding in natural language processing |
| US11645513B2 (en) | | Unary relation extraction using distant supervision |
| US12566983B2 (en) | | Machine learning classifiers prediction confidence and explanation |
| US12093645B2 (en) | | Inter-training of pre-trained transformer-based language models using partitioning and classification |
| US11481442B2 (en) | | Leveraging intent resolvers to determine multiple intents |
| US11227127B2 (en) | | Natural language artificial intelligence topology mapping for chatbot communication flow |
| US20230092274A1 (en) | | Training example generation to create new intents for chatbots |
| US11803374B2 (en) | | Monolithic computer application refactoring |
| US20230186107A1 (en) | | Boosting classification and regression tree performance with dimension reduction |
| US20230161948A1 (en) | | Iteratively updating a document structure to resolve disconnected text in element blocks |
| US11449677B2 (en) | | Cognitive hierarchical content distribution |
| US12596923B2 (en) | | Machine learning of keywords |
| US11270075B2 (en) | | Generation of natural language expression variants |
| US12619884B2 (en) | | Artificial intelligence operations adaptive multi-granularity event grouping |
| US20230222358A1 (en) | | Artificial intelligence operations adaptive multi-granularity event grouping |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PA0105 | International application | St.27 status event code: A-0-1-A10-A15-nap-PA0105 |
| | A201 | Request for examination | |
| | PA0201 | Request for examination | St.27 status event code: A-1-2-D10-D11-exm-PA0201 |
| | PG1501 | Laying open of application | St.27 status event code: A-1-1-Q10-Q12-nap-PG1501 |
| | P22-X000 | Classification modified | St.27 status event code: A-2-2-P10-P22-nap-X000 |
| | E902 | Notification of reason for refusal | |
| | PE0902 | Notice of grounds for rejection | St.27 status event code: A-1-2-D10-D21-exm-PE0902 |
| | P11-X000 | Amendment of application requested | St.27 status event code: A-2-2-P10-P11-nap-X000 |
| | P13-X000 | Application amended | St.27 status event code: A-2-2-P10-P13-nap-X000 |
| | D22 | Grant of ip right intended | Free format text: ST27 STATUS EVENT CODE: A-1-2-D10-D22-EXM-PE0701 (AS PROVIDED BY THE NATIONAL OFFICE) |
| | PE0701 | Decision of registration | St.27 status event code: A-1-2-D10-D22-exm-PE0701 |
| | F11 | Ip right granted following substantive examination | Free format text: ST27 STATUS EVENT CODE: A-2-4-F10-F11-EXM-PR0701 (AS PROVIDED BY THE NATIONAL OFFICE) |
| | PR0701 | Registration of establishment | St.27 status event code: A-2-4-F10-F11-exm-PR0701 |
| | PR1002 | Payment of registration fee | St.27 status event code: A-2-2-U10-U12-oth-PR1002; Fee payment year number: 1 |
| | U12 | Designation fee paid | Free format text: ST27 STATUS EVENT CODE: A-2-2-U10-U12-OTH-PR1002 (AS PROVIDED BY THE NATIONAL OFFICE); Year of fee payment: 1 |
| | PG1601 | Publication of registration | St.27 status event code: A-4-4-Q10-Q13-nap-PG1601 |
| | Q13 | Ip right document published | Free format text: ST27 STATUS EVENT CODE: A-4-4-Q10-Q13-NAP-PG1601 (AS PROVIDED BY THE NATIONAL OFFICE) |