CN114365119A - Detecting unrelated utterances in a chatbot system - Google Patents

Detecting unrelated utterances in a chatbot system

Info

Publication number
CN114365119A
Authority
CN
China
Prior art keywords
training
input
feature vector
robot
skill
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202080063757.9A
Other languages
English (en)
Chinese (zh)
Inventor
C·C·潘
G·辛拉朱
V·韦氏诺一
S·P·K·加德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
Oracle International Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-12
Filing date
2020-09-11
Publication date
2022-04-15
Application filed by Oracle International Corp
Publication of CN114365119A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/02 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail using automatic reactions or user delegation, e.g. automatic replies or chatbot-generated messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Manipulator (AREA)
CN202080063757.9A 2019-09-12 2020-09-11 Detecting unrelated utterances in a chatbot system Pending CN114365119A (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US201962899700P 2019-09-12 2019-09-12
US62/899,700 2019-09-12
US17/017,076 US11928430B2 (en) 2019-09-12 2020-09-10 Detecting unrelated utterances in a chatbot system
US17/017,076 2020-09-10
PCT/US2020/050429 WO2021050891A1 (en) 2019-09-12 2020-09-11 Detecting unrelated utterances in a chatbot system

Publications (1)

Publication Number Publication Date
CN114365119A (zh) 2022-04-15

Family

ID=73014584

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080063757.9A Pending CN114365119A (zh) 2019-09-12 2020-09-11 Detecting unrelated utterances in a chatbot system

Country Status (5)

Country Link
US (2) US11928430B2
EP (1) EP4028931A1
JP (2) JP7653419B2
CN (1) CN114365119A
WO (1) WO2021050891A1

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11928430B2 (en) * 2019-09-12 2024-03-12 Oracle International Corporation Detecting unrelated utterances in a chatbot system
US11711323B2 (en) * 2019-11-20 2023-07-25 Medallia, Inc. Systems and methods for managing bot-generated interactions
US11487948B2 (en) * 2020-02-19 2022-11-01 Conduent Business Services, Llc Method and system for automated autonomous intent mining
US20220044150A1 (en) * 2020-08-05 2022-02-10 Nielsen Consumer Llc Systems, methods, and apparatus to classify personalized data
US12099816B2 (en) * 2021-01-20 2024-09-24 Oracle International Corporation Multi-factor modelling for natural language processing
US11715469B2 (en) * 2021-02-26 2023-08-01 Walmart Apollo, Llc Methods and apparatus for improving search retrieval using inter-utterance context
US20220318679A1 (en) * 2021-03-31 2022-10-06 Jio Platforms Limited Multi-faceted bot system and method thereof
US12373893B2 (en) * 2021-06-30 2025-07-29 Allstate Insurance Company Chatbot system and machine learning modules for query analysis and interface generation
US12321428B2 (en) * 2021-07-08 2025-06-03 Nippon Telegraph And Telephone Corporation User authentication device, user authentication method, and user authentication computer program
US11568276B1 (en) * 2021-08-25 2023-01-31 International Business Machines Corporation Adaptive document understanding
US12530536B2 (en) * 2022-05-19 2026-01-20 Google Llc Mixture-of-expert approach to reinforcement learning-based dialogue management
US20230386450A1 (en) * 2022-05-25 2023-11-30 Samsung Electronics Co., Ltd. System and method for detecting unhandled applications in contrastive siamese network training
US20230401416A1 (en) * 2022-06-10 2023-12-14 Truist Bank Leveraging multiple disparate machine learning model data outputs to generate recommendations for the next best action
US12282740B2 (en) * 2022-10-31 2025-04-22 Zoom Communications, Inc. Distributed computing architecture for intent matching
US12450262B2 (en) * 2022-12-18 2025-10-21 Concentric Software, Inc. Method and electronic device to assign appropriate semantic categories to documents with arbitrary granularity
US20240256784A1 (en) * 2023-01-31 2024-08-01 Microsoft Technology Licensing, Llc Extensible chatbot framework
US12541785B2 (en) 2023-03-03 2026-02-03 State Farm Mutual Automobile Insurance Company Chatbot to assist in vehicle shopping
US20240330504A1 (en) 2023-04-03 2024-10-03 State Farm Mutual Automobile Insurance Company Generative Artificial Intelligence for Privacy Inspection and Enforcement of Unstructured Data
US20240346256A1 (en) * 2023-04-12 2024-10-17 Microsoft Technology Licensing, Llc Response generation using a retrieval augmented ai model
US12608689B2 (en) 2023-05-25 2026-04-21 State Farm Mutual Automobile Insurance Company Generating social media content for a user associated with an enterprise
EP4649408A1 (en) * 2023-11-29 2025-11-19 Dazz, Inc. Techniques for cross-source alert prioritization and remediation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
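CN106062871A (zh) * 2014-03-28 2016-10-26 Intel Corporation Training a classifier using a selected subset of cohort samples
CN107766559A (zh) * 2017-11-06 2018-03-06 4Paradigm (Beijing) Technology Co., Ltd. Training method and training apparatus for a dialogue model, dialogue method, and dialogue system
CN108829818A (zh) * 2018-06-12 2018-11-16 Institute of Computing Technology, Chinese Academy of Sciences A text classification method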
US20190108836A1 (en) * 2017-10-10 2019-04-11 Toyota Infotechnology Center Co., Ltd. Dialogue system and domain determination method

Family Cites Families (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58223193A (ja) * 1982-06-19 1983-12-24 Fujitsu Ltd. Multi-word speech recognition system
US5276766A (en) * 1991-07-16 1994-01-04 International Business Machines Corporation Fast algorithm for deriving acoustic prototypes for automatic speech recognition
US5692100A (en) * 1994-02-02 1997-11-25 Matsushita Electric Industrial Co., Ltd. Vector quantizer
US6393460B1 (en) * 1998-08-28 2002-05-21 International Business Machines Corporation Method and system for informing users of subjects of discussion in on-line chats
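JP3892173B2 (ja) * 1999-06-03 2007-03-14 Mitsubishi Electric Corporation Speech recognition device and speech recognition method, and speech model creation device and speech model creation method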
US20050044487A1 (en) * 2003-08-21 2005-02-24 Apple Computer, Inc. Method and apparatus for automatic file clustering into a data-driven, user-specific taxonomy
JP4191021B2 (ja) * 2003-12-01 2008-12-03 Advanced Telecommunications Research Institute International Domain verifier training device, input data domain verification device, and computer program
US7333985B2 (en) * 2003-12-15 2008-02-19 Microsoft Corporation Dynamic content clustering
US7542903B2 (en) * 2004-02-18 2009-06-02 Fuji Xerox Co., Ltd. Systems and methods for determining predictive models of discourse functions
US20060025995A1 (en) * 2004-07-29 2006-02-02 Erhart George W Method and apparatus for natural language call routing using confidence scores
US7970766B1 (en) * 2007-07-23 2011-06-28 Google Inc. Entity type assignment
US9117444B2 (en) * 2012-05-29 2015-08-25 Nuance Communications, Inc. Methods and apparatus for performing transformation techniques for data clustering and/or classification
US9563688B2 (en) * 2014-05-01 2017-02-07 International Business Machines Corporation Categorizing users based on similarity of posed questions, answers and supporting evidence
US20160055240A1 (en) * 2014-08-22 2016-02-25 Microsoft Corporation Orphaned utterance detection system and method
US9647968B2 (en) * 2015-03-25 2017-05-09 Pypestream Inc Systems and methods for invoking chatbots in a channel based communication system
CN106910513A (zh) * 2015-12-22 2017-06-30 Microsoft Technology Licensing, LLC Emotionally intelligent chat engine
US10204152B2 (en) * 2016-07-21 2019-02-12 Conduent Business Services, Llc Method and system for detecting personal life events of users
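JP2018054850A (ja) * 2016-09-28 2018-04-05 Toshiba Corporation Information processing system, information processing device, information processing method, and program
CN107977952A (zh) * 2016-10-21 2018-05-01 Feng Yuan Medical image segmentation method and device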
US11138388B2 (en) * 2016-12-22 2021-10-05 Verizon Media Inc. Method and system for facilitating a user-machine conversation
US10909980B2 (en) * 2017-02-27 2021-02-02 SKAEL, Inc. Machine-learning digital assistants
US10950228B1 (en) * 2017-06-28 2021-03-16 Amazon Technologies, Inc. Interactive voice controlled entertainment
US10616148B2 (en) * 2017-11-13 2020-04-07 International Business Machines Corporation Progressively extending conversation scope in multi-user messaging platform
KR102059420B1 (ko) * 2017-11-20 2020-02-11 Mr. Mind Co., Ltd. Chatbot trainer platform and operating method thereof
US10715470B1 (en) * 2017-12-14 2020-07-14 Amazon Technologies, Inc. Communication account contact ingestion and aggregation
US10812424B1 (en) * 2018-02-05 2020-10-20 Beacon Tech Inc. System and method for quantifying mental health within a group chat application
US10997258B2 (en) * 2018-02-28 2021-05-04 Fujitsu Limited Bot networks
US11018885B2 (en) * 2018-04-19 2021-05-25 Sri International Summarization system
RU2711153C2 (ru) * 2018-05-23 2020-01-15 Yandex LLC Methods and electronic devices for determining an intent associated with a spoken user utterance
CN108960402A (zh) * 2018-06-11 2018-12-07 Shanghai Leyan Information Technology Co., Ltd. A hybrid-strategy emotional soothing system for chatbots
US10762896B1 (en) * 2018-06-25 2020-09-01 Amazon Technologies, Inc. Wakeword detection
KR102783184B1 (ko) * 2018-06-25 2025-03-19 Hyundai Motor Company Dialogue system and dialogue processing method
US10726830B1 (en) * 2018-09-27 2020-07-28 Amazon Technologies, Inc. Deep multi-channel acoustic modeling
US10861439B2 (en) * 2018-10-22 2020-12-08 Ca, Inc. Machine learning model for identifying offensive, computer-generated natural-language text or speech
US11004454B1 (en) * 2018-11-06 2021-05-11 Amazon Technologies, Inc. Voice profile updating
WO2020123723A1 (en) * 2018-12-11 2020-06-18 K Health Inc. System and method for providing health information
JP6555838B1 (ja) * 2018-12-19 2019-08-07 JE International Co., Ltd. Voice inquiry system, voice inquiry processing method, smart speaker operation server device, chatbot portal server device, and program
US20200259891A1 (en) * 2019-02-07 2020-08-13 Microsoft Technology Licensing, Llc Facilitating Interaction with Plural BOTs Using a Master BOT Framework
US11146862B2 (en) * 2019-04-16 2021-10-12 Adobe Inc. Generating tags for a digital video
US20200395008A1 (en) * 2019-06-15 2020-12-17 Very Important Puppets Inc. Personality-Based Conversational Agents and Pragmatic Model, and Related Interfaces and Commercial Models
US11928430B2 (en) * 2019-09-12 2024-03-12 Oracle International Corporation Detecting unrelated utterances in a chatbot system
US11657076B2 (en) * 2020-04-07 2023-05-23 American Express Travel Related Services Company, Inc. System for uniform structured summarization of customer chats
US11449683B2 (en) * 2020-11-24 2022-09-20 International Business Machines Corporation Disentanglement of chat utterances

Also Published As

Publication number Publication date
WO2021050891A1 (en) 2021-03-18
US20240169153A1 (en) 2024-05-23
US11928430B2 (en) 2024-03-12
US20210083994A1 (en) 2021-03-18
EP4028931A1 (en) 2022-07-20
JP7653419B2 (ja) 2025-03-28
JP2022547596A (ja) 2022-11-14
JP2025098075A (ja) 2025-07-01

Similar Documents

Publication Publication Date Title
US20240169153A1 (en) Detecting unrelated utterances in a chatbot system
US12299402B2 (en) Techniques for out-of-domain (OOD) detection
CN114424185B (zh) Stop word data augmentation for natural language processing
US12361219B2 (en) Context tag integration with named entity recognition models
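CN112487157B (zh) Template-based intent classification for chatbots
CN116583837B (zh) Distance-based logit values for natural language processing
CN115485690A (zh) Batching techniques for handling unbalanced training data for a chatbot
CN115917553A (zh) Entity-level data augmentation for robust named entity recognition in chatbots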
JP2023551859A (ja) Enhanced logits for natural language processing
WO2022235353A1 (en) Variant inconsistency attack (via) as a simple and effective adversarial attack method
KR102821062B1 (ko) Systems and techniques for handling long text for pre-trained language models
CN116615727A (zh) Keyword data augmentation tool for natural language processing
CN116490879A (zh) Method and system for overprediction in neural networks
US12499385B2 (en) Adaptive training data augmentation to facilitate training named entity recognition models
US20260065171A1 (en) Adaptive training data augmentation to facilitate training named entity recognition models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination