JP2023520420A - Batching techniques for handling unbalanced training data for a chatbot - Google Patents

Batching techniques for handling unbalanced training data for a chatbot

Info

Publication number
JP2023520420A
Authority
JP
Japan
Prior art keywords
intent
utterances
batch
proportions
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2022559638A
Other languages
English (en)
Japanese (ja)
Other versions
JP2023520420A5
Inventor
ドゥオング,タン・ロング
ジョンソン,マーク・エドワード
ビシュノイ,ビシャル
ビナコタ,シュリニバス
ホング,ユ-ヘング
ジャラルッディン,エリアス・ルクマン
Original Assignee
Oracle International Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oracle International Corporation
Publication of JP2023520420A
Publication of JP2023520420A5
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1815 Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
JP2022559638A 2020-03-30 2021-03-30 Batching techniques for handling unbalanced training data for a chatbot Pending JP2023520420A (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063002151P 2020-03-30 2020-03-30
US63/002,151 2020-03-30
PCT/US2021/024946 WO2021202569A1 (en) 2020-03-30 2021-03-30 Batching techniques for handling unbalanced training data for a chatbot

Publications (2)

Publication Number Publication Date
JP2023520420A (ja) 2023-05-17
JP2023520420A5 2024-03-29

Family

ID=77856167

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2022559638A Pending JP2023520420A (ja) 2020-03-30 2021-03-30 Batching techniques for handling unbalanced training data for a chatbot

Country Status (5)

Country Link
US (1) US12236321B2
EP (1) EP4128011A1
JP (1) JP2023520420A
CN (1) CN115485690A
WO (1) WO2021202569A1

Families Citing this family (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
DE112014000709B4 (de) 2013-02-07 2021-12-30 Apple Inc. Method and device for operating a voice trigger for a digital assistant
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US10460227B2 (en) 2015-05-15 2019-10-29 Apple Inc. Virtual assistant in a communication session
US10331312B2 (en) 2015-09-08 2019-06-25 Apple Inc. Intelligent automated assistant in a media environment
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US12197817B2 (en) 2016-06-11 2025-01-14 Apple Inc. Intelligent device arbitration and control
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
DK201770428A1 (en) 2017-05-12 2019-02-18 Apple Inc. LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. USER-SPECIFIC Acoustic Models
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
DK180639B1 (en) 2018-06-01 2021-11-04 Apple Inc DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT
DK201870355A1 (en) 2018-06-01 2019-12-16 Apple Inc. VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11651162B2 (en) 2019-04-26 2023-05-16 Oracle International Corporation Composite entity for rule driven acquisition of input data to chatbots
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
DK201970509A1 (en) 2019-05-06 2021-01-15 Apple Inc Spoken notifications
US11227599B2 (en) 2019-06-01 2022-01-18 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11704474B2 (en) * 2020-02-25 2023-07-18 Transposit Corporation Markdown data content with action binding
US11061543B1 (en) 2020-05-11 2021-07-13 Apple Inc. Providing relevant data items based on context
US12301635B2 (en) 2020-05-11 2025-05-13 Apple Inc. Digital assistant hardware abstraction
US11490204B2 (en) 2020-07-20 2022-11-01 Apple Inc. Multi-device audio adjustment coordination
US11438683B2 (en) 2020-07-21 2022-09-06 Apple Inc. User identification using headphones
US11715464B2 (en) * 2020-09-14 2023-08-01 Apple Inc. Using augmentation to create natural language models
WO2022240918A1 (en) * 2021-05-11 2022-11-17 AskWisy, Inc. Intelligent training and education bot
US12147768B2 (en) * 2021-05-18 2024-11-19 International Business Machines Corporation Natural language bias detection in conversational system environments
US12321428B2 (en) * 2021-07-08 2025-06-03 Nippon Telegraph And Telephone Corporation User authentication device, user authentication method, and user authentication computer program
US11763803B1 (en) * 2021-07-28 2023-09-19 Asapp, Inc. System, method, and computer program for extracting utterances corresponding to a user problem statement in a conversation between a human agent and a user
US12609102B2 (en) * 2021-09-30 2026-04-21 Sap Se Training dataset generation for speech-to-text service
US12067363B1 (en) 2022-02-24 2024-08-20 Asapp, Inc. System, method, and computer program for text sanitization
CN114694655B (zh) * 2022-03-28 2025-07-08 南方电网数字企业科技(广东)有限公司 An augmentation method for Cantonese audio and a speech recognition method
US12468895B2 (en) 2022-06-21 2025-11-11 Kore.Ai, Inc. Systems and methods for training a virtual assistant
WO2024238420A1 (en) * 2023-05-12 2024-11-21 Genesys Cloud Services, Inc. Systems and methods for computing intent health for enhancing conversational bots
US12231378B2 (en) * 2023-06-08 2025-02-18 Sap Se Realtime conversation AI insights and deployment
US12367342B1 (en) * 2025-01-15 2025-07-22 Conversational AI Ltd Automated analysis of computerized conversational agent conversational data
CN120395912B (zh) * 2025-07-03 2025-09-12 浙江理工大学 A task-driven intelligent control method and system for general-purpose robots

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010204966A (ja) * 2009-03-03 2010-09-16 Nippon Telegr & Teleph Corp <Ntt> Sampling device, sampling method, sampling program, class discrimination device, and class discrimination system
US20200090638A1 (en) * 2018-09-18 2020-03-19 International Business Machines Corporation Intent classification from multiple sources when building a conversational system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7853451B1 (en) * 2003-12-18 2010-12-14 At&T Intellectual Property Ii, L.P. System and method of exploiting human-human data for spoken language understanding systems
US20120262461A1 (en) * 2011-02-17 2012-10-18 Conversive, Inc. System and Method for the Normalization of Text
US9589074B2 (en) * 2014-08-20 2017-03-07 Oracle International Corporation Multidimensional spatial searching for identifying duplicate crash dumps
US10453117B1 (en) * 2016-06-29 2019-10-22 Amazon Technologies, Inc. Determining domains for natural language understanding
US10909980B2 (en) * 2017-02-27 2021-02-02 SKAEL, Inc. Machine-learning digital assistants
US10546583B2 (en) * 2017-08-30 2020-01-28 Amazon Technologies, Inc. Context-based device arbitration
US10617959B2 (en) 2018-01-18 2020-04-14 Moveworks, Inc. Method and system for training a chatbot
US10497366B2 (en) * 2018-03-23 2019-12-03 Servicenow, Inc. Hybrid learning system for natural language understanding
US10977443B2 (en) * 2018-11-05 2021-04-13 International Business Machines Corporation Class balancing for intent authoring using search
WO2020163627A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning-based multi-intent segmentation and classification
US12026468B2 (en) * 2020-11-30 2024-07-02 Oracle International Corporation Out-of-domain data augmentation for natural language processing

Also Published As

Publication number Publication date
WO2021202569A1 (en) 2021-10-07
EP4128011A1 (en) 2023-02-08
CN115485690A (zh) 2022-12-16
US20210304075A1 (en) 2021-09-30
US12236321B2 (en) 2025-02-25

Similar Documents

Publication Publication Date Title
US12236321B2 (en) Batching techniques for handling unbalanced training data for a chatbot
US12249314B2 (en) Routing for chatbots
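JP7692432B2 (ja) Method and system for constraint-based hyperparameter tuning
JP7851913B2 (ja) Techniques for providing explanations for text classification
JP2023520416A (ja) Improved techniques for out-of-domain (OOD) detection
JP7692482B2 (ja) Method and system for over-prediction in neural networks
JP2024503517A (ja) Multi-factor modeling for natural language processing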
US11989523B2 (en) Composite entity for rule driven acquisition of input data to chatbots
JP7771196B2 (ja) Multi-feature balancing for natural language processors
KR20240091051A (ko) Deep learning techniques for extraction of embedded data from documents
US12112560B2 (en) Usage based resource utilization of training pool for chatbots
US20230136965A1 (en) Prohibiting inconsistent named entity recognition tag sequences
US20240086767A1 (en) Continuous hyper-parameter tuning with automatic domain weight adjustment based on periodic performance checkpoints
KR20240091214A (ko) Rule-based techniques for extraction of question and answer pairs from data
US20250094733A1 (en) Digital assistant using generative artificial intelligence
JP2025530343A (ja) Objective function optimization in target-based hyperparameter tuning

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20240318

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20240318

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20250325

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20250507

A601 Written request for extension of time

Free format text: JAPANESE INTERMEDIATE CODE: A601

Effective date: 20250806

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20251006

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20251202

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20260319

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20260319