JP7561836B2 - Stop word data augmentation for natural language processing - Google Patents

Stop word data augmentation for natural language processing

Info

Publication number
JP7561836B2
Authority
JP
Japan
Prior art keywords
utterance
utterances
intent
training set
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2022516740A
Other languages
English (en)
Japanese (ja)
Other versions
JP2022547631A5
JP2022547631A (ja)
Inventor
ビシュノイ,ビシャル
ジョンソン,マーク・エドワード
ルクマン ジャラルッディン,エリアス
ビナコタ,バラコタ・シュリニバス
ドゥオング,タン・ロング
シンガラジュ,ゴータム
Original Assignee
オラクル・インターナショナル・コーポレイション
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by オラクル・インターナショナル・コーポレイション
Publication of JP2022547631A
Publication of JP2022547631A5
Application granted
Publication of JP7561836B2
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3338 Query expansion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/02 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail using automatic reactions or user delegation, e.g. automatic replies or chatbot-generated messages

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
JP2022516740A 2019-09-16 2020-09-11 Stop word data augmentation for natural language processing Active JP7561836B2 (ja)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962901203P 2019-09-16 2019-09-16
US62/901,203 2019-09-16
PCT/US2020/050407 WO2021055247A1 (en) 2019-09-16 2020-09-11 Stop word data augmentation for natural language processing

Publications (3)

Publication Number Publication Date
JP2022547631A (ja) 2022-11-14
JP2022547631A5 2023-08-25
JP7561836B2 (ja) 2024-10-04

Family

ID=72659345

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2022516740A Active JP7561836B2 (ja) 2019-09-16 2020-09-11 Stop word data augmentation for natural language processing

Country Status (5)

Country Link
US (1) US11651768B2
EP (1) EP4032004A1
JP (1) JP7561836B2
CN (1) CN114424185B
WO (1) WO2021055247A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI915279B (zh) 2025-07-04 2026-02-11 犀動智能科技股份有限公司 Demand information processing method and system, and computer program product

Families Citing this family (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US8676904B2 (en) 2008-10-02 2014-03-18 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
DE112014000709B4 (de) 2013-02-07 2021-12-30 Apple Inc. Method and apparatus for operating a voice trigger for a digital assistant
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US10460227B2 (en) 2015-05-15 2019-10-29 Apple Inc. Virtual assistant in a communication session
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10331312B2 (en) 2015-09-08 2019-06-25 Apple Inc. Intelligent automated assistant in a media environment
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US12197817B2 (en) 2016-06-11 2025-01-14 Apple Inc. Intelligent device arbitration and control
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. USER-SPECIFIC Acoustic Models
DK201770428A1 (en) 2017-05-12 2019-02-18 Apple Inc. LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
GB2569335B (en) * 2017-12-13 2022-07-27 Sage Global Services Ltd Chatbot system
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
DK180639B1 (en) 2018-06-01 2021-11-04 Apple Inc DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT
DK201870355A1 (en) 2018-06-01 2019-12-16 Apple Inc. VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
DK201970509A1 (en) 2019-05-06 2021-01-15 Apple Inc Spoken notifications
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11227599B2 (en) 2019-06-01 2022-01-18 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11222630B1 (en) * 2019-09-19 2022-01-11 Amazon Technologies, Inc. Detecting false accepts in a shopping domain for handling a spoken dialog
US11593608B2 (en) * 2019-10-28 2023-02-28 Paypal, Inc. Systems and methods for predicting and providing automated online chat assistance
US11321532B2 (en) * 2019-12-17 2022-05-03 Microsoft Technology Licensing, Llc Conversational manifests for enabling complex bot communications
US11741140B2 (en) 2019-12-17 2023-08-29 Microsoft Technology Licensing, Llc Marketplace for conversational bot skills
WO2021134432A1 (en) * 2019-12-31 2021-07-08 Paypal, Inc. Framework for managing natural language processing tools
US11909698B2 (en) * 2020-01-17 2024-02-20 Bitonic Technology Labs, Inc. Method and system for identifying ideal virtual assistant bots for providing response to user queries
US11316806B1 (en) * 2020-01-28 2022-04-26 Snap Inc. Bulk message deletion
CN111414731B (zh) * 2020-02-28 2023-08-11 北京小米松果电子有限公司 Text annotation method and apparatus
US12301635B2 (en) 2020-05-11 2025-05-13 Apple Inc. Digital assistant hardware abstraction
US11061543B1 (en) 2020-05-11 2021-07-13 Apple Inc. Providing relevant data items based on context
US11490204B2 (en) 2020-07-20 2022-11-01 Apple Inc. Multi-device audio adjustment coordination
US11438683B2 (en) 2020-07-21 2022-09-06 Apple Inc. User identification using headphones
US11715464B2 (en) * 2020-09-14 2023-08-01 Apple Inc. Using augmentation to create natural language models
US11893354B2 (en) 2021-03-25 2024-02-06 Cognizant Technology Solutions India Pvt. Ltd. System and method for improving chatbot training dataset
EP4315320A4 (en) * 2021-03-30 2025-01-15 Five9, Inc. SYSTEMS AND METHODS FOR TRAINING NATURAL LANGUAGE PROCESSING MODELS IN A CONTACT CENTER
US12026471B2 (en) * 2021-04-16 2024-07-02 Accenture Global Solutions Limited Automated generation of chatbot
US11663421B2 (en) * 2021-04-27 2023-05-30 Jpmorgan Chase Bank, N.A. Systems and methods for intent-based natural language processing
US12321428B2 (en) * 2021-07-08 2025-06-03 Nippon Telegraph And Telephone Corporation User authentication device, user authentication method, and user authentication computer program
US12468938B2 (en) * 2021-09-21 2025-11-11 International Business Machines Corporation Training example generation to create new intents for chatbots
CN114881035B (zh) * 2022-05-13 2023-07-25 平安科技(深圳)有限公司 Training data augmentation method, apparatus, device, and storage medium
US12579448B2 (en) * 2022-06-22 2026-03-17 Oracle International Corporation Techniques for positive entity aware augmentation using two-stage augmentation
US12288031B2 (en) * 2022-07-13 2025-04-29 Adp, Inc. Filtering user intent eligibility
US12499385B2 (en) * 2022-08-22 2025-12-16 Oracle International Corporation Adaptive training data augmentation to facilitate training named entity recognition models
US20240169165A1 (en) * 2022-11-17 2024-05-23 Samsung Electronics Co., Ltd. Automatically Generating Annotated Ground-Truth Corpus for Training NLU Model
KR20240076977A (ko) * 2022-11-24 2024-05-31 고려대학교 산학협력단 Method and apparatus for dialogue relation extraction using prompts and blank inference for entity type and relation information
US20240185369A1 (en) * 2022-12-05 2024-06-06 Capital One Services, Llc Biasing machine learning model outputs
US12573391B2 (en) * 2023-03-22 2026-03-10 Meta Platforms, Inc. Generating contextual responses for out-of-coverage requests for assistant systems
TWI897311B (zh) * 2023-03-22 2025-09-11 宏達國際電子股份有限公司 Language processing method and language processing system
US12608562B2 (en) * 2023-09-21 2026-04-21 Google Llc Providing personalized prompts to users based on documents in cloud storage
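TWI882526B (zh) * 2023-11-14 2025-05-01 開曼群島商沛嘻科技股份有限公司 Intelligent dialogue introduction method and system
CN118427309B (zh) * 2024-07-03 2024-08-27 云储新能源科技有限公司 Method for extracting energy storage management system parameters based on natural language interaction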
US12367342B1 (en) * 2025-01-15 2025-07-22 Conversational AI Ltd Automated analysis of computerized conversational agent conversational data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016067418A1 (ja) 2014-10-30 2016-05-06 三菱電機株式会社 Dialogue control device and dialogue control method
WO2016151698A1 (ja) 2015-03-20 2016-09-29 株式会社 東芝 Dialogue device, method, and program
US20170116177A1 (en) 2015-10-26 2017-04-27 24/7 Customer, Inc. Method and apparatus for facilitating customer intent prediction
CN109241533A (zh) 2018-09-06 2019-01-18 科大国创软件股份有限公司 Semantic understanding system and method based on natural language processing
US20190280992A1 (en) 2018-03-08 2019-09-12 Andre Arzumanyan Intelligent Apparatus and Method for Responding to Text Messages

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5511161B2 (ja) * 2008-07-25 2014-06-04 シャープ株式会社 Information processing device and information processing method
WO2017112813A1 (en) * 2015-12-22 2017-06-29 Sri International Multi-lingual virtual personal assistant
US11158311B1 (en) * 2017-08-14 2021-10-26 Guangsheng Zhang System and methods for machine understanding of human intentions
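CN107515857B (zh) 2017-08-31 2020-08-18 科大讯飞股份有限公司 Semantic understanding method and system based on customized skills
CN107862027B (zh) * 2017-10-31 2019-03-12 北京小度信息科技有限公司 Retrieval intent recognition method, apparatus, electronic device, and readable storage medium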
US11093707B2 (en) * 2019-01-15 2021-08-17 International Business Machines Corporation Adversarial training data augmentation data for text classifiers
WO2020163627A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning-based multi-intent segmentation and classification

Also Published As

Publication number Publication date
CN114424185A (zh) 2022-04-29
WO2021055247A1 (en) 2021-03-25
JP2022547631A (ja) 2022-11-14
CN114424185B (zh) 2026-01-02
US20210082400A1 (en) 2021-03-18
EP4032004A1 (en) 2022-07-27
US11651768B2 (en) 2023-05-16

Similar Documents

Publication Publication Date Title
JP7561836B2 (ja) 自然言語処理のためのストップワードデータ拡張
JP7682202B2 (ja) ドメイン外(ood)検出のための改良された技術
JP7703667B2 (ja) 固有表現認識モデルを用いたコンテキストタグ統合
JP7721559B2 (ja) 自然言語処理のためのノイズデータ拡張
JP7789778B2 (ja) 自然言語処理のためのドメイン外データ拡張
US12512091B2 (en) Fine-tuning multi-head network from a single transformer layer of pre-trained language model
JP7828346B2 (ja) 自然言語処理のためのキーワードデータ拡張ツール
JP2024540111A (ja) 文書からの埋め込まれるデータの抽出のための深層学習技術
JP2024543062A (ja) 自然言語処理のパスのドロップアウト
KR20250029146A (ko) 개체-인식 데이터 증강을 위한 기술들
JP2025528391A (ja) 名前付きエンティティ認識モデルの訓練を容易にするための適応的訓練データ拡大

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20230817

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20230817

TRDD Decision of grant or rejection written
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20240821

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20240827

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20240924

R150 Certificate of patent or registration of utility model

Ref document number: 7561836

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150