CN115461811A - 用于多方交互的多模态波束成形和注意力过滤 - Google Patents

用于多方交互的多模态波束成形和注意力过滤 Download PDF

Info

Publication number
CN115461811A
CN115461811A CN202180031587.0A CN202180031587A CN115461811A CN 115461811 A CN115461811 A CN 115461811A CN 202180031587 A CN202180031587 A CN 202180031587A CN 115461811 A CN115461811 A CN 115461811A
Authority
CN
China
Prior art keywords
computing device
user
users
implementations
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180031587.0A
Other languages
English (en)
Chinese (zh)
Inventor
保罗·皮尔詹尼
马里奥·E·米尼希
斯蒂芬·谢勒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Representational Ltd
Original Assignee
Representational Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Representational Ltd filed Critical Representational Ltd
Publication of CN115461811A publication Critical patent/CN115461811A/zh
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/01Customer relationship services
    • G06Q30/015Providing customer assistance, e.g. assisting a customer within a business location or via helpdesk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition
    • G06V40/176Dynamic expression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/24Speech recognition using non-acoustical features
    • G10L15/25Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/06Decision making techniques; Pattern matching strategies
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L2021/02087Noise filtering the noise being separate speech, e.g. cocktail party
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166Microphone arrays; Beamforming
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Strategic Management (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Signal Processing (AREA)
  • Ophthalmology & Optometry (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Computing Systems (AREA)
  • Primary Health Care (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • User Interface Of Digital Computer (AREA)
  • Manipulator (AREA)
CN202180031587.0A 2020-02-29 2021-02-28 用于多方交互的多模态波束成形和注意力过滤 Pending CN115461811A (zh)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US202062983595P 2020-02-29 2020-02-29
US62/983,595 2020-02-29
US202163154727P 2021-02-27 2021-02-27
US63/154,727 2021-02-27
PCT/US2021/020148 WO2021174162A1 (fr) 2020-02-29 2021-02-28 Formation de faisceau multimodale et filtrage d'attention pour interactions multiparties

Publications (1)

Publication Number Publication Date
CN115461811A true CN115461811A (zh) 2022-12-09

Family

ID=77491638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180031587.0A Pending CN115461811A (zh) 2020-02-29 2021-02-28 用于多方交互的多模态波束成形和注意力过滤

Country Status (4)

Country Link
US (1) US20220180887A1 (fr)
EP (1) EP4111446A4 (fr)
CN (1) CN115461811A (fr)
WO (1) WO2021174162A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220284915A1 (en) * 2021-03-04 2022-09-08 Orcam Technologies Ltd. Separation of signals based on direction of arrival
WO2024064468A1 (fr) * 2022-09-20 2024-03-28 Qualcomm Incorporated Interface utilisateur vocale assistée par détection de radiofréquence

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150314454A1 (en) * 2013-03-15 2015-11-05 JIBO, Inc. Apparatus and methods for providing a persistent companion device
US11250844B2 (en) * 2017-04-12 2022-02-15 Soundhound, Inc. Managing agent engagement in a man-machine dialog
EP3642835A4 (fr) * 2017-08-03 2021-01-06 Telepathy Labs, Inc. Agent virtuel proactif, intelligent et omnicanal
EP3752957A4 (fr) * 2018-02-15 2021-11-17 DMAI, Inc. Système et procédé de compréhension de la parole par l'intermédiaire d'une reconnaissance vocale basée sur des signaux audio et visuel intégrés

Also Published As

Publication number Publication date
EP4111446A4 (fr) 2024-04-17
US20220180887A1 (en) 2022-06-09
WO2021174162A1 (fr) 2021-09-02
EP4111446A1 (fr) 2023-01-04

Similar Documents

Publication Publication Date Title
US20220012470A1 (en) Multi-user intelligent assistance
KR102334942B1 (ko) 돌봄 로봇을 위한 데이터 처리 방법 및 장치
US11010601B2 (en) Intelligent assistant device communicating non-verbal cues
JP2021057057A (ja) 精神障害の療法のためのモバイルおよびウェアラブルビデオ捕捉およびフィードバックプラットフォーム
US11407106B2 (en) Electronic device capable of moving and operating method thereof
US20220241985A1 (en) Systems and methods to manage conversation interactions between a user and a robot computing device or conversation agent
JP2021503625A (ja) 対話セッション管理用のシステム及び方法
US20220180887A1 (en) Multimodal beamforming and attention filtering for multiparty interactions
US20240152705A1 (en) Systems And Methods For Short- and Long- Term Dialog Management Between A Robot Computing Device/Digital Companion And A User
US20200143235A1 (en) System and method for providing smart objects virtual communication
US20230274743A1 (en) Methods and systems enabling natural language processing, understanding, and generation
US20220207426A1 (en) Method of semi-supervised data collection and machine learning leveraging distributed computing devices
CN111919250B (zh) 传达非语言提示的智能助理设备
Saxena et al. Virtual Assistant with Facial Expession Recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination