BR112021020151A2 - Dialog detector - Google Patents

Dialog detector

Info

Publication number
BR112021020151A2
Authority
BR
Brazil
Prior art keywords
context
audio
frame
frames
current frame
Prior art date
Application number
BR112021020151A
Other languages
English (en)
Inventor
Lie Lu
Xin Liu
Original Assignee
Dolby Laboratories Licensing Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corp
Publication of BR112021020151A2

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/81 Detection of presence or absence of voice signals for discriminating voice from music
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/04 Segmentation; Word boundary detection
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H04S1/007 Two-channel systems in which the audio signals are in digital form
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/03 Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Auxiliary Devices For Music (AREA)
  • Image Analysis (AREA)

Abstract

Dialog detector. The present invention relates to a method of extracting audio features in a dialog detector in response to an input audio signal. The method comprises dividing the input audio signal into a plurality of frames; extracting frame audio features from each frame; determining a set of context windows, each context window including a number of frames surrounding a current frame; deriving, for each context window, a context audio feature relevant to the current frame based on the frame audio features of the frames in that context window; and concatenating the context audio features to form a combined feature vector that represents the current frame. Context windows of different lengths can improve response speed and robustness.
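The steps named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say which frame features, which window sizes, or which statistics are used. The choices here (mean energy and zero-crossing rate per frame, mean and standard deviation per window, and the three `(before, after)` window sizes) are assumptions made only for illustration, and all function names are hypothetical.

```python
from statistics import fmean, pstdev


def split_frames(signal, frame_len):
    """Divide the signal into consecutive non-overlapping frames.

    A trailing partial frame is dropped. frame_len must be at least 2.
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def frame_features(frame):
    """Assumed per-frame features: mean energy and zero-crossing rate."""
    energy = fmean(x * x for x in frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / (len(frame) - 1)]


def context_feature(features, current, before, after):
    """Summarize one context window around the current frame.

    The window covers `before` frames before and `after` frames after the
    current frame, clipped at the signal edges. The summary (an assumption)
    is the mean and standard deviation of each frame feature.
    """
    lo = max(0, current - before)
    hi = min(len(features), current + after + 1)
    window = features[lo:hi]
    summary = []
    for dim in zip(*window):
        summary += [fmean(dim), pstdev(dim)]
    return summary


def combined_vector(features, current, windows):
    """Concatenate the context features of every window into one vector."""
    vector = []
    for before, after in windows:
        vector += context_feature(features, current, before, after)
    return vector


# Toy input: 64 samples split into 8 frames of 8 samples each.
signal = [((-1) ** n) * (n % 5) / 4 for n in range(64)]
frames = split_frames(signal, 8)
feats = [frame_features(f) for f in frames]

# Assumed windows: short symmetric, longer symmetric, and look-back only.
windows = [(1, 1), (2, 2), (3, 0)]
vec = combined_vector(feats, 4, windows)
print(len(vec))  # 3 windows x 2 frame features x 2 statistics = 12
```

In this sketch a short window follows changes in the signal quickly, while a longer window averages over more frames and is less sensitive to a single unusual frame, which is one way to read the abstract's remark on response speed and robustness.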
BR112021020151A 2019-04-18 2020-04-13 Dialog detector BR112021020151A2 (pt)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN2019083173 2019-04-18
US201962840839P 2019-04-30 2019-04-30
EP19192553 2019-08-20
PCT/US2020/028001 WO2020214541A1 (en) 2019-04-18 2020-04-13 A dialog detector

Publications (1)

Publication Number Publication Date
BR112021020151A2 (pt) 2021-12-14

Family

ID=70480833

Family Applications (1)

Application Number Title Priority Date Filing Date
BR112021020151A BR112021020151A2 (pt) 2019-04-18 2020-04-13 Dialog detector

Country Status (7)

Country Link
US (1) US20220199074A1 (pt)
EP (1) EP3956890B1 (pt)
JP (1) JP2022529437A (pt)
KR (1) KR20210154807A (pt)
CN (1) CN113748461A (pt)
BR (1) BR112021020151A2 (pt)
WO (1) WO2020214541A1 (pt)

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6785645B2 (en) * 2001-11-29 2004-08-31 Microsoft Corporation Real-time speech and music classifier
AUPS270902A0 (en) * 2002-05-31 2002-06-20 Canon Kabushiki Kaisha Robust detection and classification of objects in audio using limited training data
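KR100883656B1 (ko) * 2006-12-28 2009-02-18 Samsung Electronics Co., Ltd. Method and apparatus for classifying an audio signal, and method and apparatus for encoding/decoding an audio signal using the same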
US9196249B1 (en) * 2009-07-02 2015-11-24 Alon Konchitsky Method for identifying speech and music components of an analyzed audio signal
US9401153B2 (en) * 2012-10-15 2016-07-26 Digimarc Corporation Multi-mode audio recognition and auxiliary data encoding and decoding
CN104885151B (zh) * 2012-12-21 2017-12-22 Dolby Laboratories Licensing Corporation Object clustering for rendering object-based audio content based on perceptual criteria
US9767791B2 (en) * 2013-05-21 2017-09-19 Speech Morphing Systems, Inc. Method and apparatus for exemplary segment classification
EP4379714A2 (en) * 2013-09-12 2024-06-05 Dolby Laboratories Licensing Corporation Loudness adjustment for downmixed audio content
US10181322B2 (en) * 2013-12-20 2019-01-15 Microsoft Technology Licensing, Llc Multi-user, multi-domain dialog system
US9620105B2 (en) * 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
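KR102413692B1 (ko) * 2015-07-24 2022-06-27 Samsung Electronics Co., Ltd. Apparatus and method for calculating an acoustic score for speech recognition, apparatus and method for speech recognition, and electronic device
CN117351966A (zh) * 2016-09-28 2024-01-05 Huawei Technologies Co., Ltd. Method, apparatus, and system for processing a multi-channel audio signal
CN109215667B (zh) * 2017-06-29 2020-12-22 Huawei Technologies Co., Ltd. Delay estimation method and apparatus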

Also Published As

Publication number Publication date
EP3956890A1 (en) 2022-02-23
JP2022529437A (ja) 2022-06-22
CN113748461A (zh) 2021-12-03
EP3956890B1 (en) 2024-02-21
WO2020214541A1 (en) 2020-10-22
US20220199074A1 (en) 2022-06-23
KR20210154807A (ko) 2021-12-21

Similar Documents

Publication Publication Date Title
EA201791569A1 (ru) Method and system for determining the status of a malignant tumor
SG10201909133YA (en) Systems and methods for matching and scoring sameness
MX2016004674A (es) System and method for determining the execution sequence of a plurality of tasks.
BR112018073635A2 (pt) identity authentication method and identity authentication apparatus
BR112019002756A2 (pt) resource selection method, apparatus, and device
MX361846B (es) Method and apparatus for area identification.
BR112015022002A2 (pt) apparatus for determining vital sign information of an individual, and method for determining vital sign information of an individual
GB2511696A (en) Methods, systems, and computer readable media for reducing the impact of false downlink control information (DCI) detection in long term evolution (LTE)
MY159100A (en) Apparatus, system and method for detecting and preventing malicious scripts using code pattern-based static analysis and api flow-based dynamic analysis
BR112015006774A2 (pt) control channel detection method and user equipment
BR112017020635A2 (pt) determination of motion information derivation mode in video coding
BR112018010073A2 (pt) head tracking for parametric binaural output system and method
BR112015018597A2 (pt) method and device for audio recognition
MX359781B (es) Method and device for hiding privacy information.
BR112014030215A2 (pt) methods and apparatus for determining ratings information for online media presentations
EP2713314A3 (en) Image processing device and image processing method
EP3182409A3 (en) Determining the inter-channel time difference of a multi-channel audio signal
BR112018008857A2 (pt) electronic device, method of scanning a channel in an electronic device, and non-transitory computer-readable storage medium
GB2507215A (en) System and method for determining interpersonal relationship influence information using textual content from interpersonal interactions
MX365897B (es) Method and apparatus for determining similarity, and terminal.
BR112018010161A8 (pt) system and method for evaluating a detector in an imaging device
BR112021026664A2 (pt) Automated video cropping using relative importance of identified objects
BR112014028679A2 (pt) page phase time
BR112015000879A2 (pt) system and method for migration velocity modeling
BR112019005938A8 (pt) Method and system for real-time content delivery