BR112022012975A2 - System and methods for automatically mixing audio for acoustic scenes

System and methods for automatically mixing audio for acoustic scenes

Info

Publication number
BR112022012975A2
Authority
BR
Brazil
Prior art keywords
audio sample
methods
acoustic
impulse response
audio
Prior art date
Application number
BR112022012975A
Other languages
English (en)
Inventor
Wang Yadong
Jois Rao Shilpa
Parthasarathi Murthy
Tacke Kyle
Original Assignee
Netflix Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-12-31
Filing date
2020-12-31
Publication date
2022-09-13
Application filed by Netflix Inc
Publication of BR112022012975A2

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0091 Means for obtaining special acoustic effects
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00 Acoustics not otherwise provided for
    • G10K15/08 Arrangements for producing a reverberation or echo sound
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/005 Language recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/81 Detection of presence or absence of voice signals for discriminating voice from music
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/84 Detection of presence or absence of voice signals for discriminating voice from noise
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02 Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031 Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155 Musical effects
    • G10H2210/265 Acoustic effect simulation, i.e. volume, spatial, resonance or reverberation effects added to a musical sound, usually by appropriate filtering or delays
    • G10H2210/281 Reverberation or echo
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/311 Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15 Aspects of sound capture and related signal processing for recording or reproduction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/307 Frequency adjustment, e.g. tone control

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Circuits Of Receivers In General (AREA)

Abstract

SYSTEM AND METHODS FOR AUTOMATICALLY MIXING AUDIO FOR ACOUSTIC SCENES. The disclosed computer-implemented method may include obtaining an audio sample from a content source, inputting the obtained audio sample into a trained machine learning model, obtaining the output of the trained machine learning model, wherein the output is a profile of an environment in which the input audio sample was recorded, obtaining an acoustic impulse response corresponding to the profile of the environment in which the input audio sample was recorded, obtaining a second audio sample, processing the obtained acoustic impulse response with the second audio sample, and inserting a result of processing the obtained acoustic impulse response and the second audio sample into an audio track. Various other methods, systems, and computer-readable media are also disclosed.
BR112022012975A 2019-12-31 2020-12-31 System and methods for automatically mixing audio for acoustic scenes BR112022012975A2 (pt)
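
The steps listed in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the abstract does not say how the impulse response is "processed" with the second audio sample, so the sketch assumes plain linear convolution (the usual way to apply an impulse response), and it assumes "inserting into an audio track" means summing the result into the track at a sample offset. The names `convolve`, `mix_into_scene`, `classify_environment`, and `impulse_responses` are hypothetical, and the classifier is a stand-in for the trained machine learning model.

```python
from typing import Callable, Dict, List, Sequence

Audio = List[float]


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> Audio:
    """Direct linear convolution; output length is len(signal) + len(ir) - 1."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def mix_into_scene(
    reference: Sequence[float],
    new_audio: Sequence[float],
    track: Sequence[float],
    offset: int,
    classify_environment: Callable[[Sequence[float]], str],
    impulse_responses: Dict[str, Sequence[float]],
) -> Audio:
    """Hypothetical pipeline following the order of steps in the abstract."""
    # Model output: a profile of the environment the reference was recorded in.
    profile = classify_environment(reference)
    # Look up an acoustic impulse response matching that profile.
    ir = impulse_responses[profile]
    # Assumption: "processing" the IR with the second sample means convolution.
    wet = convolve(new_audio, ir)
    # Assumption: "inserting" means summing into the track, extending it if needed.
    mixed = list(track)
    needed = offset + len(wet)
    if needed > len(mixed):
        mixed.extend([0.0] * (needed - len(mixed)))
    for k, value in enumerate(wet):
        mixed[offset + k] += value
    return mixed


if __name__ == "__main__":
    # Stand-in for the trained model: always reports the same profile.
    fake_model = lambda ref: "small_room"
    irs = {"small_room": [1.0, 0.5, 0.25]}
    result = mix_into_scene(
        reference=[0.1, 0.2],
        new_audio=[1.0, 0.0, 2.0],
        track=[0.0] * 4,
        offset=1,
        classify_environment=fake_model,
        impulse_responses=irs,
    )
    print(result)  # [0.0, 1.0, 0.5, 2.25, 1.0, 0.5]
```

The nested-loop convolution keeps the sketch dependency-free; for real audio lengths an FFT-based convolution from a numerical library would be the practical choice.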

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/732,142 US11238888B2 (en) 2019-12-31 2019-12-31 System and methods for automatically mixing audio for acoustic scenes
PCT/US2020/067661 WO2021138557A1 (en) 2019-12-31 2020-12-31 System and methods for automatically mixing audio for acoustic scenes

Publications (1)

Publication Number Publication Date
BR112022012975A2 (pt) 2022-09-13

Family

ID=74285591

Family Applications (1)

Application Number Priority Date Filing Date Title
BR112022012975A BR112022012975A2 (pt) 2019-12-31 2020-12-31 System and methods for automatically mixing audio for acoustic scenes

Country Status (7)

Country Link
US (2) US11238888B2 (pt)
EP (1) EP4085456A1 (pt)
AU (1) AU2020417822B2 (pt)
BR (1) BR112022012975A2 (pt)
CA (1) CA3160724A1 (pt)
MX (1) MX2022008071A (pt)
WO (1) WO2021138557A1 (pt)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11705148B2 (en) * 2021-06-11 2023-07-18 Microsoft Technology Licensing, Llc Adaptive coefficients and samples elimination for circular convolution
US20230230582A1 (en) * 2022-01-20 2023-07-20 Nuance Communications, Inc. Data augmentation system and method for multi-microphone systems

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070213987A1 (en) 2006-03-08 2007-09-13 Voxonic, Inc. Codebook-less speech conversion method and system
US20130151251A1 (en) 2011-12-12 2013-06-13 Advanced Micro Devices, Inc. Automatic dialog replacement by real-time analytic processing
LV14747B (lv) * 2012-04-04 2014-03-20 Sonarworks, Sia Method for correcting the acoustic parameters of electroacoustic transducers and device for implementing it
US9449613B2 (en) 2012-12-06 2016-09-20 Audeme Llc Room identification using acoustic features in a recording
US9185199B2 (en) * 2013-03-12 2015-11-10 Google Technology Holdings LLC Method and apparatus for acoustically characterizing an environment in which an electronic device resides
US10063965B2 (en) * 2016-06-01 2018-08-28 Google Llc Sound source estimation using neural networks
WO2018090356A1 (en) 2016-11-21 2018-05-24 Microsoft Technology Licensing, Llc Automatic dubbing method and apparatus
US10991379B2 (en) 2018-06-22 2021-04-27 Babblelabs Llc Data driven audio enhancement
CN109119063B (zh) 2018-08-31 2019-11-22 腾讯科技(深圳)有限公司 Video dubbing generation method, apparatus, device, and storage medium
US11112389B1 (en) * 2019-01-30 2021-09-07 Facebook Technologies, Llc Room acoustic characterization using sensors
US11074925B2 (en) * 2019-11-13 2021-07-27 Adobe Inc. Generating synthetic acoustic impulse responses from an acoustic impulse response
US11545134B1 (en) * 2019-12-10 2023-01-03 Amazon Technologies, Inc. Multilingual speech translation with adaptive speech synthesis and adaptive physiognomy

Also Published As

Publication number Publication date
AU2020417822A1 (en) 2022-06-23
EP4085456A1 (en) 2022-11-09
MX2022008071A (es) 2022-07-27
US11238888B2 (en) 2022-02-01
AU2020417822B2 (en) 2023-07-06
US20220115030A1 (en) 2022-04-14
WO2021138557A1 (en) 2021-07-08
US20210201931A1 (en) 2021-07-01
CA3160724A1 (en) 2021-07-08

Similar Documents

Publication Publication Date Title
ZA202109128B (en) Systems and methods for producing a product
BR112022012975A2 (pt) System and methods for automatically mixing audio for acoustic scenes
MX2017015844A (es) System and method for generating an adaptive user interface in a website building system.
SG11201802373WA (en) Method and device for processing question clustering in automatic question and answering system
WO2016199160A3 (en) Language processing and knowledge building system
BR112019002827A2 (pt) system, method for processing VSP surveys in real time, and information processing system communicatively coupled to a distributed acoustic sensing data collection system
NZ725145A (en) Methods and systems for managing dialogs of a robot
EP3913542A3 (en) Method and apparatus of training model, device, medium, and program product
BR112016023920A2 (pt) methods and systems for handling a dialog with a robot
BR112017003893A8 (pt) Student DNN learning via output distribution
WO2009035108A1 (ja) Correspondence learning apparatus and method and correspondence learning program, annotation apparatus and method and annotation program, and retrieval apparatus and method and retrieval program
DE502008003378D1 (de) Device and method for generating a multichannel signal with speech signal processing
NZ700273A (en) Negative example (anti-word) based performance improvement for speech recognition
SG11201811808VA (en) Database data modification request processing method and apparatus
BR112022004014A2 (pt) Automatic preprocessing for black-box translation
BR112018069942A2 (pt) method for initializing a bus system, and bus system
BR112023018274A2 (pt) Systems and methods for processing electronic images to determine testing of unstained samples
CO2017008229A2 (es) A computer-readable medium and system for externalized execution of the input method editor
Elvin et al. Perception of Brazilian Portuguese vowels by Australian English and Spanish listeners
Putri et al. The use of slang among american youths as related to the rise of hip hop culture: a sociolinguistics analysis
Nistor et al. Virtual communities of practice in academia: Automated analysis of collaboration based on the social knowledge-building model
Lleida et al. An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
Astuti The Use of Audio-Lingual Method in BIPA Learning for Foreign Students in West Sumatra in the Era of the Digital Revolution
Sari et al. THE EFFECTS OF SHORT STORY THROUGH WATTPAD AND CRITICAL THINKING ON READING COMPREHENSION ACHIEVEMENT OF NON-ENGLISH MAJOR STUDENTS OF BINA DARMA UNIVERSITY
Maha et al. The Effect of Applying POSSE (Predict-Organize-Search-Summarize-Evaluate) on the Students’ Reading Comprehension