BR112019004524A2 - Generating audio using neural networks - Google Patents

Generating audio using neural networks

Info

Publication number
BR112019004524A2
Authority
BR
Brazil
Prior art keywords
time
output
audio data
audio
alternative representation
Prior art date
Application number
BR112019004524A
Other languages
English (en)
Other versions
BR112019004524B1 (pt)
Inventor
Gerard Antonius Van Den Oord Aaron
Etienne Lea Dieleman Sander
Emmerich Kalchbrenner Nal
Simonyan Karen
Vinyals Oriol
Original Assignee
Deepmind Technologies Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deepmind Technologies Limited
Publication of BR112019004524A2
Publication of BR112019004524B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/06 Elementary speech units used in speech synthesisers; Concatenation rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/311 Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Complex Calculations (AREA)
  • Stereophonic System (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
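The per-time-step loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The weights are random and untrained, and the constants `NUM_LEVELS`, `RECEPTIVE_FIELD`, and `NUM_CHANNELS`, the single tanh filter bank standing in for the convolutional subnetwork, and the choice to pick each sample by sampling from the score distribution are all assumptions made for illustration.

```python
import math
import random

NUM_LEVELS = 8       # hypothetical number of possible audio sample values
RECEPTIVE_FIELD = 4  # hypothetical number of past samples the toy subnetwork sees
NUM_CHANNELS = 3     # hypothetical width of the alternative representation


def make_params(rng):
    """Random, untrained weights; a real system would learn these."""
    conv = [[rng.uniform(-1, 1) for _ in range(RECEPTIVE_FIELD)]
            for _ in range(NUM_CHANNELS)]
    out = [[rng.uniform(-1, 1) for _ in range(NUM_CHANNELS)]
           for _ in range(NUM_LEVELS)]
    return conv, out


def conv_subnetwork(current, conv):
    """Map the samples generated so far to an alternative representation
    for the next time step. Only preceding samples are used (causal)."""
    # Scale integer sample values into [-1, 1].
    scaled = [2.0 * s / (NUM_LEVELS - 1) - 1.0 for s in current]
    # Left-pad with zeros so early time steps still have a full window.
    window = ([0.0] * RECEPTIVE_FIELD + scaled)[-RECEPTIVE_FIELD:]
    return [math.tanh(sum(w * x for w, x in zip(filt, window)))
            for filt in conv]


def output_layer(representation, out):
    """Turn the alternative representation into a score distribution
    (a softmax) over the possible audio sample values."""
    logits = [sum(w * h for w, h in zip(row, representation)) for row in out]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(num_steps, seed=0):
    """Autoregressively build an output sequence, one sample per time step."""
    rng = random.Random(seed)
    conv, out = make_params(rng)
    samples = []
    for _ in range(num_steps):
        representation = conv_subnetwork(samples, conv)
        scores = output_layer(representation, out)
        # Assumption: the next sample is drawn from the score distribution.
        nxt = rng.choices(range(NUM_LEVELS), weights=scores, k=1)[0]
        samples.append(nxt)
    return samples
```

Calling `generate(16)` returns a list of 16 integer sample values in the range 0 to `NUM_LEVELS - 1`. Each value is chosen only from the samples that precede it, which is the autoregressive property the abstract describes.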
BR112019004524-4A 2016-09-06 2017-09-06 Neural network system, one or more non-transitory computer-readable storage media, and method for autoregressively generating an output sequence of audio data BR112019004524B1 (pt)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662384115P 2016-09-06 2016-09-06
US62/384,115 2016-09-06
PCT/US2017/050320 WO2018048934A1 (en) 2016-09-06 2017-09-06 Generating audio using neural networks

Publications (2)

Publication Number Publication Date
BR112019004524A2 (pt) 2019-05-28
BR112019004524B1 (pt) 2023-11-07

Family

ID=60022154

Family Applications (1)

Application Number Title Priority Date Filing Date
BR112019004524-4A BR112019004524B1 (pt) 2016-09-06 2017-09-06 Neural network system, one or more non-transitory computer-readable storage media, and method for autoregressively generating an output sequence of audio data

Country Status (9)

Country Link
US (5) US10304477B2 (pt)
EP (2) EP3822863B1 (pt)
JP (3) JP6577159B1 (pt)
KR (1) KR102353284B1 (pt)
CN (2) CN109891434B (pt)
AU (1) AU2017324937B2 (pt)
BR (1) BR112019004524B1 (pt)
CA (2) CA3036067C (pt)
WO (1) WO2018048934A1 (pt)

Families Citing this family (90)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9947316B2 (en) 2016-02-22 2018-04-17 Sonos, Inc. Voice control of a media playback system
US10095470B2 (en) 2016-02-22 2018-10-09 Sonos, Inc. Audio response playback
US10509626B2 (en) 2016-02-22 2019-12-17 Sonos, Inc Handling of loss of pairing between networked devices
US10264030B2 (en) 2016-02-22 2019-04-16 Sonos, Inc. Networked microphone device control
US9772817B2 (en) 2016-02-22 2017-09-26 Sonos, Inc. Room-corrected voice detection
US9965247B2 (en) 2016-02-22 2018-05-08 Sonos, Inc. Voice controlled media playback system based on user profile
US9978390B2 (en) 2016-06-09 2018-05-22 Sonos, Inc. Dynamic player selection for audio signal processing
US10134399B2 (en) 2016-07-15 2018-11-20 Sonos, Inc. Contextualization of voice inputs
US10115400B2 (en) 2016-08-05 2018-10-30 Sonos, Inc. Multiple voice services
CA3036067C (en) * 2016-09-06 2023-08-01 Deepmind Technologies Limited Generating audio using neural networks
JP6750121B2 (ja) * 2016-09-06 2020-09-02 ディープマインド テクノロジーズ リミテッド Processing sequences using convolutional neural networks
US11080591B2 (en) * 2016-09-06 2021-08-03 Deepmind Technologies Limited Processing sequences using convolutional neural networks
US9942678B1 (en) 2016-09-27 2018-04-10 Sonos, Inc. Audio playback settings for voice interaction
US10181323B2 (en) 2016-10-19 2019-01-15 Sonos, Inc. Arbitration-based voice recognition
KR102359216B1 (ko) 2016-10-26 2022-02-07 딥마인드 테크놀로지스 리미티드 Processing text sequences using neural networks
CA3206209A1 (en) 2017-03-29 2018-10-04 Google Llc End-to-end text-to-speech conversion
US10475449B2 (en) 2017-08-07 2019-11-12 Sonos, Inc. Wake-word detection suppression
KR102410820B1 (ko) * 2017-08-14 2022-06-20 삼성전자주식회사 Method and apparatus for recognition using a neural network, and method and apparatus for training the neural network
JP7209275B2 (ja) * 2017-08-31 2023-01-20 国立研究開発法人情報通信研究機構 Audio data learning device, audio data inference device, and program
US10048930B1 (en) 2017-09-08 2018-08-14 Sonos, Inc. Dynamic computation of system response volume
US10446165B2 (en) 2017-09-27 2019-10-15 Sonos, Inc. Robust short-time fourier transform acoustic echo cancellation during audio playback
US10051366B1 (en) 2017-09-28 2018-08-14 Sonos, Inc. Three-dimensional beam forming with a microphone array
US10621981B2 (en) 2017-09-28 2020-04-14 Sonos, Inc. Tone interference cancellation
US10482868B2 (en) 2017-09-28 2019-11-19 Sonos, Inc. Multi-channel acoustic echo cancellation
US10466962B2 (en) 2017-09-29 2019-11-05 Sonos, Inc. Media playback system with voice assistance
US10880650B2 (en) 2017-12-10 2020-12-29 Sonos, Inc. Network microphone devices with automatic do not disturb actuation capabilities
KR102401512B1 (ko) * 2018-01-11 2022-05-25 네오사피엔스 주식회사 Text-to-speech synthesis method and apparatus using machine learning, and computer-readable storage medium
US11343614B2 (en) 2018-01-31 2022-05-24 Sonos, Inc. Device designation of playback and network microphone device arrangements
US11175880B2 (en) 2018-05-10 2021-11-16 Sonos, Inc. Systems and methods for voice-assisted media content selection
US10959029B2 (en) 2018-05-25 2021-03-23 Sonos, Inc. Determining and adapting to changes in microphone performance of playback devices
CA3103470A1 (en) 2018-06-12 2019-12-19 Intergraph Corporation Artificial intelligence applications for computer-aided dispatch systems
US10681460B2 (en) 2018-06-28 2020-06-09 Sonos, Inc. Systems and methods for associating playback devices with voice assistant services
US10971170B2 (en) * 2018-08-08 2021-04-06 Google Llc Synthesizing speech from text using neural networks
US10461710B1 (en) 2018-08-28 2019-10-29 Sonos, Inc. Media playback system with maximum volume setting
US11076035B2 (en) 2018-08-28 2021-07-27 Sonos, Inc. Do not disturb feature for audio notifications
US10587430B1 (en) 2018-09-14 2020-03-10 Sonos, Inc. Networked devices, systems, and methods for associating playback devices based on sound codes
US11024331B2 (en) 2018-09-21 2021-06-01 Sonos, Inc. Voice detection optimization using sound metadata
US10811015B2 (en) 2018-09-25 2020-10-20 Sonos, Inc. Voice detection optimization based on selected voice assistant service
WO2020064990A1 (en) 2018-09-27 2020-04-02 Deepmind Technologies Limited Committed information rate variational autoencoders
US11100923B2 (en) 2018-09-28 2021-08-24 Sonos, Inc. Systems and methods for selective wake word detection using neural network models
US10692518B2 (en) 2018-09-29 2020-06-23 Sonos, Inc. Linear filtering for noise-suppressed speech detection via multiple network microphone devices
US11899519B2 (en) 2018-10-23 2024-02-13 Sonos, Inc. Multiple stage network microphone device with reduced power consumption and processing load
US10977872B2 (en) 2018-10-31 2021-04-13 Sony Interactive Entertainment Inc. Graphical style modification for video games using machine learning
US11375293B2 (en) 2018-10-31 2022-06-28 Sony Interactive Entertainment Inc. Textual annotation of acoustic effects
US10854109B2 (en) 2018-10-31 2020-12-01 Sony Interactive Entertainment Inc. Color accommodation for on-demand accessibility
US11636673B2 (en) 2018-10-31 2023-04-25 Sony Interactive Entertainment Inc. Scene annotation using machine learning
EP3654249A1 (en) 2018-11-15 2020-05-20 Snips Dilated convolutions and gating for efficient keyword spotting
US11024321B2 (en) 2018-11-30 2021-06-01 Google Llc Speech coding using auto-regressive generative neural networks
US11183183B2 (en) 2018-12-07 2021-11-23 Sonos, Inc. Systems and methods of operating media playback systems having multiple voice assistant services
US11132989B2 (en) 2018-12-13 2021-09-28 Sonos, Inc. Networked microphone devices, systems, and methods of localized arbitration
CN109771944B (zh) * 2018-12-19 2022-07-12 武汉西山艺创文化有限公司 Game sound effect generation method, apparatus, device, and storage medium
US10602268B1 (en) 2018-12-20 2020-03-24 Sonos, Inc. Optimization of network microphone devices using noise classification
US11869529B2 (en) * 2018-12-26 2024-01-09 Nippon Telegraph And Telephone Corporation Speaking rhythm transformation apparatus, model learning apparatus, methods therefor, and program
US10867604B2 (en) 2019-02-08 2020-12-15 Sonos, Inc. Devices, systems, and methods for distributed voice processing
US11315556B2 (en) 2019-02-08 2022-04-26 Sonos, Inc. Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification
US11587552B2 (en) 2019-04-30 2023-02-21 Sutherland Global Services Inc. Real time key conversational metrics prediction and notability
US11120794B2 (en) 2019-05-03 2021-09-14 Sonos, Inc. Voice assistant persistence across multiple network microphone devices
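CN110136731B (zh) * 2019-05-13 2021-12-24 天津大学 End-to-end blind enhancement method for bone-conducted speech using a dilated causal convolution generative adversarial network
KR102579843B1 (ko) * 2019-05-23 2023-09-18 구글 엘엘씨 Variational embedding capacity in expressive end-to-end (E2E) speech synthesis
JP2020194098A (ja) * 2019-05-29 2020-12-03 ヤマハ株式会社 Estimation model establishment method, estimation model establishment device, program, and training data preparation method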
US11200894B2 (en) 2019-06-12 2021-12-14 Sonos, Inc. Network microphone device with command keyword eventing
US11361756B2 (en) 2019-06-12 2022-06-14 Sonos, Inc. Conditional wake word eventing based on environment
US10586540B1 (en) 2019-06-12 2020-03-10 Sonos, Inc. Network microphone device with command keyword conditioning
US11138969B2 (en) 2019-07-31 2021-10-05 Sonos, Inc. Locally distributed keyword detection
US10871943B1 (en) 2019-07-31 2020-12-22 Sonos, Inc. Noise classification for event detection
US11138975B2 (en) 2019-07-31 2021-10-05 Sonos, Inc. Locally distributed keyword detection
CN110728991B (zh) * 2019-09-06 2022-03-01 南京工程学院 Improved recording device identification algorithm
US11781416B2 (en) 2019-10-16 2023-10-10 Saudi Arabian Oil Company Determination of elastic properties of a geological formation using machine learning applied to data acquired while drilling
US11189286B2 (en) 2019-10-22 2021-11-30 Sonos, Inc. VAS toggle based on device orientation
KR20210048310A (ko) 2019-10-23 2021-05-03 삼성전자주식회사 Electronic device and control method thereof
KR102556096B1 (ko) * 2019-11-29 2023-07-18 한국전자통신연구원 Apparatus and method for encoding/decoding an audio signal using information of a previous frame
US11200900B2 (en) 2019-12-20 2021-12-14 Sonos, Inc. Offline voice control
US11373095B2 (en) * 2019-12-23 2022-06-28 Jens C. Jenkins Machine learning multiple features of depicted item
US11562740B2 (en) 2020-01-07 2023-01-24 Sonos, Inc. Voice verification for media playback
US11556307B2 (en) 2020-01-31 2023-01-17 Sonos, Inc. Local voice data processing
US11308958B2 (en) 2020-02-07 2022-04-19 Sonos, Inc. Localized wakeword verification
US20210312258A1 (en) * 2020-04-01 2021-10-07 Sony Corporation Computing temporal convolution networks in real time
US20210350788A1 (en) * 2020-05-06 2021-11-11 Samsung Electronics Co., Ltd. Electronic device for generating speech signal corresponding to at least one text and operating method of the electronic device
US11308962B2 (en) 2020-05-20 2022-04-19 Sonos, Inc. Input detection windowing
US11482224B2 (en) 2020-05-20 2022-10-25 Sonos, Inc. Command keywords with input detection windowing
US11727919B2 (en) 2020-05-20 2023-08-15 Sonos, Inc. Memory allocation for keyword spotting engines
EP3719711A3 (en) 2020-07-30 2021-03-03 Institutul Roman De Stiinta Si Tehnologie Method of detecting anomalous data, machine computing unit, computer program
US11698771B2 (en) 2020-08-25 2023-07-11 Sonos, Inc. Vocal guidance engines for playback devices
US11984123B2 (en) 2020-11-12 2024-05-14 Sonos, Inc. Network device interaction by range
US11796714B2 (en) 2020-12-10 2023-10-24 Saudi Arabian Oil Company Determination of mechanical properties of a geological formation using deep learning applied to data acquired while drilling
GB202106969D0 (en) * 2021-05-14 2021-06-30 Samsung Electronics Co Ltd Method and apparatus for improving model efficiency
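CN113724683B (zh) * 2021-07-23 2024-03-22 阿里巴巴达摩院(杭州)科技有限公司 Audio generation method, computer device, and computer-readable storage medium
WO2023177145A1 (ko) * 2022-03-16 2023-09-21 삼성전자주식회사 Electronic device and method for controlling the electronic device
WO2023219292A1 (ko) * 2022-05-09 2023-11-16 삼성전자 주식회사 Audio processing method and apparatus for scene classification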
EP4293662A1 (en) * 2022-06-17 2023-12-20 Samsung Electronics Co., Ltd. Method and system for personalising machine learning models

Family Cites Families (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US2810457A (en) * 1953-04-10 1957-10-22 Gen Motors Corp Lubricator
JPH0450121Y2 (pt) 1986-04-30 1992-11-26
JP2522400B2 (ja) * 1989-08-10 1996-08-07 ヤマハ株式会社 Musical tone waveform generation method
US5377302A (en) 1992-09-01 1994-12-27 Monowave Corporation L.P. System for recognizing speech
WO1995030193A1 (en) 1994-04-28 1995-11-09 Motorola Inc. A method and apparatus for converting text into audible signals using a neural network
JP3270668B2 (ja) * 1995-10-31 2002-04-02 ナショナル サイエンス カウンシル Prosody synthesis device based on an artificial neural network for text-to-speech
US6357176B2 (en) * 1997-03-19 2002-03-19 Mississippi State University Soilless sod
JPH10333699A (ja) * 1997-06-05 1998-12-18 Fujitsu Ltd Speech recognition and speech synthesis device
US5913194A (en) * 1997-07-14 1999-06-15 Motorola, Inc. Method, device and system for using statistical information to reduce computation and memory requirements of a neural network based speech synthesis system
JPH11282484A (ja) * 1998-03-27 1999-10-15 Victor Co Of Japan Ltd Speech synthesis device
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
DE10018134A1 (de) * 2000-04-12 2001-10-18 Siemens Ag Method and device for determining prosodic markers
JP2002123280A (ja) * 2000-10-16 2002-04-26 Seiko Epson Corp Speech synthesis method, speech synthesis device, and recording medium storing a speech synthesis processing program
US7062437B2 (en) * 2001-02-13 2006-06-13 International Business Machines Corporation Audio renderings for expressing non-audio nuances
US20060064177A1 (en) 2004-09-17 2006-03-23 Nokia Corporation System and method for measuring confusion among words in an adaptive speech recognition system
US7747070B2 (en) * 2005-08-31 2010-06-29 Microsoft Corporation Training convolutional neural networks on graphics processing units
KR100832556B1 (ko) * 2006-09-22 2008-05-26 (주)한국파워보이스 Speech recognition method for a robust far-field speech recognition system
US8504361B2 (en) * 2008-02-07 2013-08-06 Nec Laboratories America, Inc. Deep neural networks and methods for using same
CA2724753A1 (en) * 2008-05-30 2009-12-03 Nokia Corporation Method, apparatus and computer program product for providing improved speech synthesis
FR2950713A1 (fr) 2009-09-29 2011-04-01 Movea Sa System and method for recognizing gestures
TWI413104B (zh) * 2010-12-22 2013-10-21 Ind Tech Res Inst Controllable prosody re-estimation system and method, and computer program product
CN102651217A (zh) * 2011-02-25 2012-08-29 株式会社东芝 Method and device for synthesizing speech, and acoustic model training method for speech synthesis
EP2565667A1 (en) 2011-08-31 2013-03-06 Friedrich-Alexander-Universität Erlangen-Nürnberg Direction of arrival estimation using watermarked audio signals and microphone arrays
US8527276B1 (en) * 2012-10-25 2013-09-03 Google Inc. Speech synthesis using deep neural networks
US9230550B2 (en) * 2013-01-10 2016-01-05 Sensory, Incorporated Speaker verification and identification using artificial neural network-based sub-phonetic unit discrimination
US9141906B2 (en) * 2013-03-13 2015-09-22 Google Inc. Scoring concept terms using a deep network
US9147154B2 (en) 2013-03-13 2015-09-29 Google Inc. Classifying resources using a deep network
US9190053B2 (en) 2013-03-25 2015-11-17 The Governing Council Of The Univeristy Of Toronto System and method for applying a convolutional neural network to speech recognition
CA2810457C (en) * 2013-03-25 2018-11-20 Gerald Bradley PENN System and method for applying a convolutional neural network to speech recognition
US20150032449A1 (en) * 2013-07-26 2015-01-29 Nuance Communications, Inc. Method and Apparatus for Using Convolutional Neural Networks in Speech Recognition
CN104681034A (zh) * 2013-11-27 2015-06-03 杜比实验室特许公司 Audio signal processing
US9953634B1 (en) 2013-12-17 2018-04-24 Knowles Electronics, Llc Passive training for automatic speech recognition
US10181098B2 (en) 2014-06-06 2019-01-15 Google Llc Generating representations of input sequences using neural networks
US10275704B2 (en) 2014-06-06 2019-04-30 Google Llc Generating representations of input sequences using neural networks
US9821340B2 (en) * 2014-07-28 2017-11-21 Kolo Medical Ltd. High displacement ultrasonic transducer
KR102332729B1 (ko) 2014-07-28 2021-11-30 삼성전자주식회사 Speech recognition method and apparatus based on pronunciation similarity, and method and apparatus for generating a speech recognition engine
US20160035344A1 (en) * 2014-08-04 2016-02-04 Google Inc. Identifying the language of a spoken utterance
WO2016033506A1 (en) 2014-08-29 2016-03-03 Google Inc. Processing images using deep neural networks
US9536509B2 (en) 2014-09-25 2017-01-03 Sunhouse Technologies, Inc. Systems and methods for capturing and interpreting audio
US10783900B2 (en) * 2014-10-03 2020-09-22 Google Llc Convolutional, long short-term memory, fully connected deep neural networks
US9824684B2 (en) 2014-11-13 2017-11-21 Microsoft Technology Licensing, Llc Prediction-based sequence recognition
US9542927B2 (en) * 2014-11-13 2017-01-10 Google Inc. Method and system for building text-to-speech voice from diverse recordings
US9607217B2 (en) * 2014-12-22 2017-03-28 Yahoo! Inc. Generating preference indices for image content
US11080587B2 (en) * 2015-02-06 2021-08-03 Deepmind Technologies Limited Recurrent neural networks for data item generation
US10403269B2 (en) * 2015-03-27 2019-09-03 Google Llc Processing audio waveforms
US20160343366A1 (en) * 2015-05-19 2016-11-24 Google Inc. Speech synthesis model selection
US9595002B2 (en) 2015-05-29 2017-03-14 Sas Institute Inc. Normalizing electronic communications using a vector having a repeating substring as input for a neural network
CN105096939B (zh) * 2015-07-08 2017-07-25 百度在线网络技术(北京)有限公司 Voice wake-up method and device
US9786270B2 (en) 2015-07-09 2017-10-10 Google Inc. Generating acoustic models
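CN106375231B (zh) * 2015-07-22 2019-11-05 华为技术有限公司 Traffic switching method, device, and system
KR102413692B1 (ko) 2015-07-24 2022-06-27 삼성전자주식회사 Apparatus and method for calculating acoustic scores for speech recognition, speech recognition apparatus and method, and electronic device
CN105068998B (zh) 2015-07-29 2017-12-15 百度在线网络技术(北京)有限公司 Translation method and device based on a neural network model
CN105321525B (zh) * 2015-09-30 2019-02-22 北京邮电大学 System and method for reducing VoIP communication resource overhead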
US10733979B2 (en) 2015-10-09 2020-08-04 Google Llc Latency constraints for acoustic modeling
US10395118B2 (en) 2015-10-29 2019-08-27 Baidu Usa Llc Systems and methods for video paragraph captioning using hierarchical recurrent neural networks
WO2017083695A1 (en) * 2015-11-12 2017-05-18 Google Inc. Generating target sequences from input sequences using partial conditioning
US10332509B2 (en) 2015-11-25 2019-06-25 Baidu USA, LLC End-to-end speech recognition
CN105513591B (zh) * 2015-12-21 2019-09-03 百度在线网络技术(北京)有限公司 Method and device for speech recognition using an LSTM recurrent neural network model
US10402700B2 (en) 2016-01-25 2019-09-03 Deepmind Technologies Limited Generating images using neural networks
JP6889728B2 (ja) * 2016-03-11 2021-06-18 マジック リープ, インコーポレイテッドMagic Leap,Inc. Structure learning in convolutional neural networks
US10460747B2 (en) 2016-05-10 2019-10-29 Google Llc Frequency based audio analysis using neural networks
US9972314B2 (en) 2016-06-01 2018-05-15 Microsoft Technology Licensing, Llc No loss-optimization for weighted transducer
US11373672B2 (en) 2016-06-14 2022-06-28 The Trustees Of Columbia University In The City Of New York Systems and methods for speech separation and neural decoding of attentional selection in multi-speaker environments
US9984683B2 (en) 2016-07-22 2018-05-29 Google Llc Automatic speech recognition using multi-dimensional models
CA3036067C (en) 2016-09-06 2023-08-01 Deepmind Technologies Limited Generating audio using neural networks
JP6750121B2 (ja) 2016-09-06 2020-09-02 ディープマインド テクノロジーズ リミテッド Processing sequences using convolutional neural networks
US11080591B2 (en) 2016-09-06 2021-08-03 Deepmind Technologies Limited Processing sequences using convolutional neural networks
KR102359216B1 (ko) 2016-10-26 2022-02-07 딥마인드 테크놀로지스 리미티드 Processing text sequences using neural networks
US10049106B2 (en) 2017-01-18 2018-08-14 Xerox Corporation Natural language generation through character-based recurrent neural networks with finite-state prior knowledge
TWI767000B (zh) 2017-05-20 2022-06-11 英商淵慧科技有限公司 Method of generating a waveform and computer storage medium
US10726858B2 (en) 2018-06-22 2020-07-28 Intel Corporation Neural network for speech denoising trained with deep feature losses
US10971170B2 (en) 2018-08-08 2021-04-06 Google Llc Synthesizing speech from text using neural networks

Also Published As

Publication number Publication date
CN109891434B (zh) 2020-10-30
AU2017324937A1 (en) 2019-03-28
US20190251987A1 (en) 2019-08-15
JP2020003809A (ja) 2020-01-09
US20240135955A1 (en) 2024-04-25
CN109891434A (zh) 2019-06-14
CA3036067C (en) 2023-08-01
JP6577159B1 (ja) 2019-09-18
US20200411032A1 (en) 2020-12-31
JP2019532349A (ja) 2019-11-07
JP2021152664A (ja) 2021-09-30
BR112019004524B1 (pt) 2023-11-07
CA3036067A1 (en) 2018-03-15
AU2017324937B2 (en) 2019-12-19
WO2018048934A1 (en) 2018-03-15
KR20190042730A (ko) 2019-04-24
US11386914B2 (en) 2022-07-12
US10803884B2 (en) 2020-10-13
US11869530B2 (en) 2024-01-09
CN112289342A (zh) 2021-01-29
EP3822863A1 (en) 2021-05-19
US20220319533A1 (en) 2022-10-06
CN112289342B (zh) 2024-03-19
EP3497629A1 (en) 2019-06-19
US10304477B2 (en) 2019-05-28
KR102353284B1 (ko) 2022-01-19
CA3155320A1 (en) 2018-03-15
US20180322891A1 (en) 2018-11-08
JP6891236B2 (ja) 2021-06-18
JP7213913B2 (ja) 2023-01-27
EP3497629B1 (en) 2020-11-04
EP3822863B1 (en) 2022-11-02

Legal Events

Date Code Title Description
B350 Update of information on the portal [chapter 15.35 patent gazette]
B06W Patent application suspended after preliminary examination (for patents with searches from other patent authorities) [chapter 6.23 patent gazette]
B15K Others concerning applications: alteration of classification

Free format text: THE PREVIOUS CLASSIFICATION WAS: G06N 3/04

Ipc: G06N 3/045 (2023.01), G06N 3/048 (2023.01), G10L 1

B09A Decision: intention to grant [chapter 9.1 patent gazette]
B16A Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]

Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 06/09/2017, SUBJECT TO THE LEGAL CONDITIONS