BR112019004524A2 - Generating audio using neural networks - Google Patents
Generating audio using neural networks
Info
- Publication number
- BR112019004524A2
- Authority
- BR
- Brazil
- Prior art keywords
- time
- output
- audio data
- audio
- alternative representation
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/311—Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Biophysics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Electrically Operated Instructional Devices (AREA)
- Complex Calculations (AREA)
- Stereophonic System (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.
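The per-time-step loop described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the 256-value quantization, the layer width, the causal dilated convolutions, the start value, the random untrained weights, and every name (`conv_subnetwork`, `output_layer`, `generate`) are assumptions introduced here for illustration.

```python
import numpy as np

NUM_VALUES = 256          # assumed: audio samples quantized to 256 possible values
HIDDEN = 16               # assumed width of the alternative representation
DILATIONS = (1, 2, 4)     # assumed dilation of each convolutional layer
START = NUM_VALUES // 2   # assumed start value so the first step has some context

_rng = np.random.default_rng(0)
# Hypothetical, untrained parameters; a real system would learn these.
EMBED = _rng.normal(0.0, 0.1, (NUM_VALUES, HIDDEN))
CONV_W = [_rng.normal(0.0, 0.1, (2, HIDDEN, HIDDEN)) for _ in DILATIONS]
OUT_W = _rng.normal(0.0, 0.1, (HIDDEN, NUM_VALUES))


def conv_subnetwork(current_sequence):
    """Turn the samples generated so far into an alternative representation."""
    x = EMBED[np.asarray(current_sequence, dtype=int)]  # shape (T, HIDDEN)
    for w, d in zip(CONV_W, DILATIONS):
        # Causal shift: row t of `past` holds x[t - d], or zeros when t < d.
        past = np.vstack([np.zeros((d, HIDDEN)), x])[: len(x)]
        x = np.tanh(past @ w[0] + x @ w[1])
    return x[-1]  # representation for the time step being generated


def output_layer(representation):
    """Map the representation to a score distribution over possible samples."""
    logits = representation @ OUT_W
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


def generate(num_steps, seed=0):
    """Autoregressively generate `num_steps` quantized audio samples."""
    sampler = np.random.default_rng(seed)
    samples = []
    for _ in range(num_steps):
        # The current sequence holds only samples from preceding time steps.
        representation = conv_subnetwork([START] + samples)
        scores = output_layer(representation)
        samples.append(int(sampler.choice(NUM_VALUES, p=scores)))
    return samples


if __name__ == "__main__":
    print(generate(8))
```

Each generated sample is appended to the current sequence before the next step, which is what makes the generation autoregressive; sampling from the score distribution (rather than taking its maximum) is one possible choice among several.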
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662384115P | 2016-09-06 | 2016-09-06 | |
US62/384,115 | 2016-09-06 | ||
PCT/US2017/050320 WO2018048934A1 (en) | 2016-09-06 | 2017-09-06 | Generating audio using neural networks |
Publications (2)
Publication Number | Publication Date |
---|---|
BR112019004524A2 (pt) | 2019-05-28 |
BR112019004524B1 (pt) | 2023-11-07 |
Family
ID=60022154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
BR112019004524-4A BR112019004524B1 (pt) | Neural network system, one or more non-transitory computer-readable storage media, and method for autoregressively generating an output sequence of audio data | 2016-09-06 | 2017-09-06 |
Country Status (9)
Country | Link |
---|---|
US (5) | US10304477B2 (pt) |
EP (2) | EP3497629B1 (pt) |
JP (3) | JP6577159B1 (pt) |
KR (1) | KR102353284B1 (pt) |
CN (2) | CN112289342B (pt) |
AU (1) | AU2017324937B2 (pt) |
BR (1) | BR112019004524B1 (pt) |
CA (2) | CA3155320A1 (pt) |
WO (1) | WO2018048934A1 (pt) |
Families Citing this family (85)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9965247B2 (en) | 2016-02-22 | 2018-05-08 | Sonos, Inc. | Voice controlled media playback system based on user profile |
US9811314B2 (en) | 2016-02-22 | 2017-11-07 | Sonos, Inc. | Metadata exchange involving a networked playback system and a networked microphone system |
US10095470B2 (en) | 2016-02-22 | 2018-10-09 | Sonos, Inc. | Audio response playback |
US9772817B2 (en) | 2016-02-22 | 2017-09-26 | Sonos, Inc. | Room-corrected voice detection |
US10264030B2 (en) | 2016-02-22 | 2019-04-16 | Sonos, Inc. | Networked microphone device control |
US9947316B2 (en) | 2016-02-22 | 2018-04-17 | Sonos, Inc. | Voice control of a media playback system |
US9978390B2 (en) | 2016-06-09 | 2018-05-22 | Sonos, Inc. | Dynamic player selection for audio signal processing |
US10134399B2 (en) | 2016-07-15 | 2018-11-20 | Sonos, Inc. | Contextualization of voice inputs |
US10115400B2 (en) | 2016-08-05 | 2018-10-30 | Sonos, Inc. | Multiple voice services |
EP3497629B1 (en) | 2016-09-06 | 2020-11-04 | Deepmind Technologies Limited | Generating audio using neural networks |
US11080591B2 (en) * | 2016-09-06 | 2021-08-03 | Deepmind Technologies Limited | Processing sequences using convolutional neural networks |
WO2018048945A1 (en) * | 2016-09-06 | 2018-03-15 | Deepmind Technologies Limited | Processing sequences using convolutional neural networks |
US9942678B1 (en) | 2016-09-27 | 2018-04-10 | Sonos, Inc. | Audio playback settings for voice interaction |
US10181323B2 (en) | 2016-10-19 | 2019-01-15 | Sonos, Inc. | Arbitration-based voice recognition |
CN110023963B (zh) | 2016-10-26 | 2023-05-30 | 渊慧科技有限公司 | Processing text sequences using neural networks |
EP3745394B1 (en) * | 2017-03-29 | 2023-05-10 | Google LLC | End-to-end text-to-speech conversion |
US10475449B2 (en) | 2017-08-07 | 2019-11-12 | Sonos, Inc. | Wake-word detection suppression |
KR102410820B1 (ko) * | 2017-08-14 | 2022-06-20 | 삼성전자주식회사 | Recognition method and apparatus using a neural network, and method and apparatus for training the neural network |
JP7209275B2 (ja) * | 2017-08-31 | 2023-01-20 | 国立研究開発法人情報通信研究機構 | Audio data learning device, audio data inference device, and program |
US10048930B1 (en) | 2017-09-08 | 2018-08-14 | Sonos, Inc. | Dynamic computation of system response volume |
US10446165B2 (en) | 2017-09-27 | 2019-10-15 | Sonos, Inc. | Robust short-time fourier transform acoustic echo cancellation during audio playback |
US10482868B2 (en) | 2017-09-28 | 2019-11-19 | Sonos, Inc. | Multi-channel acoustic echo cancellation |
US10621981B2 (en) | 2017-09-28 | 2020-04-14 | Sonos, Inc. | Tone interference cancellation |
US10466962B2 (en) | 2017-09-29 | 2019-11-05 | Sonos, Inc. | Media playback system with voice assistance |
US10880650B2 (en) | 2017-12-10 | 2020-12-29 | Sonos, Inc. | Network microphone devices with automatic do not disturb actuation capabilities |
JP7082357B2 (ja) * | 2018-01-11 | 2022-06-08 | ネオサピエンス株式会社 | Text-to-speech synthesis method and apparatus using machine learning, and computer-readable storage medium |
WO2019152722A1 (en) | 2018-01-31 | 2019-08-08 | Sonos, Inc. | Device designation of playback and network microphone device arrangements |
US11175880B2 (en) | 2018-05-10 | 2021-11-16 | Sonos, Inc. | Systems and methods for voice-assisted media content selection |
US10959029B2 (en) | 2018-05-25 | 2021-03-23 | Sonos, Inc. | Determining and adapting to changes in microphone performance of playback devices |
CA3103470A1 (en) | 2018-06-12 | 2019-12-19 | Intergraph Corporation | Artificial intelligence applications for computer-aided dispatch systems |
US10681460B2 (en) | 2018-06-28 | 2020-06-09 | Sonos, Inc. | Systems and methods for associating playback devices with voice assistant services |
US10971170B2 (en) * | 2018-08-08 | 2021-04-06 | Google Llc | Synthesizing speech from text using neural networks |
US10461710B1 (en) | 2018-08-28 | 2019-10-29 | Sonos, Inc. | Media playback system with maximum volume setting |
US11076035B2 (en) | 2018-08-28 | 2021-07-27 | Sonos, Inc. | Do not disturb feature for audio notifications |
US10587430B1 (en) | 2018-09-14 | 2020-03-10 | Sonos, Inc. | Networked devices, systems, and methods for associating playback devices based on sound codes |
US11024331B2 (en) | 2018-09-21 | 2021-06-01 | Sonos, Inc. | Voice detection optimization using sound metadata |
US10811015B2 (en) | 2018-09-25 | 2020-10-20 | Sonos, Inc. | Voice detection optimization based on selected voice assistant service |
CN112789625A (zh) | 2018-09-27 | 2021-05-11 | 渊慧科技有限公司 | Committed information rate variational autoencoders |
US11100923B2 (en) | 2018-09-28 | 2021-08-24 | Sonos, Inc. | Systems and methods for selective wake word detection using neural network models |
US11899519B2 (en) | 2018-10-23 | 2024-02-13 | Sonos, Inc. | Multiple stage network microphone device with reduced power consumption and processing load |
US10977872B2 (en) | 2018-10-31 | 2021-04-13 | Sony Interactive Entertainment Inc. | Graphical style modification for video games using machine learning |
US11636673B2 (en) | 2018-10-31 | 2023-04-25 | Sony Interactive Entertainment Inc. | Scene annotation using machine learning |
US11375293B2 (en) | 2018-10-31 | 2022-06-28 | Sony Interactive Entertainment Inc. | Textual annotation of acoustic effects |
US10854109B2 (en) | 2018-10-31 | 2020-12-01 | Sony Interactive Entertainment Inc. | Color accommodation for on-demand accessibility |
EP3654249A1 (en) * | 2018-11-15 | 2020-05-20 | Snips | Dilated convolutions and gating for efficient keyword spotting |
US11024321B2 (en) | 2018-11-30 | 2021-06-01 | Google Llc | Speech coding using auto-regressive generative neural networks |
US11183183B2 (en) | 2018-12-07 | 2021-11-23 | Sonos, Inc. | Systems and methods of operating media playback systems having multiple voice assistant services |
US11132989B2 (en) | 2018-12-13 | 2021-09-28 | Sonos, Inc. | Networked microphone devices, systems, and methods of localized arbitration |
CN109771944B (zh) * | 2018-12-19 | 2022-07-12 | 武汉西山艺创文化有限公司 | Game sound effect generation method, apparatus, device, and storage medium |
US10602268B1 (en) | 2018-12-20 | 2020-03-24 | Sonos, Inc. | Optimization of network microphone devices using noise classification |
JP7192882B2 (ja) * | 2018-12-26 | 2022-12-20 | 日本電信電話株式会社 | Speech rhythm conversion device, model learning device, methods therefor, and program |
US11315556B2 (en) | 2019-02-08 | 2022-04-26 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing by transmitting sound data associated with a wake word to an appropriate device for identification |
US10867604B2 (en) | 2019-02-08 | 2020-12-15 | Sonos, Inc. | Devices, systems, and methods for distributed voice processing |
US11587552B2 (en) | 2019-04-30 | 2023-02-21 | Sutherland Global Services Inc. | Real time key conversational metrics prediction and notability |
US11120794B2 (en) | 2019-05-03 | 2021-09-14 | Sonos, Inc. | Voice assistant persistence across multiple network microphone devices |
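CN110136731B (zh) * | 2019-05-13 | 2021-12-24 | 天津大学 | End-to-end blind enhancement method for bone-conducted speech using a dilated causal convolution generative adversarial network |
CN113874934A (zh) * | 2019-05-23 | 2021-12-31 | 谷歌有限责任公司 | Variational embedding capacity in expressive end-to-end speech synthesis |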
US11200894B2 (en) | 2019-06-12 | 2021-12-14 | Sonos, Inc. | Network microphone device with command keyword eventing |
US10586540B1 (en) | 2019-06-12 | 2020-03-10 | Sonos, Inc. | Network microphone device with command keyword conditioning |
US11361756B2 (en) | 2019-06-12 | 2022-06-14 | Sonos, Inc. | Conditional wake word eventing based on environment |
US10871943B1 (en) | 2019-07-31 | 2020-12-22 | Sonos, Inc. | Noise classification for event detection |
US11138969B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
US11138975B2 (en) | 2019-07-31 | 2021-10-05 | Sonos, Inc. | Locally distributed keyword detection |
CN110728991B (zh) * | 2019-09-06 | 2022-03-01 | 南京工程学院 | An improved recording device identification algorithm |
WO2021075994A1 (en) | 2019-10-16 | 2021-04-22 | Saudi Arabian Oil Company | Determination of elastic properties of a geological formation using machine learning applied to data acquired while drilling |
US11189286B2 (en) | 2019-10-22 | 2021-11-30 | Sonos, Inc. | VAS toggle based on device orientation |
KR20210048310A (ko) | 2019-10-23 | 2021-05-03 | 삼성전자주식회사 | Electronic device and control method thereof |
KR102556096B1 (ko) * | 2019-11-29 | 2023-07-18 | 한국전자통신연구원 | Apparatus and method for encoding/decoding an audio signal using information of a previous frame |
US11200900B2 (en) | 2019-12-20 | 2021-12-14 | Sonos, Inc. | Offline voice control |
US11373095B2 (en) * | 2019-12-23 | 2022-06-28 | Jens C. Jenkins | Machine learning multiple features of depicted item |
US11562740B2 (en) | 2020-01-07 | 2023-01-24 | Sonos, Inc. | Voice verification for media playback |
US11308958B2 (en) | 2020-02-07 | 2022-04-19 | Sonos, Inc. | Localized wakeword verification |
US20210312258A1 (en) * | 2020-04-01 | 2021-10-07 | Sony Corporation | Computing temporal convolution networks in real time |
US20210350788A1 (en) * | 2020-05-06 | 2021-11-11 | Samsung Electronics Co., Ltd. | Electronic device for generating speech signal corresponding to at least one text and operating method of the electronic device |
US11308962B2 (en) | 2020-05-20 | 2022-04-19 | Sonos, Inc. | Input detection windowing |
US11482224B2 (en) | 2020-05-20 | 2022-10-25 | Sonos, Inc. | Command keywords with input detection windowing |
US11727919B2 (en) | 2020-05-20 | 2023-08-15 | Sonos, Inc. | Memory allocation for keyword spotting engines |
EP3719711A3 (en) | 2020-07-30 | 2021-03-03 | Institutul Roman De Stiinta Si Tehnologie | Method of detecting anomalous data, machine computing unit, computer program |
US11698771B2 (en) | 2020-08-25 | 2023-07-11 | Sonos, Inc. | Vocal guidance engines for playback devices |
US11984123B2 (en) | 2020-11-12 | 2024-05-14 | Sonos, Inc. | Network device interaction by range |
US11796714B2 (en) | 2020-12-10 | 2023-10-24 | Saudi Arabian Oil Company | Determination of mechanical properties of a geological formation using deep learning applied to data acquired while drilling |
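CN113724683B (zh) * | 2021-07-23 | 2024-03-22 | 阿里巴巴达摩院(杭州)科技有限公司 | Audio generation method, computer device, and computer-readable storage medium |
WO2023177145A1 (ko) * | 2022-03-16 | 2023-09-21 | 삼성전자주식회사 | Electronic device and method for controlling electronic device |
WO2023219292A1 (ko) * | 2022-05-09 | 2023-11-16 | 삼성전자 주식회사 | Audio processing method and apparatus for scene classification |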
EP4293662A1 (en) * | 2022-06-17 | 2023-12-20 | Samsung Electronics Co., Ltd. | Method and system for personalising machine learning models |
Family Cites Families (72)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US2810457A (en) * | 1953-04-10 | 1957-10-22 | Gen Motors Corp | Lubricator |
JPH0450121Y2 (pt) | 1986-04-30 | 1992-11-26 | ||
JP2522400B2 (ja) * | 1989-08-10 | 1996-08-07 | ヤマハ株式会社 | Musical tone waveform generation method |
US5377302A (en) | 1992-09-01 | 1994-12-27 | Monowave Corporation L.P. | System for recognizing speech |
AU675389B2 (en) * | 1994-04-28 | 1997-01-30 | Motorola, Inc. | A method and apparatus for converting text into audible signals using a neural network |
JP3270668B2 (ja) * | 1995-10-31 | 2002-04-02 | ナショナル サイエンス カウンシル | Prosody synthesis device based on an artificial neural network for text-to-speech |
US6357176B2 (en) * | 1997-03-19 | 2002-03-19 | Mississippi State University | Soilless sod |
JPH10333699A (ja) * | 1997-06-05 | 1998-12-18 | Fujitsu Ltd | Speech recognition and speech synthesis device |
US5913194A (en) * | 1997-07-14 | 1999-06-15 | Motorola, Inc. | Method, device and system for using statistical information to reduce computation and memory requirements of a neural network based speech synthesis system |
JPH11282484A (ja) * | 1998-03-27 | 1999-10-15 | Victor Co Of Japan Ltd | Speech synthesis device |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
DE10018134A1 (de) * | 2000-04-12 | 2001-10-18 | Siemens Ag | Method and device for determining prosodic markers |
JP2002123280A (ja) * | 2000-10-16 | 2002-04-26 | Seiko Epson Corp | Speech synthesis method, speech synthesis device, and recording medium storing a speech synthesis processing program |
US7062437B2 (en) * | 2001-02-13 | 2006-06-13 | International Business Machines Corporation | Audio renderings for expressing non-audio nuances |
US20060064177A1 (en) | 2004-09-17 | 2006-03-23 | Nokia Corporation | System and method for measuring confusion among words in an adaptive speech recognition system |
US7747070B2 (en) * | 2005-08-31 | 2010-06-29 | Microsoft Corporation | Training convolutional neural networks on graphics processing units |
KR100832556B1 (ko) * | 2006-09-22 | 2008-05-26 | (주)한국파워보이스 | Speech recognition method for a robust far-field speech recognition system |
US8504361B2 (en) * | 2008-02-07 | 2013-08-06 | Nec Laboratories America, Inc. | Deep neural networks and methods for using same |
CA2724753A1 (en) * | 2008-05-30 | 2009-12-03 | Nokia Corporation | Method, apparatus and computer program product for providing improved speech synthesis |
FR2950713A1 (fr) | 2009-09-29 | 2011-04-01 | Movea Sa | System and method for gesture recognition |
TWI413104B (zh) * | 2010-12-22 | 2013-10-21 | Ind Tech Res Inst | Controllable prosody re-estimation system and method, and computer program product |
CN102651217A (zh) * | 2011-02-25 | 2012-08-29 | 株式会社东芝 | Method and device for synthesizing speech, and acoustic model training method for speech synthesis |
EP2565667A1 (en) | 2011-08-31 | 2013-03-06 | Friedrich-Alexander-Universität Erlangen-Nürnberg | Direction of arrival estimation using watermarked audio signals and microphone arrays |
US8527276B1 (en) * | 2012-10-25 | 2013-09-03 | Google Inc. | Speech synthesis using deep neural networks |
US9230550B2 (en) * | 2013-01-10 | 2016-01-05 | Sensory, Incorporated | Speaker verification and identification using artificial neural network-based sub-phonetic unit discrimination |
US9147154B2 (en) | 2013-03-13 | 2015-09-29 | Google Inc. | Classifying resources using a deep network |
US9141906B2 (en) * | 2013-03-13 | 2015-09-22 | Google Inc. | Scoring concept terms using a deep network |
CA2810457C (en) * | 2013-03-25 | 2018-11-20 | Gerald Bradley PENN | System and method for applying a convolutional neural network to speech recognition |
US9190053B2 (en) | 2013-03-25 | 2015-11-17 | The Governing Council Of The Univeristy Of Toronto | System and method for applying a convolutional neural network to speech recognition |
US20150032449A1 (en) * | 2013-07-26 | 2015-01-29 | Nuance Communications, Inc. | Method and Apparatus for Using Convolutional Neural Networks in Speech Recognition |
CN104681034A (zh) * | 2013-11-27 | 2015-06-03 | 杜比实验室特许公司 | Audio signal processing |
US9953634B1 (en) | 2013-12-17 | 2018-04-24 | Knowles Electronics, Llc | Passive training for automatic speech recognition |
US10275704B2 (en) | 2014-06-06 | 2019-04-30 | Google Llc | Generating representations of input sequences using neural networks |
US10181098B2 (en) | 2014-06-06 | 2019-01-15 | Google Llc | Generating representations of input sequences using neural networks |
KR102332729B1 (ko) | 2014-07-28 | 2021-11-30 | 삼성전자주식회사 | Speech recognition method and apparatus based on pronunciation similarity, and method and apparatus for generating a speech recognition engine |
US9821340B2 (en) * | 2014-07-28 | 2017-11-21 | Kolo Medical Ltd. | High displacement ultrasonic transducer |
US20160035344A1 (en) * | 2014-08-04 | 2016-02-04 | Google Inc. | Identifying the language of a spoken utterance |
CN110110843B (zh) | 2014-08-29 | 2020-09-25 | 谷歌有限责任公司 | Method and system for processing images |
JP6814146B2 (ja) * | 2014-09-25 | 2021-01-13 | サンハウス・テクノロジーズ・インコーポレーテッド | System and method for capturing and interpreting audio |
US10783900B2 (en) * | 2014-10-03 | 2020-09-22 | Google Llc | Convolutional, long short-term memory, fully connected deep neural networks |
US9824684B2 (en) | 2014-11-13 | 2017-11-21 | Microsoft Technology Licensing, Llc | Prediction-based sequence recognition |
US9542927B2 (en) * | 2014-11-13 | 2017-01-10 | Google Inc. | Method and system for building text-to-speech voice from diverse recordings |
US9607217B2 (en) * | 2014-12-22 | 2017-03-28 | Yahoo! Inc. | Generating preference indices for image content |
US11080587B2 (en) * | 2015-02-06 | 2021-08-03 | Deepmind Technologies Limited | Recurrent neural networks for data item generation |
US10403269B2 (en) * | 2015-03-27 | 2019-09-03 | Google Llc | Processing audio waveforms |
US20160343366A1 (en) * | 2015-05-19 | 2016-11-24 | Google Inc. | Speech synthesis model selection |
US9595002B2 (en) | 2015-05-29 | 2017-03-14 | Sas Institute Inc. | Normalizing electronic communications using a vector having a repeating substring as input for a neural network |
CN105096939B (zh) * | 2015-07-08 | 2017-07-25 | 百度在线网络技术(北京)有限公司 | Voice wake-up method and apparatus |
US9786270B2 (en) | 2015-07-09 | 2017-10-10 | Google Inc. | Generating acoustic models |
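CN106375231B (zh) * | 2015-07-22 | 2019-11-05 | 华为技术有限公司 | Traffic switching method, device, and system |
KR102413692B1 (ko) | 2015-07-24 | 2022-06-27 | 삼성전자주식회사 | Apparatus and method for calculating acoustic scores for speech recognition, speech recognition apparatus and method, and electronic device |
CN105068998B (zh) | 2015-07-29 | 2017-12-15 | 百度在线网络技术(北京)有限公司 | Translation method and apparatus based on a neural network model |
CN105321525B (zh) * | 2015-09-30 | 2019-02-22 | 北京邮电大学 | System and method for reducing VoIP communication resource overhead |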
US10733979B2 (en) | 2015-10-09 | 2020-08-04 | Google Llc | Latency constraints for acoustic modeling |
US10395118B2 (en) | 2015-10-29 | 2019-08-27 | Baidu Usa Llc | Systems and methods for video paragraph captioning using hierarchical recurrent neural networks |
WO2017083695A1 (en) * | 2015-11-12 | 2017-05-18 | Google Inc. | Generating target sequences from input sequences using partial conditioning |
US10319374B2 (en) | 2015-11-25 | 2019-06-11 | Baidu USA, LLC | Deployed end-to-end speech recognition |
CN105513591B (zh) * | 2015-12-21 | 2019-09-03 | 百度在线网络技术(北京)有限公司 | Method and apparatus for speech recognition using an LSTM recurrent neural network model |
US10402700B2 (en) | 2016-01-25 | 2019-09-03 | Deepmind Technologies Limited | Generating images using neural networks |
CN108780519B (zh) * | 2016-03-11 | 2022-09-02 | 奇跃公司 | Structure learning in convolutional neural networks |
US10460747B2 (en) | 2016-05-10 | 2019-10-29 | Google Llc | Frequency based audio analysis using neural networks |
US9972314B2 (en) | 2016-06-01 | 2018-05-15 | Microsoft Technology Licensing, Llc | No loss-optimization for weighted transducer |
US11373672B2 (en) | 2016-06-14 | 2022-06-28 | The Trustees Of Columbia University In The City Of New York | Systems and methods for speech separation and neural decoding of attentional selection in multi-speaker environments |
US9984683B2 (en) | 2016-07-22 | 2018-05-29 | Google Llc | Automatic speech recognition using multi-dimensional models |
WO2018048945A1 (en) | 2016-09-06 | 2018-03-15 | Deepmind Technologies Limited | Processing sequences using convolutional neural networks |
EP3497629B1 (en) * | 2016-09-06 | 2020-11-04 | Deepmind Technologies Limited | Generating audio using neural networks |
US11080591B2 (en) | 2016-09-06 | 2021-08-03 | Deepmind Technologies Limited | Processing sequences using convolutional neural networks |
CN110023963B (zh) | 2016-10-26 | 2023-05-30 | 渊慧科技有限公司 | Processing text sequences using neural networks |
US10049106B2 (en) | 2017-01-18 | 2018-08-14 | Xerox Corporation | Natural language generation through character-based recurrent neural networks with finite-state prior knowledge |
US11934935B2 (en) | 2017-05-20 | 2024-03-19 | Deepmind Technologies Limited | Feedforward generative neural networks |
US10726858B2 (en) | 2018-06-22 | 2020-07-28 | Intel Corporation | Neural network for speech denoising trained with deep feature losses |
US10971170B2 (en) | 2018-08-08 | 2021-04-06 | Google Llc | Synthesizing speech from text using neural networks |
-
2017
- 2017-09-06 EP EP17780543.9A patent/EP3497629B1/en active Active
- 2017-09-06 CA CA3155320A patent/CA3155320A1/en active Pending
- 2017-09-06 CA CA3036067A patent/CA3036067C/en active Active
- 2017-09-06 WO PCT/US2017/050320 patent/WO2018048934A1/en unknown
- 2017-09-06 BR BR112019004524-4A patent/BR112019004524B1/pt active IP Right Grant
- 2017-09-06 AU AU2017324937A patent/AU2017324937B2/en active Active
- 2017-09-06 JP JP2019522236A patent/JP6577159B1/ja active Active
- 2017-09-06 KR KR1020197009838A patent/KR102353284B1/ko active IP Right Grant
- 2017-09-06 EP EP20195353.6A patent/EP3822863B1/en active Active
- 2017-09-06 CN CN202011082855.5A patent/CN112289342B/zh active Active
- 2017-09-06 CN CN201780065523.6A patent/CN109891434B/zh active Active
-
2018
- 2018-07-09 US US16/030,742 patent/US10304477B2/en active Active
-
2019
- 2019-04-22 US US16/390,549 patent/US10803884B2/en active Active
- 2019-08-20 JP JP2019150456A patent/JP6891236B2/ja active Active
-
2020
- 2020-09-14 US US17/020,348 patent/US11386914B2/en active Active
-
2021
- 2021-05-25 JP JP2021087708A patent/JP7213913B2/ja active Active
-
2022
- 2022-06-13 US US17/838,985 patent/US11869530B2/en active Active
-
2023
- 2023-11-27 US US18/519,986 patent/US20240135955A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CN109891434B (zh) | 2020-10-30 |
EP3497629A1 (en) | 2019-06-19 |
CN109891434A (zh) | 2019-06-14 |
US11386914B2 (en) | 2022-07-12 |
JP2021152664A (ja) | 2021-09-30 |
KR20190042730A (ko) | 2019-04-24 |
US20220319533A1 (en) | 2022-10-06 |
US20240135955A1 (en) | 2024-04-25 |
JP6891236B2 (ja) | 2021-06-18 |
JP7213913B2 (ja) | 2023-01-27 |
EP3822863A1 (en) | 2021-05-19 |
AU2017324937A1 (en) | 2019-03-28 |
AU2017324937B2 (en) | 2019-12-19 |
BR112019004524B1 (pt) | 2023-11-07 |
EP3822863B1 (en) | 2022-11-02 |
CA3036067C (en) | 2023-08-01 |
KR102353284B1 (ko) | 2022-01-19 |
CN112289342B (zh) | 2024-03-19 |
US20180322891A1 (en) | 2018-11-08 |
JP2020003809A (ja) | 2020-01-09 |
WO2018048934A1 (en) | 2018-03-15 |
CA3036067A1 (en) | 2018-03-15 |
US20190251987A1 (en) | 2019-08-15 |
US10304477B2 (en) | 2019-05-28 |
JP6577159B1 (ja) | 2019-09-18 |
US11869530B2 (en) | 2024-01-09 |
US20200411032A1 (en) | 2020-12-31 |
US10803884B2 (en) | 2020-10-13 |
CA3155320A1 (en) | 2018-03-15 |
EP3497629B1 (en) | 2020-11-04 |
CN112289342A (zh) | 2021-01-29 |
JP2019532349A (ja) | 2019-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
BR112019004524A2 (pt) | Generating audio using neural networks | |
SG10201707702YA (en) | Collaborative Voice Controlled Devices | |
BR112019000541A2 (pt) | Superpixel methods for convolutional neural networks | |
AR125774A2 (es) | Audio data processor for audio decoders and/or renderers and method for processing audio data | |
BR112018006194A2 (pt) | Automated music composition and generation system, automated music composition and generation process, automated music composition and generation, toy musical instrument, toy instrument for music composition and video accompaniment, automated toy instrument system for music composition and generation, electronic information processing and display system, enterprise-level internet-based music composition and generation system, system network for generating and delivering automatically composed digital music, autonomous artificial-intelligence-based music composition and performance system for use in a musical environment, autonomous artificial-intelligence-based music composition, generation and performance process, autonomous instrument analysis system, network for configuring an automated music composition and generation engine, method of managing operational parameters of the music theory system, method of composing and generating digital music in an automated manner, parameter transformation engine subsystem, method of composing and generating music, lyric expression processing method, and method of processing lyric expression input | |
BR112017026743A2 (pt) | Decoding apparatus and method, program, and encoding apparatus and method | |
BR112019004798A8 (pt) | Computer-implemented method and storage medium | |
BR112019024578A2 (pt) | Automated response server device, terminal device, response system, response method, and program | |
MX2017015844A (es) | System and method for generating an adaptive user interface in a website building system. | |
WO2018038385A3 (ko) | Voice recognition method and electronic device for performing same | |
BR112018001230A2 (pt) | Transfer learning in neural networks | |
BR112018071019A2 (pt) | Apparatus and method for providing individual sound zones | |
MX2019013620A (es) | Information processing device and information processing method. | |
MY201297A (en) | Audio generation method, server, and storage medium | |
MX2016011079A (es) | Autonomous driving certification generalizer. | |
AR050394A1 (es) | Methods and apparatus for providing application credentials | |
GB2545070A (en) | Generating molecular encoding information for data storage | |
BR112016015028A2 (pt) | Apparatus and method for generating a plurality of audio channels | |
MX361806B (es) | Methods and systems for recommending communication settings. | |
BR112013028676A2 (pt) | Method for composing configuration changes in a network element | |
BR112023020614A2 (pt) | Processing multimodal inputs using language models | |
ATE542218T1 (de) | Audio information processing apparatus, audio information processing method, and associated computer program | |
AU2014410705A1 (en) | Data processing method and apparatus | |
BR112014022060A2 (pt) | Method, and computing system | |
BR112018074368A2 (pt) | Method and apparatus for converting color data into musical notes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
B350 | Update of information on the portal [chapter 15.35 patent gazette] | ||
B06W | Patent application suspended after preliminary examination (for patents with searches from other patent authorities) [chapter 6.23 patent gazette] | ||
B15K | Others concerning applications: alteration of classification | Free format text: THE PREVIOUS CLASSIFICATION WAS: G06N 3/04 Ipc: G06N 3/045 (2023.01), G06N 3/048 (2023.01), G10L 1 |
B09A | Decision: intention to grant [chapter 9.1 patent gazette] | ||
B16A | Patent or certificate of addition of invention granted [chapter 16.1 patent gazette] | Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 2017-09-06, SUBJECT TO THE LEGAL CONDITIONS |