CN111640442A - Method for processing audio packet loss, method for training a neural network, and respective apparatuses - Google Patents
Method for processing audio packet loss, method for training a neural network, and respective apparatuses
- Publication number
- CN111640442A (application number CN202010486267.1A)
- Authority
- CN
- China
- Prior art keywords
- information
- neural network
- audio
- packet loss
- phase
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Concepts
ID | Concept | Category | Sections | Count |
---|---|---|---|---|
238000013528 | artificial neural network | Methods | title, claims, abstract, description | 229 |
238000000034 | method | Methods | title, claims, abstract, description | 71 |
238000012549 | training | Methods | title, claims, abstract, description | 40 |
238000012545 | processing | Methods | title, abstract, description | 30 |
230000003993 | interaction | Effects | claims, abstract, description | 52 |
238000013527 | convolutional neural network | Methods | claims, description | 33 |
230000009466 | transformation | Effects | claims, description | 21 |
238000002156 | mixing | Methods | claims, description | 16 |
238000012937 | correction | Methods | claims, description | 14 |
238000000605 | extraction | Methods | claims, description | 9 |
238000005070 | sampling | Methods | claims, description | 8 |
230000004913 | activation | Effects | claims, description | 7 |
230000004048 | modification | Effects | claims, description | 7 |
238000012986 | modification | Methods | claims, description | 7 |
238000006073 | displacement reaction | Methods | claims, description | 4 |
238000010606 | normalization | Methods | claims, description | 3 |
238000010586 | diagram | Methods | description | 16 |
238000004422 | calculation algorithm | Methods | description | 6 |
238000001228 | spectrum | Methods | description | 6 |
230000005540 | biological transmission | Effects | description | 5 |
238000006243 | chemical reaction | Methods | description | 5 |
238000004891 | communication | Methods | description | 3 |
230000014509 | gene expression | Effects | description | 3 |
230000001364 | causal effect | Effects | description | 2 |
238000004590 | computer program | Methods | description | 2 |
239000000284 | extract | Substances | description | 2 |
238000005562 | fading | Methods | description | 2 |
230000009471 | action | Effects | description | 1 |
238000004364 | calculation method | Methods | description | 1 |
230000015556 | catabolic process | Effects | description | 1 |
230000001413 | cellular effect | Effects | description | 1 |
230000007547 | defect | Effects | description | 1 |
238000006731 | degradation reaction | Methods | description | 1 |
238000005516 | engineering process | Methods | description | 1 |
230000006870 | function | Effects | description | 1 |
230000008676 | import | Effects | description | 1 |
230000002452 | interceptive effect | Effects | description | 1 |
230000003287 | optical effect | Effects | description | 1 |
230000008569 | process | Effects | description | 1 |
230000008439 | repair process | Effects | description | 1 |
230000004044 | response | Effects | description | 1 |
239000004984 | smart glass | Substances | description | 1 |
230000005236 | sound signal | Effects | description | 1 |
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/005—Correction of errors induced by the transmission channel, if related to the coding algorithm
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Claims (17)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010486267.1A CN111640442B (zh) | 2020-06-01 | 2020-06-01 | Method for processing audio packet loss, method for training a neural network, and respective apparatuses |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010486267.1A CN111640442B (zh) | 2020-06-01 | 2020-06-01 | Method for processing audio packet loss, method for training a neural network, and respective apparatuses |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111640442A (zh) | 2020-09-08 |
CN111640442B (zh) | 2023-05-23 |
Family
ID=72332340
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010486267.1A Active CN111640442B (zh) | Method for processing audio packet loss, method for training a neural network, and respective apparatuses | 2020-06-01 | 2020-06-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111640442B (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112634912A (zh) * | 2020-12-18 | 2021-04-09 | 北京猿力未来科技有限公司 | Packet loss compensation method and apparatus |
CN113035211A (zh) * | 2021-03-11 | 2021-06-25 | 马上消费金融股份有限公司 | Audio compression method, audio decompression method, and apparatus |
JP7490894B2 (ja) | 2020-10-15 | 2024-05-27 | ドルビー・インターナショナル・アーベー | Real-time packet loss concealment using deep generative networks |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014012391A1 (zh) * | 2012-07-18 | 2014-01-23 | 华为技术有限公司 | Method and apparatus for compensating for voice data packet loss |
CN104347076A (zh) * | 2013-08-09 | 2015-02-11 | 中国电信股份有限公司 | Network audio packet loss concealment method and apparatus |
US20150248893A1 (en) * | 2014-02-28 | 2015-09-03 | Google Inc. | Sinusoidal interpolation across missing data |
CN108171222A (zh) * | 2018-02-11 | 2018-06-15 | 清华大学 | Real-time video classification method and apparatus based on a multi-stream neural network |
US10127918B1 (en) * | 2017-05-03 | 2018-11-13 | Amazon Technologies, Inc. | Methods for reconstructing an audio signal |
CN109117777A (zh) * | 2018-08-03 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating information |
CN109218083A (zh) * | 2018-08-27 | 2019-01-15 | 广州爱拍网络科技有限公司 | Voice data transmission method and apparatus |
CN110392273A (zh) * | 2019-07-16 | 2019-10-29 | 北京达佳互联信息技术有限公司 | Audio and video processing method and apparatus, electronic device, and storage medium |
CN110491407A (zh) * | 2019-08-15 | 2019-11-22 | 广州华多网络科技有限公司 | Speech noise reduction method and apparatus, electronic device, and storage medium |
2020
- 2020-06-01: CN application CN202010486267.1A, patent CN111640442B (zh), status Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014012391A1 (zh) * | 2012-07-18 | 2014-01-23 | 华为技术有限公司 | Method and apparatus for compensating for voice data packet loss |
US20150131429A1 (en) * | 2012-07-18 | 2015-05-14 | Huawei Technologies Co., Ltd. | Method and apparatus for compensating for voice packet loss |
CN104347076A (zh) * | 2013-08-09 | 2015-02-11 | 中国电信股份有限公司 | Network audio packet loss concealment method and apparatus |
US20150248893A1 (en) * | 2014-02-28 | 2015-09-03 | Google Inc. | Sinusoidal interpolation across missing data |
US10127918B1 (en) * | 2017-05-03 | 2018-11-13 | Amazon Technologies, Inc. | Methods for reconstructing an audio signal |
CN108171222A (zh) * | 2018-02-11 | 2018-06-15 | 清华大学 | Real-time video classification method and apparatus based on a multi-stream neural network |
CN109117777A (zh) * | 2018-08-03 | 2019-01-01 | 百度在线网络技术(北京)有限公司 | Method and apparatus for generating information |
CN109218083A (zh) * | 2018-08-27 | 2019-01-15 | 广州爱拍网络科技有限公司 | Voice data transmission method and apparatus |
CN110392273A (zh) * | 2019-07-16 | 2019-10-29 | 北京达佳互联信息技术有限公司 | Audio and video processing method and apparatus, electronic device, and storage medium |
CN110491407A (zh) * | 2019-08-15 | 2019-11-22 | 广州华多网络科技有限公司 | Speech noise reduction method and apparatus, electronic device, and storage medium |
Non-Patent Citations (3)
Title |
---|
Reza Lotfidereshgi et al.: "Speech Prediction Using an Adaptive Recurrent Neural Network with Application to Packet Loss Concealment" * |
吴玉峰 et al.: "QoS-to-QoE mapping model based on a BP neural network" * |
李璐君 et al.: "A speech enhancement method based on a combined deep model" * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7490894B2 (ja) | 2020-10-15 | 2024-05-27 | ドルビー・インターナショナル・アーベー | Real-time packet loss concealment using deep generative networks |
CN112634912A (zh) * | 2020-12-18 | 2021-04-09 | 北京猿力未来科技有限公司 | Packet loss compensation method and apparatus |
CN112634912B (zh) * | 2020-12-18 | 2024-04-09 | 北京猿力未来科技有限公司 | Packet loss compensation method and apparatus |
CN113035211A (zh) * | 2021-03-11 | 2021-06-25 | 马上消费金融股份有限公司 | Audio compression method, audio decompression method, and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN111640442B (zh) | 2023-05-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
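CN111640442A (zh) | | Method for processing audio packet loss, method for training a neural network, and respective apparatuses |
CN111653285B (zh) | | Packet loss compensation method and apparatus |
US11100941B2 (en) | | Speech enhancement and noise suppression systems and methods |
US11514925B2 (en) | | Using a predictive model to automatically enhance audio having various audio quality issues |
WO2023056783A1 (zh) | | Audio processing method, related device, storage medium, and program product |
WO2020015270A1 (zh) | | Speech signal separation method and apparatus, computer device, and storage medium |
CN113035207B (zh) | | Audio processing method and apparatus |
CN107103909A (zh) | | Frame error concealment |
EP4172987A1 (en) | | Speech enhancement |
CN116208807A (zh) | | Video frame processing method and apparatus, and video frame denoising method and apparatus |
CN112289343A (zh) | | Audio repair method and apparatus, electronic device, and computer-readable storage medium |
US11887277B2 (en) | | Removing compression artifacts from digital images and videos utilizing generative machine-learning models |
CN116959476A (zh) | | Audio noise reduction processing method and apparatus, storage medium, and electronic device |
CN113096685B (zh) | | Audio processing method and apparatus |
CN115113855B (zh) | | Audio data processing method and apparatus, electronic device, storage medium, and product |
Aironi et al. | | A time-frequency generative adversarial based method for audio packet loss concealment |
CN115936980A (zh) | | Image processing method and apparatus, electronic device, and storage medium |
CN113990347A (zh) | | Signal processing method, computer device, and storage medium |
CN113571079A (zh) | | Speech enhancement method, apparatus, device, and storage medium |
Agarwal et al. | | Deep residual neural networks for image in audio steganography (Workshop Paper) |
CN116248229B (zh) | | Packet loss compensation method for real-time voice communication |
US20240161766A1 (en) | | Robustness/performance improvement for deep learning based speech enhancement against artifacts and distortion |
CN114548046B (zh) | | Text processing method and apparatus |
CN116129921A (zh) | | Vocoder training method, and audio synthesis method and apparatus |
CN117453887A (zh) | | Document-based question answering method and apparatus, storage medium, and computer device |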
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | ||
Inventor after: Wang Xiaohong, Chen Jialu, Liu Lupeng, Yuan Haiming, Gao Qiang, Xia Long, Guo Changzhen; Inventor before: Wang Xiaohong, Chen Jialu, Liu Lupeng, Yuan Haiming, Gao Qiang, Xia Long, Guo Changzhen |
GR01 | Patent grant | ||