CN114121033B - Train broadcast speech enhancement method and system based on deep learning - Google Patents
- Publication number
- CN114121033B (application CN202210099789.5A)
- Authority
- CN
- China
- Prior art keywords
- train
- scene
- audio
- information
- deep learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Quality & Reliability (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210099789.5A (CN114121033B, zh) | 2022-01-27 | 2022-01-27 | Train broadcast speech enhancement method and system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210099789.5A (CN114121033B, zh) | 2022-01-27 | 2022-01-27 | Train broadcast speech enhancement method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114121033A (zh) | 2022-03-01 |
CN114121033B (zh) | 2022-04-26 |
Family
ID=80361698
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210099789.5A (Active; CN114121033B, zh) | Train broadcast speech enhancement method and system based on deep learning | 2022-01-27 | 2022-01-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114121033B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114420132A (zh) * | 2022-03-28 | 2022-04-29 | 天津市北海通信技术有限公司 | Train voice broadcast content verification method, system and storage medium |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
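CN203491984U (zh) * | 2013-08-30 | 2014-03-19 | 深圳市诺威达科技有限公司 | Automatic gain processing system |
CN103617797A (zh) * | 2013-12-09 | 2014-03-05 | 腾讯科技(深圳)有限公司 | Speech processing method and device |
CN106486127A (zh) * | 2015-08-25 | 2017-03-08 | 中兴通讯股份有限公司 | Method and device for automatically adjusting speech recognition parameters, and mobile terminal |
CN105787005B (zh) * | 2016-02-22 | 2019-09-20 | 腾讯科技(深圳)有限公司 | Information processing method and mobile terminal |
CN106952650B (zh) * | 2017-02-28 | 2019-10-11 | 大连理工大学 | Train voice amplification unit based on ARM+FPGA architecture |
KR20180130672A (ko) * | 2017-05-30 | 2018-12-10 | 현대자동차주식회사 | Situation-based dialogue initiation apparatus, system, vehicle and method |
CN110049403A (zh) * | 2018-01-17 | 2019-07-23 | 北京小鸟听听科技有限公司 | Adaptive audio control device and method based on scene recognition |
CN108621930B (zh) * | 2018-04-23 | 2022-02-18 | 上海迪彼电子科技有限公司 | Method and system for actively controlled sound enhancement in automobiles |
CN113129917A (zh) * | 2020-01-15 | 2021-07-16 | 荣耀终端有限公司 | Speech processing method based on scene recognition, and device, medium and system thereof |
CN111464913A (zh) * | 2020-05-11 | 2020-07-28 | 广州橙行智动汽车科技有限公司 | Vehicle audio playback control method and device, vehicle, and readable storage medium |
CN112216300A (zh) * | 2020-09-25 | 2021-01-12 | 三一专用汽车有限责任公司 | Noise reduction method and device for sound in a mixer truck cab, and mixer truck |
CN112700672A (zh) * | 2020-12-21 | 2021-04-23 | 深圳供电局有限公司 | Intelligent voice broadcast system and method |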
- 2022-01-27: CN application CN202210099789.5A, patent CN114121033B (zh), status Active
Also Published As
Publication number | Publication date |
---|---|
CN114121033A (zh) | 2022-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
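CN108877823B (zh) | | Speech enhancement method and device |
CN110600059B (zh) | | Acoustic event detection method and device, electronic equipment, and storage medium |
CN109036460B (zh) | | Speech processing method and device based on multi-model neural network |
CN113205803B (zh) | | Speech recognition method and device with adaptive noise reduction capability |
CN111540342B (zh) | | Energy threshold adjustment method, device, equipment and medium |
CN114121033B (zh) | | Train broadcast speech enhancement method and system based on deep learning |
CN117095694B (zh) | | Bird call recognition method based on label hierarchy attribute relationships |
CN110600054A (zh) | | Acoustic scene classification method based on network model fusion |
CN114338623B (zh) | | Audio processing method, device, equipment and medium |
CN113793624B (zh) | | Acoustic scene classification method |
CN113593601A (zh) | | Audio-visual multimodal speech separation method based on deep learning |
Illium et al. | | Surgical mask detection with convolutional neural networks and data augmentations on spectrograms |
CN114822578A (zh) | | Speech noise reduction method, device, equipment and storage medium |
CN114189781A (zh) | | Noise reduction method and system for dual-microphone neural network noise-cancelling earphones |
CN113823303A (zh) | | Audio noise reduction method and device, and computer-readable storage medium |
CN114550740B (zh) | | Speech intelligibility algorithm under noise, and train audio playback method and system using it |
CN116229987B (zh) | | Campus speech recognition method, device and storage medium |
CN116959467A (zh) | | Communication enhancement method, system and storage medium incorporating noise scenes |
TWI779261B (zh) | | Wind noise filtering device |
CN111061909B (zh) | | Accompaniment classification method and device |
KR20220053498A (ko) | | Apparatus for processing an audio signal containing multiple signal components using a machine learning model |
CN113380244A (zh) | | Intelligent adjustment method and system for device playback volume |
CN114333767A (zh) | | Speaker voice extraction method and device, storage medium, and electronic equipment |
Vilouras | | Acoustic scene classification using fully convolutional neural networks and per-channel energy normalization |
CN117524252B (zh) | | Lightweight acoustic scene perception method based on the drunkard model |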
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP03 | Change of name, title or address | Address after: 518000, 202 West Side, Assembly Plant 2, Shapu Community Rail Transit Equipment Science and Technology Park, Songgang Street, Bao'an District, Shenzhen, Guangdong Province. Patentee after: Beihai Communication (Shenzhen) Group Co.,Ltd. Country or region after: China. Address before: 518000 Room 403, building B, new retail Digital Industrial Park, Nanchang community, Xixiang street, Bao'an District, Shenzhen, Guangdong. Patentee before: SHENZHEN BEIHAI RAIL TRANSIT TECHNOLOGY CO.,LTD. Country or region before: China |