WO2013020341A1 - Sound effect voice-changing method and device - Google Patents

Sound effect voice-changing method and device

Info

Publication number
WO2013020341A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
sound signal
sound effect
duration
signal
Prior art date
Application number
PCT/CN2011/084151
Other languages
English (en)
French (fr)
Inventor
赵伟峰
Original Assignee
深圳市万兴软件有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市万兴软件有限公司
Publication of WO2013020341A1

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/04 Circuits for transducers, loudspeakers or microphones for correcting frequency response

Definitions

  • the present invention relates to the field of digital sound effect technologies, and in particular to a sound effect voice-changing method and device.
  • the technical problem to be solved by the present invention is to provide a sound effect voice-changing method and device for realizing special sound effects such as chipmunk, ventriloquism and ghost voice.
  • an embodiment of the invention provides a sound effect voice-changing method, and the method includes:
  • the duration of the original sound signal is adjusted by the waveform similarity overlap-add (WSOLA) method, with an adjustment factor in the range [0.5, 2.0].
  • the duration-adjusted sound signal is scaled using a resampling algorithm such that the scaled sound signal is equal in duration to the original sound signal.
  • the method further includes:
  • the delay time t of the reverberation model is 200 ms.
  • an embodiment of the invention further provides a sound effect voice-changing device, and the device comprises:
  • a sound signal input module configured to receive an original sound signal input
  • a voice-changing module configured to perform voice-changing processing on the original sound signal received by the sound signal input module to obtain a sound signal with the desired sound effect
  • the voice-changing module comprises:
  • a duration adjustment unit configured to adjust the duration of the original sound signal by the waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0];
  • a resampling unit configured to scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal is equal in duration to the original sound signal;
  • an output unit configured to output the sound signal obtained through the voice-changing processing of the voice-changing module.
  • the sound signal obtained by the voice-changing module is the chipmunk sound effect.
  • the sound signal obtained by the voice-changing module is the ventriloquism sound effect.
  • the adjustment factor of the duration adjustment unit takes a value of 0.8
  • the voice-changing module further includes: a reverberation unit, configured to pass the sound signal scaled by the resampling unit through a reverberation model based on an IIR filter system.
  • N is the product of the sampling rate and the delay time t, where t is chosen from the echo delay range [100 ms, 300 ms], and the sound signal that has passed through the reverberation system is the ghost sound effect.
  • the delay time t of the reverberation model is 200 ms.
  • the implementation of the embodiments of the present invention has the following beneficial effects: by applying the waveform similarity overlap-add algorithm and the resampling algorithm to the input original sound signal to shift its pitch without changing its speed, various special sound effects required by users can be realized, such as chipmunk, ventriloquism and ghost sound effects.
  • FIG. 1 is a schematic structural diagram of a sound effect voice-changing device according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a sound effect voice-changing device according to another embodiment of the present invention
  • FIG. 3 is a schematic flowchart of a sound effect voice-changing method according to an embodiment of the present invention
  • FIG. 4 is a schematic flowchart of a sound effect voice-changing method according to another embodiment of the present invention.
  • FIG. 1 is a schematic structural diagram of a sound effect voice-changing device according to an embodiment of the present invention, and the device includes:
  • the sound signal input module 10 is configured to receive the original sound signal input
  • the voice-changing module 20 is configured to perform voice-changing processing on the original sound signal received by the sound signal input module 10 to obtain a sound signal with the desired sound effect, wherein the voice-changing module 20 includes:
  • the duration adjustment unit 21 is configured to adjust the duration of the original sound signal by the waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0]; the waveform similarity overlap-add method may specifically be:
  • according to twelve-tone equal temperament, the pitch-raising factor interval [1.0, 2.0] and the pitch-lowering factor interval [0.5, 1.0] are each divided into 12 parts; the factor for each pitch-raising part is $2^{1/12}$, and the factor for each pitch-lowering part is $2^{-1/12}$.
  • the analysis window frame shift must be chosen each time according to the characteristics of the signal, so that the short-time periodicity of the speech signal is not destroyed and clicking noises are avoided.
  • based on a waveform similarity criterion, the WSOLA algorithm keeps the two overlapped signals aligned peak to peak during overlap-add and thus preserves the short-time periodicity of the signal. Therefore the frame shift of the analysis window is not exactly Lb: taking Lb as the center point, the k samples before and after it are searched for the waveform most similar to the overlapping portion of the synthesis window, so the frame shift of the analysis window lies in the interval [Lb-k, Lb+k].
  • two similarity measures can be used: the normalized correlation coefficient and the AMDF (average magnitude difference function).
  • the resampling unit 22 is configured to scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal is equal in duration to the original sound signal. Specifically, after the WSOLA algorithm is applied, the duration of the resulting sound signal is r times that of the original sound signal, that is, a pitch-raising signal becomes longer and a pitch-lowering signal becomes shorter.
  • the resampling algorithm is used to scale the signal back to its original duration, achieving pitch shifting without changing speed. For ease of computation, La samples are taken from the duration-adjusted signal each time and converted to Lb samples, which restores exactly the original length.
  • by using different adjustment factors, the voice-changing module can transform the input sound signal into sound signals with different sound effects.
  • the output unit 30 is configured to output the sound signal obtained through the voice-changing processing of the voice-changing module.
  • the sound signal obtained by the voice-changing processing of the voice-changing module 20, that is, the sound signal output by the output unit 30, is the chipmunk sound effect.
  • the adjustment factor of the duration adjustment unit 21 is 0.6
  • the sound signal obtained by the voice-changing processing of the voice-changing module 20, that is, the sound signal output by the output unit 30, is the ventriloquism sound effect.
  • FIG. 2 is a schematic structural diagram of a sound effect voice-changing device according to another embodiment of the present invention, and the device includes:
  • the sound signal input module 10 is configured to receive the original sound signal input
  • the voice-changing module 20 is configured to perform voice-changing processing on the original sound signal received by the sound signal input module 10 to obtain a sound signal with the desired sound effect, wherein the voice-changing module 20 includes:
  • the duration adjustment unit 21 and the resampling unit 22 are as described in the above embodiment and are not described again here.
  • the adjustment factor of the duration adjustment unit in this embodiment takes a value of 0.8.
  • N is the number of delay samples, the product of the sampling rate and the delay time t.
  • the delay time t used in existing applications of the reverberation model is generally less than 50 ms, whereas the delay time t in this embodiment is chosen from the echo delay range [100 ms, 300 ms] to implement the ghost sound effect.
  • the preferred delay time t in the present embodiment is 200 ms, whereby the sound signal passing through the reverberation unit 23 is the ghost sound effect.
  • the output unit 30 is configured to output the sound signal obtained through the voice-changing processing of the voice-changing module.
  • in this embodiment the output unit 30 outputs the ghost sound effect signal produced by the reverberation unit 23.
  • FIG. 3 is a schematic flowchart of a sound effect voice-changing method according to an embodiment of the present invention.
  • Step S10: the duration of the original sound signal is adjusted by the waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0]; the waveform similarity overlap-add method may specifically be:
  • according to twelve-tone equal temperament, the pitch-raising factor interval [1.0, 2.0] and the pitch-lowering factor interval [0.5, 1.0] are each divided into 12 parts; the factor for each pitch-raising part is $2^{1/12}$, and the factor for each pitch-lowering part is $2^{-1/12}$.
  • the remaining La samples of the analysis frame are then appended directly to the tail of the synthesized signal.
  • when overlap-adding, the two signals can simply be summed and divided by 2 to keep the amplitude unchanged, or a triangular window can be used.
  • the analysis window frame shift must be chosen each time according to the characteristics of the signal, so that the short-time periodicity of the speech signal is not destroyed and clicking noises are avoided.
  • based on a waveform similarity criterion, the WSOLA algorithm keeps the two overlapped signals aligned peak to peak and thus preserves the short-time periodicity of the signal. Therefore the frame shift of the analysis window is not exactly Lb: taking Lb as the center point, the k samples before and after it are searched for the waveform most similar to the overlapping portion of the synthesis window, so the frame shift of the analysis window lies in the interval [Lb-k, Lb+k].
  • two similarity measures can be used: the normalized correlation coefficient and the AMDF (average magnitude difference function).
  • Step S20: the duration-adjusted sound signal is scaled using a resampling algorithm, so that the scaled sound signal is equal in duration to the original sound signal.
  • the duration of the resulting sound signal becomes r times that of the original sound signal, that is, a pitch-raising signal becomes longer and a pitch-lowering signal becomes shorter.
  • the resampling algorithm is used to scale the signal back to its original duration, achieving pitch shifting without changing speed. For ease of computation, La samples are taken from the duration-adjusted signal each time and converted to Lb samples, which restores exactly the original length.
  • linear interpolation is used to stretch the signal to M times its length, that is, La*M; then one sample in every N is kept, giving La*M/N, which is the length Lb. This completes pitch shifting without speed change.
  • the original sound signal can be transformed into the desired sound signal by applying the two voice-changing steps above to the input original sound signal. For example, when the adjustment factor r is 1.8, the sound signal produced by the two steps is the chipmunk sound effect. When the adjustment factor r is 0.6, the sound signal produced by the two steps is the ventriloquism sound effect.
  • FIG. 4 is a schematic flowchart of a sound effect voice-changing method according to another embodiment of the present invention.
  • Step S10 and step S20 are the same as in the previous embodiment and are not described again here.
  • the adjustment factor in this embodiment takes a value of 0.8.
  • the sound effect voice-changing method in this embodiment further includes, after step S20:
  • the sound signal obtained after passing through the reverberation model is the ghost sound effect.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

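An embodiment of the present invention provides a sound effect voice-changing method, the method comprising: adjusting the duration of an original sound signal by a waveform similarity overlap-add method; and scaling the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal. Correspondingly, an embodiment of the present invention further discloses a sound effect voice-changing device. With the present invention, various special sound effects required by users can be realized, such as chipmunk, ventriloquism and ghost sound effects.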

Description

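Sound effect voice-changing method and device
This application claims priority to Chinese Patent Application No. 201110228483.7, filed with the Chinese Patent Office on August 10, 2011 and entitled "Sound effect voice-changing method and device", which is incorporated herein by reference in its entirety.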
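Technical Field
The present invention relates to the field of digital sound effect technologies, and in particular to a sound effect voice-changing method and device.
Background
In daily life, when listening to various sound files, we often need to apply voice-changing processing to a digital sound output to obtain the sound effects people want. The most common case is using an EQ equalizer to adjust the sound effect of digital sound when listening to music files in MP3 format. An EQ equalizer changes the sound effect by dividing the digital sound signal into multiple frequency bands and separately adjusting and amplifying the signals of different frequencies in those bands. It can only serve to compensate for defects of loudspeakers and the sound field, to compensate for and modify various sound sources, and to provide other auxiliary functions. Special sound effects such as chipmunk, ventriloquism and ghost voice cannot be achieved with existing voice-changing methods.
Summary of the Invention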
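In view of this, the technical problem to be solved by the present invention is to provide a sound effect voice-changing method and device for realizing special sound effects such as chipmunk, ventriloquism and ghost voice.
An embodiment of the present invention provides a sound effect voice-changing method, the method comprising:
adjusting the duration of an original sound signal by a waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0];
scaling the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal.
When the adjustment factor is 1.8, the sound signal obtained after scaling is the chipmunk sound effect.
When the adjustment factor is 0.6, the sound signal obtained after scaling is the ventriloquism sound effect.
When the adjustment factor is 0.8, the method further comprises:
passing the scaled sound signal through a reverberation model based on an IIR filter system, the system function of the reverberation model being $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient and N is the number of delay samples, equal to the product of the sampling rate and the delay time t; the delay time t is chosen from the echo delay range [100 ms, 300 ms], and the sound signal that has passed through this reverberation system is the ghost sound effect.
The delay time t of the reverberation model is 200 ms.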
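Correspondingly, an embodiment of the present invention further provides a sound effect voice-changing device, the device comprising:
a sound signal input module, configured to receive an original sound signal input;
a voice-changing module, configured to perform voice-changing processing on the original sound signal received by the sound signal input module to obtain a sound signal with the desired sound effect, wherein the voice-changing module comprises:
a duration adjustment unit, configured to adjust the duration of the original sound signal by a waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0];
a resampling unit, configured to scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal;
an output unit, configured to output the sound signal obtained through the voice-changing processing of the voice-changing module.
When the adjustment factor of the duration adjustment unit is 1.8, the sound signal obtained by the voice-changing module is the chipmunk sound effect.
When the adjustment factor of the duration adjustment unit is 0.6, the sound signal obtained by the voice-changing module is the ventriloquism sound effect.
When the adjustment factor of the duration adjustment unit is 0.8, the voice-changing module further comprises:
a reverberation unit, configured to pass the sound signal scaled by the resampling unit through a reverberation model based on an IIR filter system, the system function of the reverberation model being $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient and N is the number of delay samples, equal to the product of the sampling rate and the delay time t; the delay time t is chosen from the echo delay range [100 ms, 300 ms], and the sound signal that has passed through this reverberation system is the ghost sound effect.
The delay time t of the reverberation model is 200 ms.
Implementing the embodiments of the present invention has the following beneficial effects: by applying the waveform similarity overlap-add algorithm and the resampling algorithm to the input original sound signal to shift its pitch without changing its speed, various special sound effects required by users can be realized, such as chipmunk, ventriloquism and ghost sound effects.
Brief Description of the Drawings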
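FIG. 1 is a schematic structural diagram of a sound effect voice-changing device according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a sound effect voice-changing device according to another embodiment of the present invention;
FIG. 3 is a schematic flowchart of a sound effect voice-changing method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a sound effect voice-changing method according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.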
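FIG. 1 is a schematic structural diagram of a sound effect voice-changing device according to an embodiment of the present invention. As shown in the figure, the device includes:
a sound signal input module 10, configured to receive an original sound signal input;
a voice-changing module 20, configured to perform voice-changing processing on the original sound signal received by the sound signal input module 10 to obtain a sound signal with the desired sound effect, wherein the voice-changing module 20 includes:
a duration adjustment unit 21, configured to adjust the duration of the original sound signal by the waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0]. The waveform similarity overlap-add method may specifically be as follows:
According to twelve-tone equal temperament, the pitch-raising factor interval [1.0, 2.0] and the pitch-lowering factor interval [0.5, 1.0] are each divided into 12 parts, where the factor for each pitch-raising part is $2^{1/12}$ and the factor for each pitch-lowering part is $2^{-1/12}$.
The original sound signal is windowed and divided into frames. Let the window length, that is, the frame length of an analysis frame, be Lw; the frame length of a synthesis frame is also Lw. In the synthesis stage, overlap-add is needed to cancel the windowing effect, and the overlap-add (OLA) length Lo is 0.5Lw. Let the synthesis window frame shift be La; the analysis window frame shift is then Lb = La/r, where r is the adjustment factor in the range [0.5, 2.0]. In the original sound signal, each time the window moves by Lb samples, Lw analysis samples are taken out and then overlap-added: the first Lo samples of the analysis frame of length Lw are superimposed on the last Lo samples of the synthesized signal, and the remaining La samples of the analysis frame are appended directly to the tail of the synthesized signal. When overlap-adding, the two signals can simply be summed and divided by 2 to keep the amplitude unchanged, or a triangular window can be used.
In practice, because speech signals are periodic over short time spans, the analysis window frame shift must be chosen each time according to the characteristics of the signal, so that the short-time periodicity of the speech signal is not destroyed and clicking noises are avoided. Based on a waveform similarity criterion, the WSOLA algorithm keeps the two overlapped signals aligned peak to peak during overlap-add and thus preserves the short-time periodicity of the signal. Therefore the frame shift of the analysis window is not exactly Lb: taking Lb as the center point, the k samples before and after it are searched for the waveform most similar to the overlapping portion of the synthesis window, so the frame shift of the analysis window lies in the interval [Lb-k, Lb+k] each time. Two similarity measures can be used: the normalized correlation coefficient and the AMDF (average magnitude difference function).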
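The duration adjustment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `find_start` and `wsola`, the default window length `lw=256`, the search radius `k=32`, the use of AMDF as the similarity measure and a linear (triangular) cross-fade are all choices made here for illustration. The symbols Lw, Lo, La, Lb, r and k follow the text.

```python
import math

def find_start(x, tail, center, k, lw):
    """Search [center-k, center+k] for the frame start whose first samples
    best match `tail` (the last Lo synthesized samples), using AMDF."""
    lo = len(tail)
    first = max(0, center - k)
    last = min(len(x) - lw, center + k)   # a full frame of Lw samples must fit
    best, best_cost = None, float("inf")
    for start in range(first, last + 1):
        cost = sum(abs(x[start + i] - tail[i]) for i in range(lo))
        if cost < best_cost:
            best, best_cost = start, cost
    return best                            # None when no full frame remains

def wsola(x, r, lw=256, k=32):
    """Return x time-scaled to roughly r times its duration."""
    if not 0.5 <= r <= 2.0:
        raise ValueError("adjustment factor r must lie in [0.5, 2.0]")
    lo = lw // 2          # overlap length Lo = 0.5 * Lw
    la = lw - lo          # synthesis frame shift La
    lb = la / r           # nominal analysis frame shift Lb = La / r
    if len(x) < lw:
        return list(x)
    out = list(x[:lw])    # the first frame is copied unchanged
    n = 1
    while True:
        start = find_start(x, out[-lo:], round(n * lb), k, lw)
        if start is None:
            break
        frame = x[start:start + lw]
        for i in range(lo):               # triangular cross-fade over Lo samples
            w = i / lo
            out[-lo + i] = (1.0 - w) * out[-lo + i] + w * frame[i]
        out.extend(frame[lo:])            # append the remaining La samples
        n += 1
    return out

# Illustrative input: one second of a 220 Hz tone at an assumed 8 kHz rate.
tone = [math.sin(2.0 * math.pi * 220.0 * n / 8000.0) for n in range(8000)]
stretched = wsola(tone, r=1.8)
```

With r = 1.8 the output is about 1.8 times as long as the input; the pitch is still unchanged at this stage and only shifts once the signal is resampled back to its original length.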
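a resampling unit 22, configured to scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal. Specifically, after the WSOLA algorithm is applied, the duration of the resulting sound signal is r times that of the original sound signal: a pitch-raising signal becomes longer and a pitch-lowering signal becomes shorter. A resampling algorithm is then needed to scale the signal back to its original duration, which achieves pitch shifting without changing speed. For ease of computation, La samples are taken from the duration-adjusted signal each time and converted to Lb samples, which restores exactly the original length. Specifically: first find the greatest common divisor of La and Lb and reduce the fraction Lb/La to its lowest terms M/N; a signal of length La is stretched by linear interpolation to M times its length, that is, La*M, and then one sample in every N is kept, giving La*M/N, that is, length Lb. This completes pitch shifting without speed change. By using different adjustment factors, the voice-changing module can transform the input sound signal into sound signals with different sound effects.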
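The M/N resampling step can be illustrated with the following Python sketch. It assumes that La and Lb are both integers; the function name `resample_block` and the edge handling at the last sample (the final value is held) are illustrative choices, not details given in the text.

```python
from math import gcd

def resample_block(block, lb):
    """Convert a block of La samples into Lb samples.

    Lb/La is reduced to M/N; the block is stretched to La*M samples by
    linear interpolation, then every N-th sample is kept.
    """
    la = len(block)
    if la == 0 or lb <= 0:
        raise ValueError("block must be non-empty and lb must be positive")
    g = gcd(la, lb)
    m, n = lb // g, la // g               # Lb / La = M / N in lowest terms
    stretched = []
    for j in range(la * m):
        pos = j / m                       # position on the original sample grid
        i = int(pos)
        frac = pos - i
        nxt = block[i + 1] if i + 1 < la else block[i]   # hold the last sample
        stretched.append((1.0 - frac) * block[i] + frac * nxt)
    return stretched[::n]                 # keep one sample in every N

# For r = 1.8 the ratio La:Lb is 9:5, so 9 samples become 5.
shrunk = resample_block([float(v) for v in range(9)], 5)
```

Because La*M is always divisible by N, the result has exactly Lb samples, so processing the duration-adjusted signal block by block restores the original length.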
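an output unit 30, configured to output the sound signal obtained through the voice-changing processing of the voice-changing module. In a specific implementation, when the adjustment factor of the duration adjustment unit 21 is 1.8, the sound signal obtained by the voice-changing module 20, that is, the sound signal output by the output unit 30, is the chipmunk sound effect. When the adjustment factor of the duration adjustment unit 21 is 0.6, the sound signal obtained by the voice-changing module 20, that is, the sound signal output by the output unit 30, is the ventriloquism sound effect.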
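FIG. 2 is a schematic structural diagram of a sound effect voice-changing device according to another embodiment of the present invention. As shown in the figure, the device includes:
a sound signal input module 10, configured to receive an original sound signal input;
a voice-changing module 20, configured to perform voice-changing processing on the original sound signal received by the sound signal input module 10 to obtain a sound signal with the desired sound effect, wherein the voice-changing module 20 includes:
The duration adjustment unit 21 and the resampling unit 22 are as described in the embodiment above and are not described again here; in this embodiment the adjustment factor of the duration adjustment unit is 0.8. The voice-changing module in this embodiment further includes a reverberation unit 23, configured to pass the sound signal scaled by the resampling unit 22 through a reverberation model based on an IIR filter system. The system function of the reverberation model is $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient with a value range of [0, 1] (empirically, p may be 0.5), and N is the number of delay samples, equal to the product of the sampling rate and the delay time t. Existing applications of this reverberation model generally choose a delay time t of less than 50 ms, whereas in this embodiment t is chosen from the echo delay range [100 ms, 300 ms] to produce the ghost voice effect. Preferably, t is 200 ms in this embodiment, so that the sound signal that has passed through the reverberation unit 23 is the ghost sound effect.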
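The system function $H(z) = 1/(1 - p\,z^{-N})$ corresponds to the difference equation y[n] = x[n] + p·y[n-N], a feedback comb filter. The Python sketch below illustrates it; the function name `ghost_reverb` and its parameter names are hypothetical, while the defaults t = 200 ms and p = 0.5 are the values the text mentions.

```python
def ghost_reverb(x, sample_rate, delay_s=0.2, p=0.5):
    """Feedback comb filter y[n] = x[n] + p * y[n - N], N = sample_rate * t."""
    n_delay = int(round(sample_rate * delay_s))   # N, the number of delay samples
    if n_delay < 1:
        raise ValueError("delay must be at least one sample")
    y = []
    for n, sample in enumerate(x):
        feedback = p * y[n - n_delay] if n >= n_delay else 0.0
        y.append(sample + feedback)
    return y

# Impulse response at an assumed 1 kHz sampling rate: echoes every 200 samples,
# each one p times the amplitude of the previous one.
impulse = [1.0] + [0.0] * 600
response = ghost_reverb(impulse, sample_rate=1000)
```

At a 44.1 kHz sampling rate a 200 ms delay gives N = 8820 samples. With p strictly below 1 each echo is weaker than the last, so the output decays.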
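an output unit 30, configured to output the sound signal obtained through the voice-changing processing of the voice-changing module. In this embodiment the output unit 30 outputs the ghost sound effect signal produced by the reverberation unit 23.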
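FIG. 3 is a schematic flowchart of a sound effect voice-changing method according to an embodiment of the present invention. As shown in the figure, the method includes:
Step S10: adjust the duration of the original sound signal by the waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0]. The waveform similarity overlap-add method may specifically be as follows:
According to twelve-tone equal temperament, the pitch-raising factor interval [1.0, 2.0] and the pitch-lowering factor interval [0.5, 1.0] are each divided into 12 parts, where the factor for each pitch-raising part is $2^{1/12}$ and the factor for each pitch-lowering part is $2^{-1/12}$.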
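As a small worked example of the twelve-part division, the adjustment factor for a shift of a whole number of equal-tempered semitones can be computed as below. The helper name `semitone_factor` is hypothetical, and reading each of the 12 parts as one semitone step of ratio $2^{1/12}$ is an interpretation of the text.

```python
def semitone_factor(steps: int) -> float:
    """Adjustment factor r for a shift of `steps` equal-tempered semitones."""
    if not -12 <= steps <= 12:
        raise ValueError("steps must lie in [-12, 12] so that r stays in [0.5, 2.0]")
    return 2.0 ** (steps / 12.0)

# 13 boundary values spanning [1.0, 2.0] and [0.5, 1.0] respectively.
raise_factors = [semitone_factor(s) for s in range(0, 13)]
lower_factors = [semitone_factor(-s) for s in range(0, 13)]
```

Twelve raising steps reach exactly 2.0 (one octave up) and twelve lowering steps reach exactly 0.5 (one octave down), matching the stated range of the adjustment factor.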
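The original sound signal is windowed and divided into frames. Let the window length, that is, the frame length of an analysis frame, be Lw; the frame length of a synthesis frame is also Lw. In the synthesis stage, overlap-add is needed to cancel the windowing effect, and the overlap-add (OLA) length Lo is 0.5Lw. Let the synthesis window frame shift be La; the analysis window frame shift is then Lb = La/r, where r is the adjustment factor in the range [0.5, 2.0]. In the original sound signal, each time the window moves by Lb samples, Lw analysis samples are taken out and then overlap-added: the first Lo samples of the analysis frame of length Lw are superimposed on the last Lo samples of the synthesized signal, and the remaining La samples of the analysis frame are appended directly to the tail of the synthesized signal. When overlap-adding, the two signals can simply be summed and divided by 2 to keep the amplitude unchanged, or a triangular window can be used.
In practice, because speech signals are periodic over short time spans, the analysis window frame shift must be chosen each time according to the characteristics of the signal, so that the short-time periodicity of the speech signal is not destroyed and clicking noises are avoided. Based on a waveform similarity criterion, the WSOLA algorithm keeps the two overlapped signals aligned peak to peak during overlap-add and thus preserves the short-time periodicity of the signal. Therefore the frame shift of the analysis window is not exactly Lb: taking Lb as the center point, the k samples before and after it are searched for the waveform most similar to the overlapping portion of the synthesis window, so the frame shift of the analysis window lies in the interval [Lb-k, Lb+k] each time. Two similarity measures can be used: the normalized correlation coefficient and the AMDF (average magnitude difference function).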
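The two similarity measures named above can be written down directly. The sketch below uses their standard textbook definitions; the function names are illustrative, and the text does not specify normalization details.

```python
import math

def normalized_correlation(a, b):
    """Normalized correlation coefficient of two equal-length segments.
    Values near 1 mean the waveforms are closely aligned."""
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return num / den if den > 0.0 else 0.0

def amdf(a, b):
    """Average magnitude difference of two equal-length segments.
    Values near 0 mean the waveforms are closely aligned."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)
```

During the search over [Lb-k, Lb+k], the candidate with the largest normalized correlation, or equivalently the smallest AMDF, would be chosen. AMDF avoids multiplications and a square root, which can matter on low-power hardware.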
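Step S20: scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal. Specifically, after the WSOLA algorithm is applied, the duration of the resulting sound signal is r times that of the original sound signal: a pitch-raising signal becomes longer and a pitch-lowering signal becomes shorter. A resampling algorithm is then needed to scale the signal back to its original duration, which achieves pitch shifting without changing speed. For ease of computation, La samples are taken from the duration-adjusted signal each time and converted to Lb samples, which restores exactly the original length. Specifically: first find the greatest common divisor of La and Lb and reduce the fraction Lb/La to its lowest terms M/N; a signal of length La is stretched by linear interpolation to M times its length, that is, La*M, and then one sample in every N is kept, giving La*M/N, that is, length Lb. This completes pitch shifting without speed change.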
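With different adjustment factors, the original sound signal can be transformed into the desired sound signal by applying the two voice-changing steps above to the input original sound signal. For example, when the adjustment factor r is 1.8, the sound signal produced by the two steps is the chipmunk sound effect. When the adjustment factor r is 0.6, the sound signal produced by the two steps is the ventriloquism sound effect.
FIG. 4 is a schematic flowchart of a sound effect voice-changing method according to another embodiment of the present invention.
Step S10 and step S20 are the same as in the previous embodiment and are not described again here. The adjustment factor in this embodiment is 0.8. The sound effect voice-changing method in this embodiment further includes, after step S20:
Step S30: pass the scaled sound signal through a reverberation model based on an IIR filter system. The system function of the reverberation model is $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient and N is the number of delay samples, equal to the product of the sampling rate and the delay time t. The delay time t is chosen from the echo delay range [100 ms, 300 ms]; preferably, t may be 200 ms in this embodiment, so that the sound signal obtained after the reverberation model is the ghost sound effect.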
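What is disclosed above is merely a preferred embodiment of the present invention and does not limit the scope of the claims of the present invention. A person of ordinary skill in the art can understand how to implement all or part of the processes of the above embodiment, and equivalent variations made in accordance with the claims of the present invention still fall within the scope covered by the invention.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.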

Claims

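1. A sound effect voice-changing method, characterized in that the method comprises:
adjusting the duration of an original sound signal by a waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0];
scaling the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal.
2. The sound effect voice-changing method according to claim 1, characterized in that when the adjustment factor is 1.8, the sound signal obtained after scaling is the chipmunk sound effect.
3. The sound effect voice-changing method according to claim 1, characterized in that when the adjustment factor is 0.6, the sound signal obtained after scaling is the ventriloquism sound effect.
4. The sound effect voice-changing method according to any one of claims 1 to 3, characterized in that when the adjustment factor is 0.8, the method further comprises:
passing the scaled sound signal through a reverberation model based on an IIR filter system, the system function of the reverberation model being $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient and N is the number of delay samples, equal to the product of the sampling rate and the delay time t; the delay time t is chosen from the echo delay range [100 ms, 300 ms], and the sound signal that has passed through this reverberation system is the ghost sound effect.
5. The sound effect voice-changing method according to claim 4, characterized in that the delay time t of the reverberation model is 200 ms.
6. A sound effect voice-changing device, characterized in that the device comprises:
a sound signal input module, configured to receive an original sound signal input;
a voice-changing module, configured to perform voice-changing processing on the original sound signal received by the sound signal input module to obtain a sound signal with the desired sound effect, wherein the voice-changing module comprises:
a duration adjustment unit, configured to adjust the duration of the original sound signal by a waveform similarity overlap-add method, with an adjustment factor in the range [0.5, 2.0];
a resampling unit, configured to scale the duration-adjusted sound signal using a resampling algorithm, so that the scaled sound signal has the same duration as the original sound signal;
an output unit, configured to output the sound signal obtained through the voice-changing processing of the voice-changing module.
7. The sound effect voice-changing device according to claim 6, characterized in that when the adjustment factor of the duration adjustment unit is 1.8, the sound signal obtained by the voice-changing module is the chipmunk sound effect.
8. The sound effect voice-changing device according to claim 6, characterized in that when the adjustment factor of the duration adjustment unit is 0.6, the sound signal obtained by the voice-changing module is the ventriloquism sound effect.
9. The sound effect voice-changing device according to any one of claims 6 to 8, characterized in that the adjustment factor of the duration adjustment unit is 0.8, and the voice-changing module further comprises:
a reverberation unit, configured to pass the sound signal scaled by the resampling unit through a reverberation model based on an IIR filter system, the system function of the reverberation model being $H(z) = 1/(1 - p\,z^{-N})$, where p is the system attenuation coefficient and N is the number of delay samples, equal to the product of the sampling rate and the delay time t; the delay time t is chosen from the echo delay range [100 ms, 300 ms], and the sound signal that has passed through this reverberation system is the ghost sound effect.
10. The sound effect voice-changing device according to claim 9, characterized in that the delay time t of the reverberation model is 200 ms.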
PCT/CN2011/084151 2011-08-10 2011-12-16 Sound effect voice-changing method and device WO2013020341A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011102284837 2011-08-10
CN201110228483.7A CN102307327B (zh) 2011-08-10 2011-08-10 Sound effect voice-changing method and device

Publications (1)

Publication Number Publication Date
WO2013020341A1 (zh) 2013-02-14

Family

ID=45381122

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/084151 WO2013020341A1 (zh) 2011-08-10 2011-12-16 Sound effect voice-changing method and device

Country Status (2)

Country Link
CN (1) CN102307327B (zh)
WO (1) WO2013020341A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
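CN104575487A (zh) * 2014-12-11 2015-04-29 百度在线网络技术(北京)有限公司 Method and device for processing speech signals
CN109413492B (zh) * 2017-08-18 2021-05-28 武汉斗鱼网络科技有限公司 Method and system for reverberation processing of audio data during live streaming
CN108492832A (zh) * 2018-03-21 2018-09-04 北京理工大学 High-quality sound transformation method based on wavelet transform
CN108682426A (zh) * 2018-05-17 2018-10-19 深圳市沃特沃德股份有限公司 Voice timbre conversion method and device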

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1719514A (zh) * 2004-07-06 2006-01-11 中国科学院自动化研究所 High-quality real-time voice changing method based on speech analysis and synthesis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH01198799A (ja) * 1988-02-03 1989-08-10 Victor Co Of Japan Ltd Delay signal processing circuit
CN1719512B (zh) * 2005-07-15 2010-09-29 北京中星微电子有限公司 Digital audio reverberation simulation system and digital audio reverberation simulation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XU, XUEQIONG ET AL.: "Time-Scale Modification of Audio Signal Using Improved WSOLA Algorithm", JOURNAL OF APPLIED SCIENCES, vol. 27, no. 5, September 2009 (2009-09-01), pages 514 - 519 *
YE, XI'EN ET AL.: "Study on Time-Scale Modification of Speech Based on WSOLA", BULLETIN OF SCIENCE AND TECHNOLOGY, vol. 21, no. 5, September 2005 (2005-09-01), pages 593 - 596 *
ZHANG, XIAORUI ET AL.: "Study of pitch shifting technology and the sound quality evaluating", JOURNAL OF SHANDONG UNIVERSITY (ENGINEERING SCIENCE), vol. 41, no. 1, February 2011 (2011-02-01), pages 1 - 6 *

Also Published As

Publication number Publication date
CN102307327A (zh) 2012-01-04
CN102307327B (zh) 2015-08-19

Similar Documents

Publication Publication Date Title
CN109285557B (zh) Directional sound pickup method and device, and electronic equipment
JP6023823B2 (ja) Method, apparatus and computer program for mixing audio signals
CN106057220B (zh) High-frequency extension method for audio signals and audio player
CN102419981A (zh) Method and device for time-scale and frequency-scale scaling of audio signals
WO2013020341A1 (zh) Sound effect voice-changing method and device
EP2907324B1 (en) System and method for reducing latency in transposer-based virtual bass systems
Deroche et al. Voice segregation by difference in fundamental frequency: Effect of masker type
US9240196B2 (en) Apparatus and method for handling transient sound events in audio signals when changing the replay speed or pitch
JP2007033804A (ja) Sound source separation device, sound source separation program and sound source separation method
US11107488B1 (en) Reduced reference canceller
US20240161762A1 (en) Full-band audio signal reconstruction enabled by output from a machine learning model
KR102329707B1 (ko) Apparatus and method for processing multi-channel audio signals
CN114007176B (zh) Audio signal processing method, device and storage medium for reducing signal delay
US11462231B1 (en) Spectral smoothing method for noise reduction
WO2017098307A1 (zh) Speech analysis and synthesis method based on harmonic model and source-vocal tract feature decomposition
CN114694665A (zh) Speech signal processing method and device, storage medium and electronic equipment
JP3869823B2 (ja) Equalizer for frequency characteristics of speech
Khadanovich et al. Step-by-Step Speech Enhancement Using Stacked Hourglass Wave Networks
Eisenberg et al. A two-stage speaker extraction algorithm under adverse acoustic conditions using a single-microphone
CN116978400A (zh) Audio curve variable-speed playback method and related equipment
GB2617613A (en) An audio processing method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11870550

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11870550

Country of ref document: EP

Kind code of ref document: A1