WO2014161388A1 - Method and apparatus for improving voice quality - Google Patents

Method and apparatus for improving voice quality

Info

Publication number
WO2014161388A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
speech
amplitude
voice
unvoiced
Prior art date
Application number
PCT/CN2014/071868
Other languages
English (en)
French (fr)
Inventor
孙焘
梁超
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2014161388A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • The present invention relates to the field of voice signal processing, and in particular to a method and apparatus for improving voice quality.
  • Because mobile terminals such as mobile phones are limited in size and power, the speakers they use are small, and the acoustic cavity reserved for the speaker is also small; in addition, the vast majority of mobile phone calls are currently established in the CS domain.
  • Limited by the bandwidth of the core network's switching circuits, the speech coding algorithm often uses only 300 Hz-3400 Hz speech, and even a wideband speech signal extends to at most about 6000 Hz.
  • With speakers and matching acoustic cavities built according to the related art, raising the volume often produces broken (distorted) sound when the voice signal reaches the speaker sound system, and the clarity is insufficient.
  • The most common way to improve the received voice quality and loudness of a terminal speaker is currently to improve sound quality and loudness in hardware.
  • On the one hand, this approach requires a physically larger speaker.
  • A characteristic of speakers is that the effective acoustic radiation power grows with size, which compensates for the low acoustic radiation power of a small speaker and its large attenuation during transmission, ensures that more of the sound signal reaches the human ear, and so improves the clarity and intelligibility of the call.
  • On the other hand, the electrical power fed from the circuit to the speaker can be increased so that the speaker works at higher power, which compensates for the attenuation of sound during transmission, ensures that more of the sound signal reaches the human ear, and likewise improves clarity and intelligibility during a call.
  • However, these approaches have serious drawbacks.
  • A larger speaker not only takes up more space in the terminal but also requires a correspondingly larger acoustic cavity; otherwise the sound quality and volume are still affected.
  • For mobile terminals such as mobile phones, which are trending toward ultra-thin designs, such an increase in size cannot be accommodated; therefore, when the speaker size is limited, loudness and sound quality can only be improved by increasing the electrical power fed to the speaker, but this easily leads to speaker power overload, broken sound, or even damage to the speaker.
  • Embodiments of the present invention provide a method and apparatus for improving voice quality, solving the problem that improving voice quality in the related art requires a larger speaker and more electrical power fed to the speaker.
  • An embodiment of the invention provides a method for improving voice quality, including:
  • extracting a feature speech signal from the to-be-processed speech signal;
  • adjusting the amplitude of the extracted feature speech signal according to a preset rule; and
  • reconstructing the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
  • In one embodiment of the invention, the feature speech signal comprises a voice pitch signal and/or a voice unvoiced signal.
  • In one embodiment of the invention, when the feature speech signal includes a voice pitch signal, the preset rule includes: when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, adjusting the voice pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, adjusting the voice pitch signal to be less than or equal to the maximum pitch signal amplitude threshold.
  • When the feature speech signal includes a voice unvoiced signal, the preset rule includes:
  • when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.
  • In one embodiment of the invention, after the amplitude of the feature speech signal is adjusted and before reconstruction based on the adjusted feature speech signal, the method further includes: determining whether the consistency between the adjusted feature speech signal and the previously extracted original feature speech signal meets a preset requirement; if not, readjusting the amplitude of the feature speech signal.
  • In one embodiment of the invention, after the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal are reconstructed to obtain the processed speech signal, the method further includes: performing extension processing on the processed speech signal according to the amplitude of the adjusted feature speech signal.
  • An embodiment of the present invention further provides an apparatus for improving voice quality, including: a voice extraction module configured to extract a feature speech signal from a to-be-processed speech signal; a voice processing module configured to adjust the amplitude of the extracted feature speech signal according to a preset rule; and
  • a voice reconstruction module configured to reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
  • In one embodiment of the invention, the feature speech signal comprises a voice pitch signal and/or a voice unvoiced signal.
  • In one embodiment of the invention, when the feature speech signal includes a voice pitch signal, the preset rule includes: when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, adjusting the voice pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, adjusting the voice pitch signal to be less than or equal to the maximum pitch signal amplitude threshold.
  • When the feature speech signal includes a voice unvoiced signal, the preset rule includes:
  • when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.
  • In one embodiment of the invention, the apparatus further includes a determining module configured to determine, after the voice processing module adjusts the amplitude of the feature speech signal and before the voice reconstruction module reconstructs based on the adjusted feature speech signal, whether the consistency between the adjusted feature speech signal and the original feature speech signal previously extracted by the voice extraction module meets a preset requirement; if not, the determining module notifies the voice processing module to readjust the amplitude of the feature speech signal.
  • In one embodiment of the invention, the apparatus further includes a voice extension module configured to perform extension processing on the processed speech signal, according to the amplitude of the feature speech signal as adjusted by the voice processing module, after the voice reconstruction module has reconstructed the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal.
  • The beneficial effects of the embodiments of the invention are as follows. The method and apparatus extract a feature speech signal from a to-be-processed speech signal (for example, a speech signal input to a speaker); the amplitude of the extracted feature speech signal is then adjusted according to a preset rule so that it falls within a preset amplitude range, to ensure better voice quality;
  • the adjusted feature speech signal and the other speech signals in the to-be-processed speech signal are then reconstructed to obtain a speech signal with better voice quality.
  • The method and apparatus can therefore improve the quality of a speech signal without enlarging the speaker or increasing the electrical power fed to it, avoid the problems caused by a larger speaker and higher input power, and give users a better experience.
  • FIG. 1 is a schematic flowchart of a method for improving voice quality according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of a method for improving voice quality according to an embodiment of the present invention.
  • In the embodiments of the invention, a feature speech signal is extracted from the to-be-processed speech signal; the amplitude of the extracted feature speech signal is then adjusted according to a preset rule so that it falls within a preset amplitude range;
  • the adjusted feature speech signal and the other speech signals in the to-be-processed speech signal are then reconstructed to obtain a speech signal with better voice quality.
  • Embodiment 1: Referring to FIG. 1, the method for improving voice quality provided by this embodiment includes:
  • Step 101: Extract a feature speech signal from the to-be-processed speech signal.
  • Which feature speech signal is extracted in this step can be chosen according to the input speech signal and the application scenario, as long as the feature speech signal is reasonably representative and meets the requirements of the subsequent speech reconstruction.
  • Step 102: Adjust the amplitude of the extracted feature speech signal according to a preset rule.
  • This step adjusts the amplitude of the extracted feature speech signal according to a preset rule so that it falls within an optimal amplitude range.
  • The optimal amplitude range is chosen according to the application scenario and the amplitude distribution of the speech signal currently being processed.
  • Step 103: Reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
  • Compared with the unprocessed speech signal, at least one of the feature speech signals contained in the processed signal has had its amplitude adjusted, so the quality of the reconstructed speech signal is better than before processing. This processing requires neither a larger speaker nor more input electrical power, and therefore does not cause speaker power overload, broken sound, or speaker damage.
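Steps 101 to 103 can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the signal is assumed to be already decomposed into equal-length, additive components held in a dictionary, and the names `improve_voice_quality`, `components`, and `adjust` are hypothetical.

```python
from typing import Callable, Dict, List

Signal = List[float]


def improve_voice_quality(
    components: Dict[str, Signal],
    feature_names: List[str],
    adjust: Callable[[Signal], Signal],
) -> Signal:
    """Extract features, adjust their amplitude, then reconstruct (steps 101-103)."""
    # Step 101: pick out the feature speech signals by name.
    features = {name: components[name] for name in feature_names}
    # Step 102: apply the preset amplitude rule to each extracted feature.
    adjusted = {name: adjust(sig) for name, sig in features.items()}
    # Step 103: recombine the adjusted features with the untouched components.
    # Assumption: components add sample by sample and share one length.
    merged = {**components, **adjusted}
    length = len(next(iter(merged.values())))
    return [sum(sig[i] for sig in merged.values()) for i in range(length)]
```

Only the components listed in `feature_names` pass through `adjust`; every other component reaches the reconstruction unchanged, which mirrors the role of the "other speech signals" in step 103.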
  • In this embodiment, the extracted feature speech signal may be a voice pitch signal, a voice unvoiced signal, or both; which feature signals to extract can be chosen according to the situation. For example, when the to-be-processed speech signal contains few voice unvoiced signals, or their amplitudes are all very low, while it contains comparatively many voice pitch signals, only the voice pitch signal need be extracted and processed as above, which still improves voice quality to some extent.
  • Conversely, when the voice pitch signal makes up a small proportion and the voice unvoiced signal makes up a much larger proportion, only the voice unvoiced signal need be extracted and processed as above, which also improves voice quality to some extent.
  • When the proportions of the voice pitch signal and the voice unvoiced signal are similar, both can be extracted and processed as above.
  • The basis for choosing which feature signals to extract is not limited to these cases; they are given only as an illustration.
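The selection logic described above can be sketched as follows. The source gives no numeric criterion, so the `ratio` cutoff and the idea of measuring each signal's share as a fraction are assumptions made purely for illustration, and `select_features` is a hypothetical name.

```python
from typing import List


def select_features(pitch_share: float, unvoiced_share: float,
                    ratio: float = 3.0) -> List[str]:
    """Choose which feature signals to process from their relative shares.

    `ratio` is an assumed cutoff for "much larger"; the source does not
    specify one.
    """
    if pitch_share >= ratio * unvoiced_share:
        return ["pitch"]      # pitch dominates: process only the pitch signal
    if unvoiced_share >= ratio * pitch_share:
        return ["unvoiced"]   # unvoiced dominates: process only the unvoiced signal
    return ["pitch", "unvoiced"]  # comparable shares: process both
```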
  • In this embodiment, when the extracted feature speech signal includes a voice pitch signal, the preset rule used includes:
  • when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, it is adjusted to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, it is adjusted to be less than or equal to the maximum pitch signal amplitude threshold.
  • In this embodiment, when the extracted feature speech signal includes a voice unvoiced signal, the preset rule used includes:
  • when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, it is adjusted to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, it is adjusted to be less than or equal to the maximum unvoiced signal amplitude threshold.
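The threshold rule is the same for pitch and unvoiced signals and amounts to clamping an amplitude into a range. The sketch below clamps exactly to the violated threshold, which is one admissible choice under "equal to or greater than" and "less than or equal to". Treating the frame peak as "the amplitude" and the names `clamp_amplitude` and `adjust_frame` are assumptions for illustration.

```python
from typing import List


def clamp_amplitude(amplitude: float, low: float, high: float) -> float:
    """Return the amplitude moved into [low, high] if it lies outside."""
    if low > high:
        raise ValueError("minimum threshold must not exceed maximum threshold")
    if amplitude < low:
        return low    # below the minimum threshold: raise to it
    if amplitude > high:
        return high   # above the maximum threshold: lower to it
    return amplitude  # already inside the range: leave unchanged


def adjust_frame(frame: List[float], low: float, high: float) -> List[float]:
    """Scale a frame so its peak amplitude lands inside [low, high]."""
    peak = max((abs(x) for x in frame), default=0.0)
    if peak == 0.0:
        return list(frame)  # a silent frame cannot be rescaled
    gain = clamp_amplitude(peak, low, high) / peak
    return [x * gain for x in frame]
```

Applying one gain to the whole frame keeps the waveform shape intact; separate threshold pairs would be passed for the pitch signal and for the unvoiced signal.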
  • In this embodiment, to guarantee voice quality and prevent distortion caused by adjusting the amplitude of the feature speech signal, a check is made after the amplitude is adjusted and before the speech signal is reconstructed from the adjusted feature speech signal.
  • The method further includes: determining whether the consistency between the adjusted feature speech signal and the previously extracted original feature speech signal meets a preset requirement; if not, the amplitude of the feature speech signal to be processed is readjusted.
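The check-and-readjust loop can be sketched as below. The source does not define "consistency" or how readjustment proceeds, so two assumptions are made and should be read as such: consistency is measured as cosine similarity, and readjustment halves the strength of the adjustment on each failed round. The names `consistency` and `adjust_with_check` are hypothetical.

```python
import math
from typing import Callable, List

Signal = List[float]


def consistency(original: Signal, adjusted: Signal) -> float:
    """Cosine similarity between two equal-length signals (an assumed metric)."""
    dot = sum(a * b for a, b in zip(original, adjusted))
    norm = (math.sqrt(sum(a * a for a in original))
            * math.sqrt(sum(b * b for b in adjusted)))
    return dot / norm if norm else 1.0


def adjust_with_check(original: Signal,
                      adjust: Callable[[Signal], Signal],
                      min_consistency: float = 0.95,
                      max_rounds: int = 8) -> Signal:
    """Adjust, check consistency, and readjust more gently until it passes."""
    target = adjust(original)
    strength = 1.0
    for _ in range(max_rounds):
        # Blend between the original (strength 0) and the full adjustment (strength 1).
        candidate = [o + strength * (t - o) for o, t in zip(original, target)]
        if consistency(original, candidate) >= min_consistency:
            return candidate
        strength /= 2.0  # consistency too low: readjust with half the strength
    return list(original)  # no acceptable adjustment found: keep the original
```

With a pure gain change the cosine similarity is always 1, so this metric only rejects adjustments that alter the waveform shape, such as hard limiting.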
  • In this embodiment, after step 103, in which the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal are reconstructed to obtain the processed speech signal, the following step may also be included in order to preserve and improve the saturation of the speech signal:
  • performing extension processing on the processed speech signal according to the amplitude of the adjusted feature speech signal. For example, the frequency distribution of the original speech signal ranges from 200 Hz to 3400 Hz;
  • after the extension processing, the frequency distribution of the processed speech signal may range from 50 Hz to 5000 Hz, which improves the saturation of the speech signal.
  • Embodiment 2:
  • Referring to FIG. 2, the apparatus for improving voice quality provided by this embodiment includes:
  • a voice extraction module 201, configured to extract a feature speech signal from the to-be-processed speech signal; which feature speech signal it extracts can be chosen according to the input speech signal and the application scenario, as long as the feature speech signal is reasonably representative and meets the requirements of the subsequent speech reconstruction;
  • a voice processing module 202, configured to adjust the amplitude of the extracted feature speech signal according to a preset rule; the purpose of the adjustment is to bring the amplitude of the feature speech signal within an optimal amplitude range, which is chosen according to the application scenario and the amplitude distribution of the speech signal currently being processed;
  • a voice reconstruction module 203, configured to reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal.
  • In this embodiment, the extracted feature speech signal may be a voice pitch signal, a voice unvoiced signal, or both; which feature signals to extract can be chosen according to the situation. For example, when the to-be-processed speech signal contains few voice unvoiced signals, or their amplitudes are all very low, while it contains comparatively many voice pitch signals, only the voice pitch signal need be extracted and processed as above, which still improves voice quality to some extent.
  • Conversely, when the voice pitch signal makes up a small proportion and the voice unvoiced signal makes up a much larger proportion, only the voice unvoiced signal need be extracted and processed as above,
  • which also improves voice quality to some extent.
  • When the proportions of the voice pitch signal and the voice unvoiced signal are similar, both can be extracted and processed as above.
  • The basis for choosing which feature signals to extract is not limited to these cases; they are given only as an illustration.
  • In this embodiment, when the extracted feature speech signal includes a voice pitch signal, the preset rule used includes:
  • when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, it is adjusted to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, it is adjusted to be less than or equal to the maximum pitch signal amplitude threshold.
  • In this embodiment, when the extracted feature speech signal includes a voice unvoiced signal, the preset rule used includes:
  • when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, it is adjusted to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, it is adjusted to be less than or equal to the maximum unvoiced signal amplitude threshold.
  • To guarantee voice quality and prevent distortion caused by adjusting the amplitude of the feature speech signal, as shown in FIG. 3, the apparatus in this embodiment may further include a determining module 204, configured to determine, after the voice processing module adjusts the amplitude of the feature speech signal and before the voice reconstruction module reconstructs based on the adjusted feature speech signal, whether the consistency between the adjusted feature speech signal and the original feature speech signal previously extracted by the voice extraction module meets a preset requirement; if not, it notifies the voice processing module 202 to readjust the amplitude of the feature speech signal to be processed.
  • To preserve and improve the saturation of the speech signal, as shown in FIG. 4, the apparatus may further include a voice extension module 205, configured to perform extension processing on the processed speech signal, according to the amplitude of the feature speech signal as adjusted by the voice processing module 202, after the voice reconstruction module 203 has reconstructed the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal.
  • For example, the frequency distribution of the original speech signal ranges from 200 Hz to 3400 Hz; after the processed speech signal is extended according to the amplitude of the adjusted feature speech signal, its frequency distribution may range from 50 Hz to 5000 Hz, which improves the saturation of the speech signal.
  • Embodiment 3: For a better understanding of the embodiments of the invention, a specific application scenario is described below.
  • This embodiment takes a mobile phone as an example.
  • The phone's pulse code modulation (PCM) data module obtains the phone's downlink voice signal in PCM data format from the phone's standard PCM interface and uses it as the to-be-processed speech signal.
  • In this embodiment, the extracted feature speech signals are a voice unvoiced signal and a voice pitch signal. Note that when both are extracted, their amplitudes may be adjusted simultaneously,
  • or the amplitude of the voice pitch signal may be adjusted first and then that of the voice unvoiced signal, or the amplitude of the voice unvoiced signal may be adjusted first and then that of the voice pitch signal.
  • During reconstruction, the adjusted voice pitch signal and voice unvoiced signal may first be synthesized and then reconstructed together with the other speech signals in the original speech signal.
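The source extracts the pitch and unvoiced signals from the spectrum of the speech signal but does not say how. As a stand-in, the sketch below uses a different and much simpler time-domain heuristic, zero-crossing rate plus frame energy, to label a frame as voiced (pitch-bearing), unvoiced, or silent. It is named plainly as a substitute technique, not the method of the source, and both thresholds are assumed values.

```python
from typing import List


def zero_crossing_rate(frame: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)


def classify_frame(frame: List[float],
                   zcr_threshold: float = 0.25,
                   energy_threshold: float = 1e-4) -> str:
    """Label a frame 'silence', 'unvoiced', or 'voiced' (thresholds are assumed)."""
    energy = sum(x * x for x in frame) / max(len(frame), 1)
    if energy < energy_threshold:
        return "silence"
    # Noise-like unvoiced sounds change sign often; periodic voiced sounds do not.
    return "unvoiced" if zero_crossing_rate(frame) > zcr_threshold else "voiced"
```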
  • Referring to FIG. 5, the process includes:
  • Step 501: Acquire a voice signal in PCM data format as the to-be-processed speech signal.
  • Step 502: Obtain the spectral features of the to-be-processed speech signal.
  • Step 503: Extract the voice pitch signal and the voice unvoiced signal from the speech signal spectrum obtained in step 502.
  • Step 504: Adjust the amplitude of the extracted voice pitch signal according to the set rule; the adjustment value may be determined from empirical values.
  • Step 505: Determine whether the consistency between the adjusted voice pitch signal and the originally extracted voice pitch signal meets the requirement; if so, go to step 508; otherwise, return to step 504.
  • Step 506: Adjust the amplitude of the extracted voice unvoiced signal according to the set rule; the adjustment value may also be determined from empirical values.
  • Step 507: Determine whether the consistency between the adjusted voice unvoiced signal and the originally extracted voice unvoiced signal meets the requirement; if so, go to step 508; otherwise, return to step 506.
  • Step 508: When the consistency of both the adjusted voice pitch signal and the adjusted voice unvoiced signal meets the requirement, synthesize the adjusted voice pitch signal and the adjusted voice unvoiced signal.
  • Step 509: Perform reconstruction and extension processing based on the synthesized voice pitch and voice unvoiced signals and on the speech signals in the original speech signal other than the voice pitch signal and the voice unvoiced signal.
  • Step 510: Output the resulting voice signal in PCM data format.
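Steps 501 and 510 bracket the processing with PCM input and output. The source says only "PCM data format", so the sketch below assumes 16-bit signed little-endian mono samples; `pcm16_to_float` and `float_to_pcm16` are hypothetical helper names that convert between raw bytes and floating-point samples in [-1, 1).

```python
import struct
from typing import List


def pcm16_to_float(data: bytes) -> List[float]:
    """Step 501 helper: decode 16-bit little-endian PCM into floats in [-1, 1)."""
    count = len(data) // 2  # ignore a trailing odd byte, if any
    samples = struct.unpack("<%dh" % count, data[:count * 2])
    return [s / 32768.0 for s in samples]


def float_to_pcm16(samples: List[float]) -> bytes:
    """Step 510 helper: encode floats back to 16-bit PCM, saturating at the limits."""
    ints = [max(-32768, min(32767, int(round(s * 32768.0)))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)
```

Saturating in `float_to_pcm16` matters here because an amplitude adjustment that raises quiet components can push the sum past full scale; without the clamp `struct.pack` would raise an error on out-of-range values.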
  • As can be seen, the embodiments of the invention extract a feature speech signal from the to-be-processed speech signal and adjust its amplitude according to a preset rule so that it falls within a preset amplitude range;
  • it is then reconstructed, and optionally extended, together with the other speech signals in the original to-be-processed speech signal to obtain a speech signal with better voice quality.
  • The method and apparatus for improving voice quality provided by the embodiments of the invention can improve the quality of a speech signal without enlarging the speaker or increasing the electrical power fed to it,
  • which avoids the problems caused by a larger speaker and higher input power and gives users a better experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

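A method and apparatus for improving voice quality: a feature speech signal is extracted from a to-be-processed speech signal (S101); the amplitude of the extracted feature speech signal is then adjusted according to a preset rule (S102) so that it falls within a preset amplitude range, to ensure better voice quality; the adjusted feature speech signal and the other speech signals in the to-be-processed speech signal are then reconstructed to obtain a processed speech signal with better voice quality (S103).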

Description

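Method and apparatus for improving voice quality
Technical Field
The present invention relates to the field of voice signal processing, and in particular to a method and apparatus for improving voice quality.
Background
Because mobile terminals such as mobile phones are limited in size and power, the speakers they use are small, and the acoustic cavity reserved for the speaker is also small. In addition, the vast majority of mobile phone calls are currently established in the CS domain; limited by the bandwidth of the core network's switching circuits, the speech coding algorithm often uses only 300 Hz-3400 Hz speech, and even a wideband speech signal extends to at most about 6000 Hz. With speakers and matching acoustic cavities built according to the related art, raising the volume often produces broken (distorted) sound when the voice signal reaches the speaker sound system, and the clarity is insufficient. The most common way to improve the received voice quality and loudness of a terminal speaker is currently to improve sound quality and loudness in hardware. On the one hand, this requires a physically larger speaker: the effective acoustic radiation power of a speaker grows with its size, which compensates for the low acoustic radiation power of a small speaker and its large attenuation during transmission, ensures that more of the sound signal reaches the human ear, and so improves the clarity and intelligibility of the call. On the other hand, the electrical power fed from the circuit to the speaker can be increased so that the speaker works at higher power, which compensates for the attenuation of sound during transmission, ensures that more of the sound signal reaches the human ear, and likewise improves clarity and intelligibility. However, these approaches have serious drawbacks. A larger speaker not only takes up more space in the terminal but also requires a correspondingly larger acoustic cavity; otherwise the sound quality and volume are still affected. For mobile terminals such as mobile phones, which are trending toward ultra-thin designs, such an increase in size cannot be accommodated. Therefore, when the speaker size is limited, loudness and sound quality can only be improved by increasing the electrical power fed to the speaker, but this easily leads to speaker power overload, broken sound, or even damage to the speaker.
Summary of the Invention
Embodiments of the present invention provide a method and apparatus for improving voice quality, solving the problem that improving voice quality in the related art requires a larger speaker and more electrical power fed to the speaker.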
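An embodiment of the present invention provides a method for improving voice quality, including:
extracting a feature speech signal from a to-be-processed speech signal;
adjusting the amplitude of the extracted feature speech signal according to a preset rule; and
reconstructing the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
In one embodiment of the invention, the feature speech signal includes a voice pitch signal and/or a voice unvoiced signal.
In one embodiment of the invention, when the feature speech signal includes a voice pitch signal, the preset rule includes:
when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, adjusting the voice pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, adjusting the voice pitch signal to be less than or equal to the maximum pitch signal amplitude threshold; and
when the feature speech signal includes a voice unvoiced signal, the preset rule includes:
when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.
In one embodiment of the invention, after the amplitude of the feature speech signal is adjusted and before reconstruction based on the adjusted feature speech signal, the method further includes: determining whether the consistency between the adjusted feature speech signal and the previously extracted original feature speech signal meets a preset requirement; if not, readjusting the amplitude of the feature speech signal.
In one embodiment of the invention, after the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal are reconstructed to obtain the processed speech signal, the method further includes:
performing extension processing on the processed speech signal according to the amplitude of the adjusted feature speech signal.
An embodiment of the present invention further provides an apparatus for improving voice quality, including:
a voice extraction module, configured to extract a feature speech signal from a to-be-processed speech signal;
a voice processing module, configured to adjust the amplitude of the extracted feature speech signal according to a preset rule; and
a voice reconstruction module, configured to reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
In one embodiment of the invention, the feature speech signal includes a voice pitch signal and/or a voice unvoiced signal.
In one embodiment of the invention, when the feature speech signal includes a voice pitch signal, the preset rule includes:
when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, adjusting the voice pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, adjusting the voice pitch signal to be less than or equal to the maximum pitch signal amplitude threshold; and
when the feature speech signal includes a voice unvoiced signal, the preset rule includes:
when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, adjusting the voice unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.
In one embodiment of the invention, the apparatus further includes a determining module configured to determine, after the voice processing module adjusts the amplitude of the feature speech signal and before the voice reconstruction module reconstructs based on the adjusted feature speech signal, whether the consistency between the adjusted feature speech signal and the original feature speech signal previously extracted by the voice extraction module meets a preset requirement; if not, it notifies the voice processing module to readjust the amplitude of the feature speech signal.
In one embodiment of the invention, the apparatus further includes a voice extension module configured to perform extension processing on the processed speech signal, according to the amplitude of the feature speech signal as adjusted by the voice processing module, after the voice reconstruction module has reconstructed the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal.
The beneficial effects of the embodiments of the invention are as follows:
The method and apparatus for improving voice quality provided by the embodiments of the invention extract a feature speech signal from a to-be-processed speech signal (for example, a speech signal input to a speaker); the amplitude of the extracted feature speech signal is then adjusted according to a preset rule so that it falls within a preset amplitude range, to ensure better voice quality; the adjusted feature speech signal and the other speech signals in the to-be-processed speech signal are then reconstructed to obtain a speech signal with better voice quality. The method and apparatus can therefore improve the quality of a speech signal without enlarging the speaker or increasing the electrical power fed to it, avoid the problems caused by a larger speaker and higher input power, and give users a better experience.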
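Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for improving voice quality according to an embodiment of the present invention.
FIG. 2 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
FIG. 3 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
FIG. 4 is a schematic structural diagram of an apparatus for improving voice quality according to an embodiment of the present invention.
FIG. 5 is a schematic flowchart of a method for improving voice quality according to an embodiment of the present invention.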
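Preferred Embodiments of the Invention
The invention is described in detail below through specific embodiments with reference to the drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another.
In the embodiments of the invention, a feature speech signal is extracted from the to-be-processed speech signal; the amplitude of the extracted feature speech signal is then adjusted according to a preset rule so that it falls within a preset amplitude range; the adjusted feature speech signal and the other speech signals in the to-be-processed speech signal are then reconstructed to obtain a speech signal with better voice quality. For a better understanding of the embodiments of the invention, they are described below with reference to the drawings.
Embodiment 1: Referring to FIG. 1, the method for improving voice quality provided by this embodiment includes:
Step 101: Extract a feature speech signal from the to-be-processed speech signal.
Which feature speech signal is extracted in this step can be chosen according to the input speech signal and the application scenario, as long as the feature speech signal is reasonably representative and meets the requirements of the subsequent speech reconstruction.
Step 102: Adjust the amplitude of the extracted feature speech signal according to a preset rule.
This step adjusts the amplitude of the extracted feature speech signal according to a preset rule so that it falls within an optimal amplitude range; the optimal amplitude range is chosen according to the application scenario and the amplitude distribution of the speech signal currently being processed.
Step 103: Reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain a processed speech signal.
Compared with the unprocessed speech signal, at least one of the feature speech signals contained in the processed signal has had its amplitude adjusted, so the quality of the reconstructed speech signal is better than before processing. This processing requires neither a larger speaker nor more input electrical power, and therefore does not cause speaker power overload, broken sound, or speaker damage.
In this embodiment, the extracted feature speech signal may be a voice pitch signal, a voice unvoiced signal, or both; which feature signals to extract can be chosen according to the situation. For example, when the to-be-processed speech signal contains few voice unvoiced signals, or their amplitudes are all very low, while it contains comparatively many voice pitch signals, only the voice pitch signal need be extracted and processed as above, which still improves voice quality to some extent. Conversely, when the voice pitch signal makes up a small proportion and the voice unvoiced signal makes up a much larger proportion, only the voice unvoiced signal need be extracted and processed as above, which also improves voice quality to some extent. When the proportions of the two are similar, both can be extracted and processed as above. The basis for choosing which feature signals to extract is not limited to these cases; they are given only as an illustration.
In this embodiment, when the extracted feature speech signal includes a voice pitch signal, the preset rule used includes:
when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, it is adjusted to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, it is adjusted to be less than or equal to the maximum pitch signal amplitude threshold.
In this embodiment, when the extracted feature speech signal includes a voice unvoiced signal, the preset rule used includes:
when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, it is adjusted to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, it is adjusted to be less than or equal to the maximum unvoiced signal amplitude threshold.
In this embodiment, to guarantee voice quality and prevent distortion caused by adjusting the amplitude of the feature speech signal, after the amplitude is adjusted and before the speech signal is reconstructed from the adjusted feature speech signal, the method further includes: determining whether the consistency between the adjusted feature speech signal and the previously extracted original feature speech signal meets a preset requirement; if not, the amplitude of the feature speech signal to be processed is readjusted.
In this embodiment, after step 103, in which the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal are reconstructed to obtain the processed speech signal, the following step may also be included in order to preserve and improve the saturation of the speech signal:
performing extension processing on the processed speech signal according to the amplitude of the adjusted feature speech signal. For example, the frequency distribution of the original speech signal ranges from 200 Hz to 3400 Hz; after the extension processing, the frequency distribution of the processed speech signal may range from 50 Hz to 5000 Hz, which improves the saturation of the speech signal.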
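Embodiment 2:
Referring to FIG. 2, the apparatus for improving voice quality provided by this embodiment includes:
a voice extraction module 201, configured to extract a feature speech signal from the to-be-processed speech signal; which feature speech signal it extracts can be chosen according to the input speech signal and the application scenario, as long as the feature speech signal is reasonably representative and meets the requirements of the subsequent speech reconstruction;
a voice processing module 202, configured to adjust the amplitude of the extracted feature speech signal according to a preset rule; the purpose of the adjustment is to bring the amplitude of the feature speech signal within an optimal amplitude range, which is chosen according to the application scenario and the amplitude distribution of the speech signal currently being processed;
a voice reconstruction module 203, configured to reconstruct the adjusted feature speech signal together with the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal.
In this embodiment, the extracted feature speech signal may be a voice pitch signal, a voice unvoiced signal, or both; which feature signals to extract can be chosen according to the situation. For example, when the to-be-processed speech signal contains few voice unvoiced signals, or their amplitudes are all very low, while it contains comparatively many voice pitch signals, only the voice pitch signal need be extracted and processed as above, which still improves voice quality to some extent. Conversely, when the voice pitch signal makes up a small proportion and the voice unvoiced signal makes up a much larger proportion, only the voice unvoiced signal need be extracted and processed as above, which also improves voice quality to some extent. When the proportions of the two are similar, both can be extracted and processed as above. The basis for choosing which feature signals to extract is not limited to these cases; they are given only as an illustration.
In this embodiment, when the extracted feature speech signal includes a voice pitch signal, the preset rule used includes:
when the amplitude of the voice pitch signal is less than the minimum pitch signal amplitude threshold, it is adjusted to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the voice pitch signal is greater than the maximum pitch signal amplitude threshold, it is adjusted to be less than or equal to the maximum pitch signal amplitude threshold.
In this embodiment, when the extracted feature speech signal includes a voice unvoiced signal, the preset rule used includes:
when the amplitude of the voice unvoiced signal is less than the minimum unvoiced signal amplitude threshold, it is adjusted to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the voice unvoiced signal is greater than the maximum unvoiced signal amplitude threshold, it is adjusted to be less than or equal to the maximum unvoiced signal amplitude threshold.
In this embodiment, to guarantee voice quality and prevent distortion caused by adjusting the amplitude of the feature speech signal, as shown in FIG. 3, the apparatus may further include a determining module 204, configured to determine, after the voice processing module adjusts the amplitude of the feature speech signal and before the voice reconstruction module reconstructs based on the adjusted feature speech signal, whether the consistency between the adjusted feature speech signal and the original feature speech signal previously extracted by the voice extraction module meets a preset requirement; if not, it notifies the voice processing module 202 to readjust the amplitude of the feature speech signal to be processed.
In this embodiment, to preserve and improve the saturation of the speech signal, as shown in FIG. 4, the apparatus may further include a voice extension module 205, configured to perform extension processing on the processed speech signal, according to the amplitude of the feature speech signal as adjusted by the voice processing module 202, after the voice reconstruction module 203 has reconstructed the adjusted feature speech signal and the other speech signals included in the to-be-processed speech signal to obtain the processed speech signal. For example, the frequency distribution of the original speech signal ranges from 200 Hz to 3400 Hz; after the extension processing, the frequency distribution of the processed speech signal may range from 50 Hz to 5000 Hz, which improves the saturation of the speech signal.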
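Embodiment 3:
For a better understanding of the embodiments of the invention, a specific application scenario is described below.
This embodiment takes a mobile phone as an example: the phone's pulse code modulation (PCM) data module obtains the phone's downlink voice signal in PCM data format from the phone's standard PCM interface and uses it as the to-be-processed speech signal. In this embodiment, the extracted feature speech signals are a voice unvoiced signal and a voice pitch signal. Note that when both are extracted, their amplitudes may be adjusted simultaneously, or the amplitude of the voice pitch signal may be adjusted first and then that of the voice unvoiced signal, or the amplitude of the voice unvoiced signal may be adjusted first and then that of the voice pitch signal. During reconstruction, the adjusted voice pitch signal and voice unvoiced signal may first be synthesized and then reconstructed together with the other speech signals in the original speech signal.
Referring to FIG. 5, the process includes:
Step 501: Acquire a voice signal in PCM data format as the to-be-processed speech signal.
Step 502: Obtain the spectral features of the to-be-processed speech signal.
Step 503: Extract the voice pitch signal and the voice unvoiced signal from the speech signal spectrum obtained in step 502.
Step 504: Adjust the amplitude of the extracted voice pitch signal according to the set rule; the adjustment value may be determined from empirical values.
Step 505: Determine whether the consistency between the adjusted voice pitch signal and the originally extracted voice pitch signal meets the requirement; if so, go to step 508; otherwise, return to step 504.
Step 506: Adjust the amplitude of the extracted voice unvoiced signal according to the set rule; the adjustment value may also be determined from empirical values.
Step 507: Determine whether the consistency between the adjusted voice unvoiced signal and the originally extracted voice unvoiced signal meets the requirement; if so, go to step 508; otherwise, return to step 506.
Step 508: When the consistency of both the adjusted voice pitch signal and the adjusted voice unvoiced signal meets the requirement, synthesize the adjusted voice pitch signal and the adjusted voice unvoiced signal.
Step 509: Perform reconstruction and extension processing based on the synthesized voice pitch and voice unvoiced signals and on the speech signals in the original speech signal other than the voice pitch signal and the voice unvoiced signal.
Step 510: Output the resulting voice signal in PCM data format.
As can be seen, the embodiments of the invention extract a feature speech signal from the to-be-processed speech signal and adjust its amplitude according to a preset rule so that it falls within a preset amplitude range; it is then reconstructed, and optionally extended, together with the other speech signals in the original to-be-processed speech signal to obtain a speech signal with better voice quality.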
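A person of ordinary skill in the art will understand that all or some of the steps of the above method can be carried out by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments can also be implemented with one or more integrated circuits. Accordingly, each module/unit in the above embodiments can be implemented in hardware or as a software functional module. The invention is not limited to any particular combination of hardware and software.
The above is a detailed description of the invention in connection with specific embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. For a person of ordinary skill in the art to which the invention belongs, simple deductions or substitutions made without departing from the concept of the invention shall all be regarded as falling within the scope of protection of the invention.
Industrial Applicability
The method and apparatus for improving voice quality provided by the embodiments of the invention can improve the quality of a speech signal without enlarging the speaker or increasing the electrical power fed to it, avoid the problems caused by a larger speaker and higher input power, and give users a better experience.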

Claims

权 利 要 求 书
1、 一种提高语音质量的方法, 包括:
从待处理语音信号中提取特征语音信号;
对提取出的特征语音信号的幅值按照预设规则进行调整; 以及
将调整后的特征语音信号和所述待处理语音信号包括的其他语音信号进 行重建得到处理后的语音信号。
2、 如权利要求 1所述的提高语音质量的方法, 其中, 所述特征语音信号 包括语音基音信号和 /或语音清音信号。
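3. The method for improving voice quality according to claim 2, wherein, when the characteristic speech signal comprises a speech pitch signal, the preset rule comprises:
when the amplitude of the speech pitch signal is less than a minimum pitch signal amplitude threshold, adjusting the speech pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the speech pitch signal is greater than a maximum pitch signal amplitude threshold, adjusting the speech pitch signal to be less than or equal to the maximum pitch signal amplitude threshold; and
when the characteristic speech signal comprises a speech unvoiced signal, the preset rule comprises:
when the amplitude of the speech unvoiced signal is less than a minimum unvoiced signal amplitude threshold, adjusting the speech unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the speech unvoiced signal is greater than a maximum unvoiced signal amplitude threshold, adjusting the speech unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.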
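4. The method for improving voice quality according to any one of claims 1 to 3, further comprising, after the amplitude of the characteristic speech signal is adjusted and before reconstruction is performed based on the adjusted characteristic speech signal: determining whether the consistency between the adjusted characteristic speech signal and the previously extracted original characteristic speech signal meets a preset requirement; and if not, readjusting the amplitude of the characteristic speech signal.
5. The method for improving voice quality according to any one of claims 1 to 3, further comprising, after the adjusted characteristic speech signal and the other speech signals contained in the speech signal to be processed are reconstructed to obtain the processed speech signal:
performing extension processing on the processed speech signal according to the amplitude of the adjusted characteristic speech signal.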
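6. An apparatus for improving voice quality, comprising:
a speech extraction module configured to extract a characteristic speech signal from a speech signal to be processed;
a speech processing module configured to adjust the amplitude of the extracted characteristic speech signal according to a preset rule; and
a speech reconstruction module configured to reconstruct the adjusted characteristic speech signal together with the other speech signals contained in the speech signal to be processed, to obtain a processed speech signal.
7. The apparatus for improving voice quality according to claim 6, wherein the characteristic speech signal comprises a speech pitch signal and/or a speech unvoiced signal.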
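8. The apparatus for improving voice quality according to claim 7, wherein,
when the characteristic speech signal comprises a speech pitch signal, the preset rule comprises:
when the amplitude of the speech pitch signal is less than a minimum pitch signal amplitude threshold, adjusting the speech pitch signal to be equal to or greater than the minimum pitch signal amplitude threshold; when the amplitude of the speech pitch signal is greater than a maximum pitch signal amplitude threshold, adjusting the speech pitch signal to be less than or equal to the maximum pitch signal amplitude threshold; and
when the characteristic speech signal comprises a speech unvoiced signal, the preset rule comprises:
when the amplitude of the speech unvoiced signal is less than a minimum unvoiced signal amplitude threshold, adjusting the speech unvoiced signal to be equal to or greater than the minimum unvoiced signal amplitude threshold; when the amplitude of the speech unvoiced signal is greater than a maximum unvoiced signal amplitude threshold, adjusting the speech unvoiced signal to be less than or equal to the maximum unvoiced signal amplitude threshold.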
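9. The apparatus for improving voice quality according to any one of claims 6 to 8, further comprising a judgment module configured to determine, after the speech processing module has adjusted the amplitude of the characteristic speech signal and before the speech reconstruction module performs reconstruction based on the adjusted characteristic speech signal, whether the consistency between the adjusted characteristic speech signal and the original characteristic speech signal previously extracted by the speech extraction module meets a preset requirement, and, if not, to notify the speech processing module to readjust the amplitude of the characteristic speech signal.
10. The apparatus for improving voice quality according to any one of claims 6 to 8, further comprising a speech extension module configured to perform extension processing on the processed speech signal according to the amplitude of the characteristic speech signal as adjusted by the speech processing module, after the speech reconstruction module has reconstructed the adjusted characteristic speech signal together with the other speech signals contained in the speech signal to be processed to obtain the processed speech signal.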
PCT/CN2014/071868 2013-10-23 2014-02-07 Method and apparatus for improving voice quality WO2014161388A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310503510.6A CN104575515A (zh) 2013-10-23 2013-10-23 Method and apparatus for improving voice quality
CN201310503510.6 2013-10-23

Publications (1)

Publication Number Publication Date
WO2014161388A1 true WO2014161388A1 (zh) 2014-10-09

Family

ID=51657554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/071868 WO2014161388A1 (zh) 2013-10-23 2014-02-07 Method and apparatus for improving voice quality

Country Status (2)

Country Link
CN (1) CN104575515A (zh)
WO (1) WO2014161388A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
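CN105225674B (zh) * 2015-09-25 2019-02-15 维沃移动通信(杭州)有限公司 Voice signal processing method, apparatus, and mobile terminal
CN105355197B (zh) * 2015-10-30 2020-01-07 百度在线网络技术(北京)有限公司 Gain processing method and apparatus for a speech recognition system
CN106205629A (zh) * 2016-07-04 2016-12-07 广东小天才科技有限公司 Sound production method and apparatus
CN106340306A (zh) * 2016-11-04 2017-01-18 厦门盈趣科技股份有限公司 Method and apparatus for improving speech recognizability
CN108922558B (zh) * 2018-08-20 2020-11-27 广东小天才科技有限公司 Voice processing method, voice processing apparatus, and mobile terminal
CN109671448B (zh) * 2018-12-29 2021-05-18 联想(北京)有限公司 Data processing method and apparatus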

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
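JPH07146700A (ja) * 1993-11-24 1995-06-06 Hitachi Ltd Pitch emphasis method and apparatus, and hearing compensation apparatus
CN101658052A (zh) * 2007-03-21 2010-02-24 弗劳恩霍夫应用研究促进协会 Method and apparatus for enhancement of audio reconstruction
CN102436807A (zh) * 2011-09-14 2012-05-02 苏州思必驰信息科技有限公司 Method and system for automatically generating speech with stressed syllables
CN102779527A (zh) * 2012-08-07 2012-11-14 无锡成电科大科技发展有限公司 Speech enhancement method based on window-function formant enhancement
CN102780948A (zh) * 2011-05-11 2012-11-14 富士通株式会社 Wind noise suppressor, semiconductor integrated circuit, and wind noise suppression method
CN103236263A (zh) * 2013-03-27 2013-08-07 东莞宇龙通信科技有限公司 Method, system, and mobile terminal for improving call quality
CN103262577A (zh) * 2010-12-08 2013-08-21 唯听助听器公司 Hearing aid and method of enhancing speech reproduction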

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2056110C (en) * 1991-03-27 1997-02-04 Arnold I. Klayman Public address intelligibility system
US5870705A (en) * 1994-10-21 1999-02-09 Microsoft Corporation Method of setting input levels in a voice recognition system
EP0994464A1 (fr) * 1998-10-13 2000-04-19 Koninklijke Philips Electronics N.V. Method for generating a wideband signal from a narrowband signal, apparatus for carrying out such a method, and telephone equipment comprising such an apparatus
CN101699837B (zh) * 2009-10-30 2012-04-25 华为终端有限公司 Method, apparatus, and communication terminal for adjusting telephone voice output gain

Also Published As

Publication number Publication date
CN104575515A (zh) 2015-04-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14778589

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14778589

Country of ref document: EP

Kind code of ref document: A1