CN106157978B - Speech signal processing apparatus and speech signal processing method

Info

Publication number: CN106157978B
Application number: CN201510177492.6A
Authority: CN
Inventors: 杜博仁; 张嘉仁; 曾凯盟
Original assignee: Acer Inc
Current assignee: Acer Inc
Priority date: 2015-04-15
Filing date: 2015-04-15
Publication date: 2020-04-07
Anticipated expiration: 2035-04-15
Also published as: CN106157978A

Abstract

The invention provides a voice signal processing device and a voice signal processing method. The value of an interpolation parameter function corresponding to a sampled signal window is calculated according to three consecutive sampling values in that window, and the interpolation value between two adjacent sampling points in the corresponding down-converted signal window is calculated according to the value of the interpolation parameter function. The invention can effectively avoid signal distortion in the down-converted voice signal.

Description

Speech signal processing apparatus and speech signal processing method

Technical Field

The present invention relates to a signal processing apparatus, and more particularly, to a speech signal processing apparatus and a speech signal processing method.

Background

Generally, hearing-impaired people often cannot clearly perceive higher-frequency voice signals, such as consonants, but can clearly hear low-frequency signals. Down-converting the voice signal makes it audible to them, but the down-converted signal is longer in time, so the signal values between consecutive sampling points must be obtained by interpolation. For example, when an audio signal is down-converted to a low-frequency signal having only half the original frequency, its time length is doubled, and a new signal value between every two adjacent sampling points must be obtained by interpolation. Since the characteristics of a sound signal are relatively close to those of a sinusoidal wave, the down-converted signal is often distorted if the interpolated values are obtained by a simple arithmetic mean.

Disclosure of Invention

The invention provides a voice signal processing device and a voice signal processing method, which can effectively avoid the situation that the voice signal after frequency reduction has signal distortion.

The voice signal processing device comprises a processing unit that receives a sampled voice signal comprising a sequence of sampled signal windows, calculates the value of an interpolation parameter function corresponding to each sampled signal window according to three consecutive sampling values in each sampled signal window, down-converts the sampled voice signal to generate a down-converted signal comprising a sequence of down-converted signal windows, and calculates the interpolation value between two adjacent sampling points in each down-converted signal window according to the value of the interpolation parameter function corresponding to each down-converted signal window.

In an embodiment of the invention, the voice signal processing apparatus further includes a sampling unit, coupled to the processing unit, for sampling the original voice signal to generate a sampled voice signal, and the processing unit further determines whether a value of the interpolation parameter function is smaller than an upper limit value and greater than or equal to a lower limit value, and corrects the value of the interpolation parameter function if the value of the interpolation parameter function is not smaller than the upper limit value or is not greater than or equal to the lower limit value.

In an embodiment of the present invention, if the value of the interpolation parameter function is greater than or equal to the upper limit value, the value of the interpolation parameter function is corrected to the upper limit value, and if the value of the interpolation parameter function is smaller than the lower limit value, the value of the interpolation parameter function is corrected to the lower limit value.

In an embodiment of the invention, the upper limit value and the lower limit value are associated with a frequency of an original speech signal and a sampling frequency of a sampling unit.

In an embodiment of the invention, the processing unit further calculates an interpolation parameter function corresponding to each sampling signal window according to a trigonometric function relationship between three consecutive sampling values in each sampling signal window.

In an embodiment of the invention, the interpolation parameter function is a trigonometric function.

The speech signal processing method of the present invention includes the following steps. An original speech signal is sampled to produce a sampled speech signal comprising a sequence of sampled signal windows. The value of the interpolation parameter function corresponding to each sampled signal window is calculated according to three consecutive sampling values in each sampled signal window. The sampled speech signal is down-converted to produce a down-converted signal comprising a sequence of down-converted signal windows. The interpolation value between two adjacent sampling points in each down-converted signal window is calculated according to the value of the interpolation parameter function corresponding to each down-converted signal window.

In an embodiment of the invention, the method further includes determining whether a value of the interpolation parameter function is smaller than an upper limit and greater than or equal to a lower limit, and modifying the value of the interpolation parameter function if the value of the interpolation parameter function is not smaller than the upper limit or is not greater than or equal to the lower limit.

In an embodiment of the present invention, the upper limit value and the lower limit value are associated with a frequency of the original speech signal and a sampling frequency of the sampling unit.

In an embodiment of the invention, the voice signal processing method includes calculating an interpolation parameter function corresponding to each sampling signal window according to a trigonometric function relationship between three consecutive sampling values in each sampling signal window.

Based on the above, the embodiment of the invention calculates the value of the interpolation parameter function corresponding to the sampling signal window according to the three consecutive sampling values in the sampling signal window, and calculates the interpolation value between two adjacent sampling points in the down-converted signal window according to the value of the interpolation parameter function, so as to obtain an accurate interpolation value, thereby effectively avoiding the situation of signal distortion of the down-converted voice signal.

In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.

Drawings

FIG. 1 is a schematic diagram of a speech signal processing apparatus according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a down converted signal according to an embodiment of the present invention;

FIG. 3 is a flow chart of a speech signal processing method according to an embodiment of the invention.

Description of reference numerals:

102: processing unit;

104: sampling unit;

S1: original speech signal;

S2: sampled speech signal;

S3: down-converted signal;

Wm, Wm+1: down-converted signal windows;

s(2n), s(2n+2), s(2n+4), s(2n+6), s(2n+8): sampling points;

s(2n+1), s(2n+3), s(2n+5), s(2n+7): interpolation points;

S302 to S312: steps.

Detailed Description

FIG. 1 is a schematic diagram of a speech signal processing apparatus according to an embodiment of the invention; please refer to FIG. 1. The speech signal processing apparatus includes a processing unit 102 and a sampling unit 104, and the processing unit 102 is coupled to the sampling unit 104. The processing unit 102 can be implemented by, for example, a central processing unit, and the sampling unit 104 can be implemented by, for example, a logic circuit, but they are not limited thereto. The sampling unit 104 may sample the original speech signal S1 to generate a sampled speech signal S2, wherein the sampled speech signal S2 includes a sequence of sampled signal windows. The processing unit 102 may calculate the value of an interpolation parameter function corresponding to each sampled signal window according to three consecutive sampling values in each sampled signal window. It may further down-convert the sampled speech signal S2 to generate a down-converted signal S3 including a sequence of down-converted signal windows, and calculate an interpolation value between two adjacent sampling points in each down-converted signal window according to the value of the interpolation parameter function corresponding to each down-converted signal window. The interpolation parameter function is a trigonometric function, such as a sine function or a cosine function, but is not limited thereto.

For example, FIG. 2 is a schematic diagram of a down-converted signal according to an embodiment of the invention; please refer to FIG. 2. In FIG. 2, the solid dots are sampling points of the sampling unit 104, and the hollow dots are interpolation points calculated by the processing unit 102. Assume that the sampling value at time point n in the m-th sampled signal window of the sampled speech signal S2 is denoted by a symbol [symbol not reproduced in this text], wherein m is a positive integer and n is 0 or a positive integer. In addition, in the present embodiment, the frequency of the down-converted signal S3 obtained by down-converting the sampled speech signal S2 is half of the frequency of the sampled speech signal S2. If the sampling value at time point n in the m-th down-converted signal window Wm of the down-converted signal S3 (which corresponds to the m-th sampled signal window of the sampled speech signal S2) is denoted by S_m(n), then the correspondence between the same sampling point before and after the down-conversion can be expressed as follows:

[formula (1) not reproduced in this text]

The processing unit 102 may calculate the interpolation parameter function corresponding to each sampled signal window according to three consecutive sampling values in each sampled signal window. For example, the interpolation parameter function C_m(g) corresponding to the m-th sampled signal window can be obtained from the trigonometric function relationship among three sampling values [symbols not reproduced in this text] sampled consecutively by the sampling unit 104 within that window. The interpolation parameter function corresponding to the time range of the sampled signal window is as follows:

[formula (2) not reproduced in this text]

wherein g is 0 or a positive integer, and C_m(g) is the function value of the interpolation parameter function at time point g. The interpolation parameter function C_m(g) is a trigonometric function.
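As an illustration of what a trigonometric relationship among three consecutive samples can look like, consider a pure sinusoid. The identity below is a standard result offered under that assumption; the symbols x, A, ω and φ are introduced here for illustration only, and the result is not necessarily the formula used in this embodiment.

```latex
% Assumption: the samples follow a pure sinusoid x(g) = A \sin(\omega g + \varphi).
\begin{aligned}
x(g-1) + x(g+1) &= 2\cos(\omega)\, x(g) \\
\cos\omega &= \frac{x(g-1) + x(g+1)}{2\, x(g)}, \qquad x(g) \neq 0 \\
\cos\frac{\omega}{2} &= \sqrt{\frac{1 + \cos\omega}{2}}, \qquad 0 \le \omega \le \pi
\end{aligned}
```

Under this assumption, three consecutive samples are enough to recover the cosine of the per-sample phase step, and the half-angle form gives the cosine of the phase step after the frequency is halved.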

Because the speech signal processing apparatus may generate noise during signal processing, the calculated value of the interpolation parameter function may contain noise components, which reduces the accuracy of the interpolation values obtained by the processing unit 102. The processing unit 102 may check whether the value of the interpolation parameter function is disturbed by noise by determining whether it falls within a predetermined range, for example, whether it is smaller than an upper limit value and greater than or equal to a lower limit value. If the value of the interpolation parameter function is not smaller than the upper limit value, or is not greater than or equal to the lower limit value, the value is regarded as disturbed by noise, and the processing unit 102 may correct it to remove the noise components it contains. For example, if the value of the interpolation parameter function is greater than or equal to the upper limit value, the processing unit 102 may correct it to the upper limit value; if the value is smaller than the lower limit value, the processing unit 102 may correct it to the lower limit value; and if the value is smaller than the upper limit value and greater than or equal to the lower limit value, no correction is needed. For example, in the embodiment of FIG. 2, the correction of the value of the interpolation parameter function C_m(g) can be expressed by the following equation:

$$
C_m(g) =
\begin{cases}
1, & C_m(g) \ge 1 \\
0.5, & C_m(g) < 0.5 \\
C_m(g), & 0.5 \le C_m(g) < 1
\end{cases}
\tag{3}
$$

That is, in the embodiment of FIG. 2 the upper limit value and the lower limit value are 1 and 0.5, respectively. If noise during signal processing causes the value of the interpolation parameter function C_m(g) to be greater than or equal to 1, the processing unit 102 corrects the value of C_m(g) to 1; if the value of C_m(g) is less than 0.5, the processing unit 102 corrects it to 0.5. Note that the upper limit value and the lower limit value in formula (3) are only an exemplary embodiment, and the invention is not limited thereto. The upper and lower limit values may be adjusted according to the actual noise interference, for example, according to the frequency of the original speech signal and the sampling frequency of the sampling unit.
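The correction rule described above can be sketched in a few lines of Python. The function name `correct_parameter` and its argument names are hypothetical; the default limits 0.5 and 1 are the example values given for the embodiment of FIG. 2.

```python
def correct_parameter(value, lower=0.5, upper=1.0):
    """Correct an interpolation parameter value that falls outside [lower, upper).

    Values greater than or equal to the upper limit become the upper limit,
    values below the lower limit become the lower limit, and values already
    inside the range are returned unchanged.
    """
    if value >= upper:
        return upper
    if value < lower:
        return lower
    return value
```

Other limits can be passed in when the noise conditions, the signal frequency, or the sampling frequency differ from the example.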

After obtaining the value of the interpolation parameter function, the processing unit 102 may calculate the interpolation value between two adjacent sampling points in the down-converted signal window according to the interpolation parameter function. Taking the embodiment of FIG. 2 as an example, the interpolation point s(2n+1) between the sampling points s(2n) and s(2n+2) of the sampling unit 104, and the interpolation point s(2n+3) between the sampling points s(2n+2) and s(2n+4), in the down-converted signal window Wm can be respectively expressed as follows:

[formulas (4) and (5) not reproduced in this text]

In formulas (4) and (5), n is 0 or a positive even number. The interpolation values between the sampling points in other down-converted signal windows can be obtained in the same manner. For example, the interpolation point s(2n+5) between the sampling points s(2n+4) and s(2n+6), and the interpolation point s(2n+7) between the sampling points s(2n+6) and s(2n+8), in the down-converted signal window Wm+1 of FIG. 2 can be obtained in the same way. Those skilled in the art should be able to derive the implementation from the above embodiment, so the description is not repeated here.
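To make the idea concrete, the following Python sketch compares a trigonometric midpoint estimate with the arithmetic mean on a pure sinusoid. It rests on an assumption that is not taken from the text: the parameter is taken to be cos(ω/2), recovered from three consecutive samples by a standard sinusoid identity, and the midpoint is the sum of the two neighbours divided by twice that parameter. All function names are hypothetical.

```python
import math


def half_step_cosine(x_prev, x_mid, x_next):
    """Estimate cos(omega / 2) from three consecutive samples of a sinusoid.

    Assumed relationship: x_prev + x_next = 2 * cos(omega) * x_mid,
    followed by the half-angle identity. Requires x_mid != 0.
    """
    cos_omega = (x_prev + x_next) / (2.0 * x_mid)
    return math.sqrt((1.0 + cos_omega) / 2.0)


def trig_midpoint(left, right, c):
    """Midpoint between two neighbouring samples under the sinusoid assumption."""
    return (left + right) / (2.0 * c)


def mean_midpoint(left, right):
    """Plain arithmetic mean, the approach the background describes as distorting."""
    return (left + right) / 2.0


# Three samples of sin(omega * n + phi); omega and phi are arbitrary test values.
omega, phi = 0.6, 0.3
x = [math.sin(omega * n + phi) for n in range(3)]

c = half_step_cosine(x[0], x[1], x[2])
true_mid = math.sin(omega * 1.5 + phi)  # exact value halfway between x[1] and x[2]

trig_err = abs(trig_midpoint(x[1], x[2], c) - true_mid)
mean_err = abs(mean_midpoint(x[1], x[2]) - true_mid)
print(trig_err, mean_err)
```

For a noiseless sinusoid the trigonometric estimate matches the true midpoint up to rounding error, while the arithmetic mean underestimates it by a factor of cos(ω/2).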

As described above, the present embodiment uses a trigonometric function to estimate the interpolation values between the sampling points; that is, the interpolation value between two adjacent sampling points in the down-converted signal window is calculated according to the interpolation parameter function. Because the characteristics of a trigonometric function are similar to those of a sound signal, the calculation method of the present embodiment can obtain more accurate interpolation values than the prior art, which simply uses the arithmetic mean, and can effectively avoid signal distortion in the down-converted speech signal.

FIG. 3 is a flowchart illustrating a speech signal processing method according to an embodiment of the invention; please refer to FIG. 3. As can be seen from the above embodiments, the speech signal processing method of the speech signal processing apparatus may include the following steps. First, an original speech signal is sampled to generate a sampled speech signal including a sequence of sampled signal windows (step S302). Then, the value of the interpolation parameter function corresponding to each sampled signal window is calculated according to three consecutive sampling values in each sampled signal window (step S304), wherein the interpolation parameter function can be calculated according to the trigonometric function relationship among the three consecutive sampling values, and the interpolation parameter function can be a trigonometric function. Next, it is determined whether the value of the interpolation parameter function is smaller than the upper limit value and greater than or equal to the lower limit value (step S306). If the value is not smaller than the upper limit value, or is not greater than or equal to the lower limit value, the value of the interpolation parameter function is corrected (step S308) to remove unwanted noise. The upper limit value and the lower limit value may be adjusted according to the actual noise interference, for example, according to the frequency of the original speech signal and the sampling frequency of the sampling unit. The value of the interpolation parameter function may be corrected, for example, to the upper limit value when it is greater than or equal to the upper limit value, and to the lower limit value when it is less than the lower limit value. After the value of the interpolation parameter function is corrected, the sampled speech signal is down-converted to generate a down-converted signal including a sequence of down-converted signal windows (step S310), and then the interpolation value between two adjacent sampling points in each down-converted signal window is calculated according to the value of the interpolation parameter function corresponding to each down-converted signal window (step S312). Conversely, if the value of the interpolation parameter function is smaller than the upper limit value and greater than or equal to the lower limit value, the process proceeds directly to step S310 to down-convert the sampled speech signal.
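The flow of steps S304 to S312 can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the patented formulas: the parameter is assumed to be cos(ω/2) derived from a sinusoid identity, windows are assumed to hold three samples and to advance by two samples, and a zero middle sample is assumed to fall back to the upper limit. The names `window_parameter`, `correct_parameter`, and `down_convert_half` are hypothetical.

```python
import math


def correct_parameter(value, lower=0.5, upper=1.0):
    """Steps S306/S308: keep the parameter inside [lower, upper)."""
    if value >= upper:
        return upper
    if value < lower:
        return lower
    return value


def window_parameter(a, b, c):
    """Step S304 (assumed form): cos(omega / 2) from three consecutive samples."""
    if b == 0.0:
        return 1.0  # assumption: fall back to the arithmetic mean
    cos_omega = (a + c) / (2.0 * b)
    return math.sqrt(max(0.0, (1.0 + cos_omega) / 2.0))


def down_convert_half(samples, lower=0.5, upper=1.0):
    """Steps S310/S312: halve the frequency by inserting one point per gap."""
    if len(samples) < 3 or len(samples) % 2 == 0:
        raise ValueError("expects an odd number of samples, at least 3")
    out = []
    # Each window covers three samples; consecutive windows share one sample.
    for i in range(0, len(samples) - 2, 2):
        a, b, c = samples[i:i + 3]
        p = correct_parameter(window_parameter(a, b, c), lower, upper)
        out += [a, (a + b) / (2.0 * p), b, (b + c) / (2.0 * p)]
    out.append(samples[-1])
    return out


# Usage: a sinusoid with phase step 0.6 becomes one with phase step 0.3.
samples = [math.sin(0.6 * n + 0.3) for n in range(5)]
converted = down_convert_half(samples)
```

Five input samples yield nine output points, with the original samples at even positions and the interpolated values at odd positions, matching the layout of sampling points and interpolation points in FIG. 2.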

In summary, the embodiments of the present invention use a trigonometric function to estimate the interpolation values between the sampling points; that is, the interpolation value between two adjacent sampling points in the down-converted signal window is calculated according to the interpolation parameter function. Because the characteristics of a trigonometric function are similar to those of an audio signal, more accurate interpolation values can be obtained than in the prior art, and signal distortion in the down-converted audio signal can be effectively avoided.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A speech signal processing apparatus, comprising:

a processing unit for receiving a sampled speech signal comprising a sequence of sampled signal windows, calculating a value of an interpolation parameter function corresponding to each of the sampled signal windows according to a trigonometric function relationship between three consecutive sampled values in each of the sampled signal windows, the interpolation parameter function being a trigonometric function, down-converting the sampled speech signal to generate a down-converted signal comprising a sequence of down-converted signal windows, calculating an interpolated value between two adjacent sampled points in each of the down-converted signal windows according to the value of the interpolation parameter function corresponding to each of the down-converted signal windows,

wherein the interpolation parameter function C_m(g) corresponding to the m-th sampled signal window is obtained according to the trigonometric function relationship among three sampling values [symbols not reproduced in this text] sampled consecutively in the m-th sampled signal window, the interpolation parameter function being as follows:

[formula not reproduced in this text]

wherein g is 0 or a positive integer, and C_m(g) is the function value of the interpolation parameter function at time point g.

2. The speech signal processing apparatus according to claim 1, further comprising:

a sampling unit, coupled to the processing unit, for sampling an original voice signal to generate the sampled voice signal, wherein the processing unit further determines whether the value of the interpolation parameter function is smaller than an upper limit value and greater than or equal to a lower limit value, and corrects the value of the interpolation parameter function if the value of the interpolation parameter function is not smaller than the upper limit value or is not greater than or equal to the lower limit value.

3. The speech signal processing apparatus according to claim 2, wherein the value of the interpolation parameter function is corrected to the upper limit value if the value of the interpolation parameter function is equal to or greater than the upper limit value, and the value of the interpolation parameter function is corrected to the lower limit value if the value of the interpolation parameter function is less than the lower limit value.

4. The speech signal processing apparatus according to claim 3, wherein the upper limit value and the lower limit value are associated with a frequency of the original speech signal and a sampling frequency of the sampling unit.

5. A speech signal processing method, comprising:

sampling an original speech signal to produce a sampled speech signal comprising a sequence of sampled signal windows;

calculating the value of an interpolation parameter function corresponding to each sampled signal window according to the trigonometric function relationship among three consecutive sampling values in each sampled signal window, wherein the interpolation parameter function is a trigonometric function, and the interpolation parameter function C_m(g) corresponding to the m-th sampled signal window is obtained according to the trigonometric function relationship among three sampling values [symbols not reproduced in this text] sampled consecutively in the m-th sampled signal window;

down-converting the sampled speech signal to produce a down-converted signal comprising a sequence of down-converted signal windows; and

calculating the interpolation value between two adjacent sampling points in each down-converted signal window according to the value of the interpolation parameter function corresponding to each down-converted signal window,

wherein the interpolation parameter function is represented by:

[formula not reproduced in this text]

6. The speech signal processing method according to claim 5, further comprising:

determining whether the value of the interpolation parameter function is smaller than an upper limit value and greater than or equal to a lower limit value, and correcting the value of the interpolation parameter function if the value of the interpolation parameter function is not smaller than the upper limit value or is not greater than or equal to the lower limit value.

7. The speech signal processing method according to claim 6, wherein the value of the interpolation parameter function is corrected to the upper limit if the value of the interpolation parameter function is equal to or greater than the upper limit, and the value of the interpolation parameter function is corrected to the lower limit if the value of the interpolation parameter function is less than the lower limit.

8. The speech signal processing method according to claim 7, wherein the upper limit value and the lower limit value are associated with a frequency of the original speech signal and a sampling frequency of a sampling unit.