CN1215875A

CN1215875A - Pitch raising and lowering device and method for digital audio signal

Info

Publication number: CN1215875A
Application number: CN 97119196
Authority: CN
Inventors: 陈文源
Original assignee: Winbond Electronics Corp
Current assignee: Winbond Electronics Corp
Priority date: 1997-10-28
Filing date: 1997-10-28
Publication date: 1999-05-05
Anticipated expiration: 2017-10-28
Also published as: CN1091917C

Abstract

A pitch-shifting device for a digital audio signal receives a digital audio signal and converts its frequency. The device mainly comprises an input device for receiving the digital audio signal; a pitch processing device that selects samples of a specific length from the digital audio signal, performs pitch shifting on them, and obtains a frequency-converted audio frame; and an audio signal connecting device that connects the frequency-converted audio frame to the frequency-converted audio signal to obtain a new frequency-converted audio signal. The audio signal connecting device comprises a search-region potential comparing device, a cross-region potential comparing device, a bit processing device, and a connecting device.

Description

Pitch raising and lowering device and method for digital audio signal

The present invention relates to a pitch-shifting (pitch raising and lowering) device and method for digital audio signals, and in particular to a method for fast searching of the connecting point between two audio frames in a pitch-shifting device based on the unlimited audio frame division method.

In pitch shifting of a digital audio signal, the simplest approach is to speed up playback (raising the pitch) or slow it down (lowering the pitch), just as one would speed up or slow down a turntable. This approach, however, changes the playing time of the original audio signal. How to preserve the original playing time while still shifting the pitch is therefore the primary problem in pitch-shift processing.

In his master's thesis "On Audio Processing For MPEG Decoding, Pitch-shifting and Sub-band Coding", Mr. Chen Siping (陈思平) proposed an unlimited audio frame division method, which is carried out roughly as follows:

1. First, take an audio frame of N samples from the original audio signal;

2. Perform pitch shifting on the samples in this audio frame;

3. Assuming the playback length of the frame after frequency conversion becomes mN (m > 1 for lowering the pitch; m < 1 for raising it), take as the next audio frame the N samples starting at the mN-th sample point of the original audio signal;

4. Perform the pitch shifting of step 2 on the samples of the new audio frame;

5. Find the optimum connecting point and connect the two audio frames;

6. Assuming the connected frames contain 2mN-X samples, where X is the number of samples discarded by the connection, take as the following audio frame the N samples starting at the (2mN-X)-th sample point of the original audio signal; and

7. Repeat steps 4 to 6 until the pitch-shift processing is finished.
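The bookkeeping in steps 3 and 6 can be sketched as follows. This is a minimal illustration only: the function name `frame_starts` and the list `discarded` (the X value of each connection) are hypothetical, and rounding mN to an integer is an assumption the text does not address.

```python
def frame_starts(n, m, discarded):
    """Return the start index of each frame in the original signal.

    n         -- samples per original frame (N in the text)
    m         -- length ratio after frequency conversion (mN samples per frame)
    discarded -- X for each successive connection (hypothetical input)
    """
    converted_len = round(m * n)      # assumption: mN is rounded to an integer
    starts = [0]                      # the first frame starts at sample 0
    out_len = converted_len           # output length after the first frame
    for x in discarded:
        starts.append(out_len)        # next frame starts where the output ends
        out_len += converted_len - x  # a connection adds mN samples minus X
    starts.append(out_len)            # start of the frame after the last connection
    return starts


print(frame_starts(160, 1.25, [30, 10]))
```

With N = 160 and m = 1.25, the second frame starts at sample 200 (mN), and after a connection that discards X = 30 samples the third frame starts at sample 370 (2mN - X).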

In the pitch-shift processing of this unlimited audio frame division method, the connecting-point search and connection of step 5 compare the tail data of the previous audio frame (the search region, below) numerically with the head data of the next audio frame (the cross region, below), using the mean absolute error (MAE) as the criterion for the optimum connecting point, so as to find the most similar way to join them. The reference equation is: $MAE(i) = \sum_{j=0}^{M-1} | C(j) - S(i+j) |, \quad i = 0, \ldots, N-M$

Here C denotes the cross-region samples, M in number, and S denotes the search-region samples, N in number, with N > M. The optimum connecting point is the position i that minimizes MAE, that is, the position at which the cross-region samples are most similar to the search-region samples.
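A direct, unoptimized reading of the MAE search might look like the sketch below. The function name `mae_search` and the choice to return the first minimum on ties are assumptions for illustration; the text only requires the position with the smallest MAE.

```python
def mae_search(cross, search):
    """Find the offset i in `search` where `cross` matches best.

    cross  -- cross-region samples C (length M)
    search -- search-region samples S (length N, with N > M)
    Returns (best_i, best_error); ties resolve to the smallest i (an assumption).
    """
    m, n = len(cross), len(search)
    best_i, best_err = 0, None
    for i in range(n - m + 1):                  # i = 0 .. N-M
        err = sum(abs(cross[j] - search[i + j]) for j in range(m))
        if best_err is None or err < best_err:
            best_i, best_err = i, err
    return best_i, best_err


print(mae_search([3, -2, -4], [0, 1, 5, 3, -2, -4, 0, 2]))
```

Every candidate offset costs M subtractions and M-1 additions, which is the expense the invention sets out to avoid.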

The connection itself is carried out according to the following formula: $P(i+j) = \frac{j}{M} C(j) + \frac{M-j}{M} S(i+j), \quad j = 0, \ldots, M-1$

Here i is the position of the optimum connecting point and P denotes the connected samples; the samples after P are those of the next audio frame.
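The connection formula is a linear cross-fade from the search-region samples to the cross-region samples. The sketch below applies it literally; the name `crossfade` and the decision to return only the samples up to the end of P (leaving the caller to append the rest of the next frame) are assumptions.

```python
def crossfade(search, cross, i):
    """Blend `cross` into `search` starting at connecting point i.

    Implements P(i+j) = (j/M)*C(j) + ((M-j)/M)*S(i+j) for j = 0 .. M-1.
    Returns the search samples before i followed by the blended samples P.
    """
    m = len(cross)
    blended = [
        (j / m) * cross[j] + ((m - j) / m) * search[i + j]
        for j in range(m)
    ]
    return list(search[:i]) + blended


print(crossfade([4, 4, 4, 4, 4, 4], [0, 0, 0, 0], 2))
```

At j = 0 the output equals the search-region sample, and the weight of the cross-region sample grows linearly to (M-1)/M at j = M-1.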

Please refer to FIG. 1, a schematic diagram of pitch-shift processing of a digital audio signal by the unlimited audio frame division method in the pitch-lowering case.

In this figure, the original audio signal S0 is assumed to be a digital audio signal composed of a plurality of consecutive digital samples. First, an audio frame D1 of time length L1 is selected from the original audio signal S0, namely the samples marked 0 to L1-1 in the figure. The frame D1 is then pitch-lowered (for example by changing its playback speed), which lengthens its playback time to L2 and yields a frequency-lowered audio signal D1′ of time length L2.

Next, starting at a distance L2 (the time length of the frame after frequency lowering) after the start of the previous frame D1 (the time marked 0) in the original audio signal S0, an audio frame D2 of time length L1 is selected, namely the samples marked L1 to L1+L2-1 in the figure. The frame D2 is pitch-lowered in the same way to obtain a frequency-lowered audio frame D2′ of time length L2.

The frequency-lowered audio signal D1′ and the frequency-lowered audio frame D2′ are then connected. First, some samples at the tail of D1′, together with some samples of the original audio signal S0 adjacent to that tail, are selected as the search region Sa, and some samples at the head of D2′ are selected as the cross region Ca. The samples of the cross region Ca are then compared with those of the search region Sa, and the optimum connecting point K1 between D1′ and D2′ is obtained from the equation above, completing the connection and yielding a frequency-lowered audio signal S0′.

These steps are then repeated until the whole pitch-lowering process is finished.

FIG. 2 is a schematic diagram of pitch-shift processing of a digital audio signal by the unlimited audio frame division method in the pitch-raising case.

In this figure, the original audio signal S1 is assumed to be a digital audio signal composed of a plurality of consecutive digital samples. First, an audio frame D3 of time length L3 is selected from the original audio signal S1, namely the samples marked 0 to L3-1 in the figure. The frame D3 is then pitch-raised (for example by speeding up its playback), which shortens its playback time to L4 and yields a frequency-raised audio signal D3′ of time length L4.

Next, starting at a distance L4 (the time length of the frame after frequency raising) after the start of the previous frame D3 (the time marked 0) in the original audio signal S1, an audio frame D4 of time length L3 is selected, namely the samples marked L3 to L3+L4-1 in the figure. The frame D4 is pitch-raised in the same way to obtain a frequency-raised audio frame D4′ of length L4.

The frequency-raised audio signal D3′ and the frequency-raised audio frame D4′ are then connected.

First, some samples at the tail of the frequency-raised audio signal D3′, together with some samples of the original audio signal S1 adjacent to that tail, are selected as the search region Sb, and some samples at the head of the frequency-raised audio frame D4′ are selected as the cross region Cb. The samples of the cross region Cb are then compared with those of the search region Sb, and the optimum connecting point K2 between D3′ and D4′ is obtained from the equation above, completing the connection and yielding a frequency-raised audio signal S1′.

These steps are then repeated until the pitch-raising process is finished.

However, with N = 160, M = 80, and a playback rate of 16 kHz, the pitch-shift processing of this unlimited audio frame division method must perform (80+79)*80 = 12720 additions and subtractions every 10 milliseconds, that is, at least 1,272,000 additions and subtractions per second. This computational cost is far too high. To apply the unlimited audio frame division method in consumer products, a simple and effective connecting-point search method is needed, one that can be implemented with simple hardware circuitry.
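The operation count quoted above can be reproduced with a few lines of arithmetic. The breakdown used here (M subtractions plus M-1 additions per candidate position, 80 candidate positions per 10 ms frame) is an interpretation of the expression (80+79)*80, and the function name `mae_ops` is hypothetical.

```python
def mae_ops(m, positions, frame_ms):
    """Count additions/subtractions for an MAE search.

    m         -- cross-region length M
    positions -- number of candidate connecting points evaluated per frame
    frame_ms  -- how often the search runs, in milliseconds
    Returns (operations per frame, operations per second).
    """
    ops_per_position = m + (m - 1)        # M subtractions, M-1 additions
    ops_per_frame = ops_per_position * positions
    frames_per_second = 1000 // frame_ms
    return ops_per_frame, ops_per_frame * frames_per_second


print(mae_ops(80, 80, 10))
```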

In view of this, the main object of the present invention is to provide a pitch-shifting device and method for digital audio signals that uses bit operations to simplify the search for the connecting point between two audio frames, thereby greatly reducing the computational cost and allowing implementation with simple hardware circuitry.

To achieve the above and other objects, the present invention proposes a pitch-shifting method for a digital audio signal, which receives a digital audio signal and converts its frequency to obtain a frequency-converted audio signal. The steps are as follows. First, samples of a specific length R in the digital audio signal are selected as a first original audio frame and frequency-converted; the conversion changes the time length from L to L′, yielding a first audio frame of time length L′ that serves as the frequency-converted audio signal. Next, R samples starting at time L′ in the original digital audio signal are selected as a second original audio frame and frequency-converted, yielding a second frequency-converted audio frame of time length L′. The second frequency-converted audio frame is then connected to the frequency-converted audio signal to obtain an updated frequency-converted audio signal. Finally, the preceding two steps are repeated to complete the pitch shifting of the digital audio signal.

The connecting step includes the following. First, samples of length N are selected from the tail of the frequency-converted signal as a search region, and each sample in the search region is compared with a reference potential to obtain a search-region bit sequence representing the amplitude state of each sample in the search region. Next, samples of length M are selected from the head of the second frequency-converted audio frame as a cross region (usually N > M), and each sample in the cross region is compared with the reference potential to obtain a cross-region bit sequence representing the amplitude state of each sample in the cross region. Finally, the bit sequence of the cross region is compared bitwise with that of every sub-search region of length M in the search region to obtain the corresponding non-similarity, and the cross region is connected to the sub-search region having the minimum non-similarity to obtain the updated frequency-converted audio signal.

In addition, in the present invention the bitwise comparison between the cross region and each sub-search region of length M in the search region can be performed by an XOR logic circuit, and the corresponding non-similarity is the number of 1s in the output of the XOR logic circuit.
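In software, the comparator and XOR circuit can be imitated by packing each region into an integer and counting the 1 bits of the XOR. The sketch below is illustrative only: the names `to_bits` and `non_similarity` and the default reference `vref=0` are assumptions, and the patent describes a hardware circuit rather than this code.

```python
def to_bits(samples, vref=0):
    """Pack one bit per sample into an int: 1 if sample > vref, else 0.

    The first sample becomes the most significant bit.
    """
    bits = 0
    for s in samples:
        bits = (bits << 1) | (1 if s > vref else 0)
    return bits


def non_similarity(a_bits, b_bits):
    """Number of differing bits, i.e. the count of 1s in the XOR result."""
    return bin(a_bits ^ b_bits).count("1")


cross_bits = to_bits([3, -1, 2, -5])       # bit pattern 1010
window_bits = to_bits([-2, 4, 1, -3])      # bit pattern 0110
print(non_similarity(cross_bits, window_bits))
```

Both regions must have the same number of samples for the comparison to be meaningful, since leading zero bits are not stored in the packed integer.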

The present invention also provides a pitch-shifting device for a digital audio signal, which receives a digital audio signal and converts its frequency. It mainly includes an input device, a pitch processing device, and an audio signal connecting device. The input device receives the digital audio signal. The pitch processing device selects samples of a specific length from the digital audio signal, performs pitch shifting on them, and obtains a frequency-converted audio frame. The audio signal connecting device connects the frequency-converted audio frame to the frequency-converted audio signal to obtain a new frequency-converted audio signal. The audio signal connecting device may include: a search-region potential comparing device, which takes the N samples at the tail of the frequency-converted audio signal as search-region samples and compares them to obtain a bit sequence of the search-region samples; a cross-region potential comparing device, which takes the M samples at the head of the frequency-converted audio frame as cross-region samples and compares them to obtain a bit sequence of the cross-region samples; a bit processing device, which compares the bit sequence of the cross-region samples with the bit sequence of every sub-search region formed by M samples of the search region to obtain the corresponding non-similarity; and a connecting device, which selects the sub-search region with the minimum non-similarity and uses it to connect the frequency-converted audio frame to the frequency-converted audio signal.

To make the above and other objects, features, and advantages of the present invention clearer, a preferred embodiment is described in detail below with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram of pitch-shift processing of a digital audio signal by the unlimited audio frame division method in the pitch-lowering case;

FIG. 2 is a schematic diagram of pitch-shift processing of a digital audio signal by the unlimited audio frame division method in the pitch-raising case;

FIG. 3A and FIG. 3B are schematic diagrams of the search-region samples and the cross-region samples, respectively, in the pitch-shifting device and method of the present invention;

FIG. 4 is a circuit block diagram of the pitch-shifting device of the present invention and a schematic diagram of its processing method; and

FIG. 5 is a schematic diagram of the pitch-shift processing in the pitch-shifting method of the present invention.

As described above, the known pitch-shift processing of the unlimited audio frame division method uses the mean absolute error (the equation above) as the basis for the connecting-point search, and therefore requires considerable computational and hardware cost to find the connecting point.

Moreover, in pitch processing of digital audio signals, the time length of an audio frame is very short (usually between 20 and 30 milliseconds), so the data within a frame are statistically stationary. Furthermore, the waveforms formed by the samples of two adjacent frames are very similar in shape, amplitude, and magnitude, and differences in pitch information usually come from the up-and-down swings of the waveform. The connecting point can therefore be found by comparing only the up-and-down swings of the waveforms formed by the samples in the search region and the cross region.

Based on this property, the present invention provides a pitch-shifting device and method for digital audio signals that uses bitwise comparison. Because only the up-and-down swings of the search region and the cross region are compared, the computational and hardware cost of searching for the connecting point between audio frames is greatly reduced.

Please refer to FIG. 5, a schematic diagram of the pitch-shift processing in the pitch-shifting method of the present invention, which receives a digital audio signal and converts its frequency to obtain a frequency-converted audio signal.

First, samples of a specific length in the digital audio signal are selected as a first original audio frame and frequency-converted, yielding a first audio frame of time length L6 that serves as the frequency-converted audio signal.

In the figure, the original audio signal S2 is assumed to be a digital audio signal composed of a plurality of consecutive digital samples, as in FIG. 1 and FIG. 2. First, an audio frame of time length L5 (the samples marked 0 to L5-1 in the figure) is selected from the original audio signal S2 as the first original audio frame D5. The first original audio frame D5 is then pitch-shifted (for example by speeding up or slowing down its playback), which shortens or lengthens its playback time to L6 and yields a first audio frame D5′ of time length L6 that serves as the desired frequency-converted audio signal S2′.
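The text leaves the frequency conversion itself open, saying only that playback may be sped up or slowed down. One simple way to imitate that on a frame is to resample it to the new length. The sketch below uses linear interpolation, which is an assumption chosen for brevity and not a technique named in the patent; `resample_frame` is a hypothetical name.

```python
def resample_frame(frame, out_len):
    """Stretch or shrink a frame to out_len samples by linear interpolation.

    A frame of L5 samples resampled to L6 samples plays lower in pitch
    when L6 > L5 and higher when L6 < L5, at a fixed output sample rate.
    """
    n = len(frame)
    if n == 1 or out_len == 1:
        return [float(frame[0])] * out_len
    out = []
    for k in range(out_len):
        pos = k * (n - 1) / (out_len - 1)   # fractional index into the frame
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append((1 - frac) * frame[lo] + frac * frame[hi])
    return out


print(resample_frame([0, 2, 4], 5))
```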

Next, the samples of time length L5 starting at time L6 (the time length of the first audio frame after frequency conversion) from the start of the first original audio frame in the digital audio signal, namely the samples marked L6 to L5+L6-1 in the figure, are selected as a second original audio frame D6 and pitch-shifted, yielding a second frequency-converted audio frame D6′ of time length L6.

The frequency-converted audio signal D5′ and the frequency-converted audio frame D6′ are then connected.

Unlike the prior art, the present invention uses bitwise comparison, which reduces the complexity of the existing numerical comparison.

Please refer to FIG. 3A and FIG. 3B, which show the search-region samples and the cross-region samples, respectively, in the pitch-shifting method of the present invention. The search region Sc may be N samples selected from the tail of the frequency-converted audio signal D5′ and the adjacent digital audio signal S2. The cross region Cc may be M samples selected from the head of the second frequency-converted audio frame D6′.

In this step, the search region Sc includes part of the adjacent digital audio signal in order to find a reasonable connecting point. Connecting several frequency-converted audio frames in this way, as with D5′ and D6′, brings the overall playback time after frequency conversion closer to that of the original digital audio signal S2.

Please also refer to FIG. 4, a schematic diagram of the pitch processing in the pitch-shifting method of the present invention. To save computational and hardware cost, the samples in the search region Sc and the cross region Cc are first passed through a pair of potential comparing devices 20 and 30 for potential comparison: a sample greater than the reference potential Vref (for example 0 V) outputs "1", and a sample less than the reference potential outputs "0". The outputs represent the up-and-down swings of the samples in the search region Sc and the cross region Cc, and serve as the search-region bit sequence Sd and the cross-region bit sequence Cd, respectively.

Next, the bit sequence of the cross region Cc is compared bitwise with that of each sub-search region Ssub in the search region Sc having the length of the cross region Cc. The number of bits that differ between the bit sequences of the cross region Cc and each sub-search region Ssub is counted and taken as the non-similarity of that sub-search region. In this embodiment, the comparison between the cross-region samples Cc and each sub-search region Ssub can be performed by an XOR operation device: the number of 1 bits in the XOR result (the number of differing bits) is counted and used as the corresponding non-similarity.

Finally, the cross region Cc is connected to the sub-search region Ssub having the minimum non-similarity, yielding a new frequency-converted audio signal S2′. For example, a counting device counts the number of 1s in the output of the XOR logic circuit as the non-similarity of each sub-search region, and a numerical comparator compares the non-similarities to find the sub-search region with the minimum non-similarity and its corresponding sample K. The above steps are then repeated to complete the whole pitch-shift processing.
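Putting the comparator, XOR, counter, and minimum selection together, the search for the sample K might be sketched as below. The function name `find_connecting_point`, the default `vref=0`, and resolving ties to the earliest position are assumptions; the embodiment describes these stages as hardware blocks.

```python
def find_connecting_point(search, cross, vref=0):
    """Return (K, non_similarity) for the best sub-search region.

    search -- search-region samples Sc (length N)
    cross  -- cross-region samples Cc (length M, M < N)
    """
    sd = [1 if s > vref else 0 for s in search]   # search-region bit sequence Sd
    cd = [1 if c > vref else 0 for c in cross]    # cross-region bit sequence Cd
    m = len(cd)
    best_k, best_score = 0, m + 1                 # m + 1 exceeds any possible score
    for k in range(len(sd) - m + 1):
        # XOR each bit pair and count the 1s (the differing bits)
        score = sum(a ^ b for a, b in zip(sd[k:k + m], cd))
        if score < best_score:                    # keep the earliest minimum
            best_k, best_score = k, score
    return best_k, best_score


print(find_connecting_point([-5, 3, -2, -4, 1, 6, -3, -1], [2, 7, -1]))
```

Each candidate position now costs M single-bit XORs and a count, instead of M multi-bit subtractions and M-1 additions.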

In this method, because the digital audio signal is subdivided into audio frames about 20 to 30 milliseconds long and the comparison between the cross region Cc and the sub-search regions Ssub relies on a potential comparing device, the circuit cost is reduced and the method is practical for general pitch processing devices.

In addition, the present invention provides a pitch-shifting device for a digital audio signal, which receives a digital audio signal composed of a plurality of samples in sequence and converts it into a frequency-converted audio signal. The device mainly includes an input device, a pitch processing device, and an audio signal connecting device. The input device receives the digital audio signal. The pitch processing device selects samples of a specific length from the digital audio signal and performs pitch shifting on them to obtain a frequency-converted audio frame. The audio signal connecting device connects the frequency-converted audio frame to the frequency-converted audio signal to obtain a new frequency-converted audio signal.

The audio signal connecting device includes a search-region potential comparing device 20, a cross-region potential comparing device 30, a bit processing device 40, and a connecting device 50.

The search-region potential comparing device 20 selects samples of a specific length, for example N samples, from the tail of the frequency-converted audio signal S2′ and the adjacent digital audio signal S2 as the search region and performs potential comparison on them, yielding a search-region bit sequence Sd that corresponds to the amplitude of each sample in the search region Sc. For example, samples of length N are selected from the tail of the frame D5′ and the adjacent digital audio signal S2 as the search region, and a potential comparing device 20 compares them with a reference potential, such as 0 V, to obtain the bit sequence of the frame D5′.

The cross-region potential comparing device 30 selects samples of length M from the head of the second frequency-converted audio frame D6′ as the cross region and performs potential comparison on them, yielding a cross-region bit sequence Cd that corresponds to the amplitude state of each sample in the cross region Cc. For example, a potential comparing device 30 compares the M samples at the head of the frame D6′ with a reference potential, such as 0 V, to obtain the bit sequence of the frame D6′.

The bit processing device 40 compares the bit sequence of the cross region Cc with that of each sub-search region Ssub formed by M samples of the search region Sc to obtain the corresponding non-similarity. For example, an XOR logic circuit compares the bit sequences of the frames D5′ and D6′ bit by bit, and the number of "1" or "0" bits in the result is counted (for example with a counter) as the corresponding non-similarity.

The connecting device 50 finds the sub-search region with the minimum non-similarity and its corresponding sample K, and connects the frequency-converted audio frame D6′ to the frequency-converted audio signal D5′ to obtain the new frequency-converted audio signal S2′. For example, a comparator compares all the non-similarities to find the sub-search region with the minimum non-similarity, and the starting point of that sub-search region (sample point K) is then used as the connecting point of the frames D5′ and D6′ to join the two frequency-converted audio frames.

In summary, the pitch processing device and method of the present invention can connect the audio frames in the pitch-shift processing of the unlimited audio frame division method using simple circuitry. They greatly reduce computational and hardware cost and can be used in general pitch processing systems.

Although the present invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Anyone skilled in the art may make changes and modifications without departing from the spirit and scope of the present invention, and the scope of protection of the present invention is therefore defined by the appended claims.

Claims

1. A pitch raising and lowering method for a digital audio signal, for receiving a digital audio signal and converting its frequency to obtain a frequency conversion audio signal, characterized in that the method comprises the following steps:

(a) Selecting samples of a specific length R in the digital audio signal as the first original sound frame for frequency conversion, the time length of the first original sound frame changing from L to L' after frequency conversion, so as to obtain a first frequency conversion sound frame of time length L' that serves as the frequency conversion audio signal;

(b) performing frequency conversion with samples of a specific length R starting at the time L' in the original digital audio signal as the second original sound frame, so as to obtain a second frequency-changed sound frame of a time length L';

(c) connecting the second frequency conversion sound frame and the frequency conversion audio signal, so as to obtain the updated frequency conversion audio signal; and

(d) repeating steps (b) and (c), so as to complete the pitch raising and lowering of the digital audio signal;

Wherein, step (c) comprises:

Selecting N samples from the back section of the frequency conversion audio signal and its adjacent digital audio signal as a search area, and comparing each sample in the search area with a reference potential to obtain search area bit data indicating the amplitude of each sample in the search area;

Selecting M samples from the front section of the second frequency conversion sound frame as a linking area, and comparing each sample in the linking area with the reference potential to obtain linking area bit data indicating the amplitude of each sample in the linking area;

Comparing the bit data of the linking area and all the sub-search areas of length M in the search area by means of bit comparison, so as to obtain the degree of dissimilarity corresponding to the linking area and the sub-search areas; and

Joining the linking area to the sub-search area corresponding to the minimum dissimilarity among the sub-search areas, so as to obtain the updated frequency conversion audio signal.

2. The pitch raising and lowering method according to claim 1, wherein the step of obtaining the degree of dissimilarity corresponding to the linking area and the sub-search areas comprises:

Comparing the bit data of the linking area and the sub-search areas of length M in the search area by means of bit comparison, so as to obtain bit data of dissimilarity; and

Calculate the number of bits representing bit differences in the dissimilarity bit data, and use it as the dissimilarity corresponding to the linking area and the sub-search area.

3. The pitch raising and lowering method as claimed in claim 1, wherein the length N of the search area is greater than the length M of the linking area.

4. The pitch raising and lowering method as claimed in claim 3, wherein the search area is composed of the last N samples of the frequency conversion audio signal.

5. The pitch raising and lowering method as claimed in claim 1, wherein the step of bit-comparing the linking area with all the sub-search areas composed of M consecutive samples in the search area is performed by an XOR step.

6. The pitch raising and lowering method as claimed in claim 5, wherein the degree of dissimilarity is represented by the number of "1" bits in the result of the XOR step.

7. A pitch raising and lowering device for a digital audio signal, for receiving a digital audio signal and converting its frequency to obtain a frequency conversion audio signal, characterized in that the pitch raising and lowering device comprises:

an input device for receiving the digital audio signal;

a pitch processing device, which is used to select samples of a specific length in the digital audio signal to perform pitch up-down processing, so as to obtain a frequency-changed sound frame; and

an audio signal connection device, which connects the frequency conversion sound frame to the frequency conversion audio signal, so as to obtain the updated frequency conversion audio signal;

Wherein, the audio signal connection device includes:

A search area potential comparison device, which selects N samples from the rear section of the frequency conversion audio signal and its adjacent digital audio signal as the search area for potential comparison, so as to obtain search area bit data indicating the amplitude of each sample in the search area;

A linking area potential comparison device, which selects M samples from the front section of the frequency conversion sound frame as the linking area for potential comparison, so as to obtain linking area bit data indicating the amplitude of each sample in the linking area;

A bit processing device for comparing the bit data of the linking area with the bit data of each sub-search area formed by M samples of the search area, so as to obtain the degree of dissimilarity corresponding to the linking area and each sub-search area; and

A linking device, which joins the linking area to the sub-search area corresponding to the minimum dissimilarity among the sub-search areas, so as to obtain the updated frequency conversion audio signal.

8. The pitch raising and lowering device as claimed in claim 7, wherein the linking area potential comparison device uses a reference potential for potential comparison.

9. The pitch raising and lowering device as claimed in claim 7, wherein the search area potential comparison device uses a reference potential for potential comparison.

10. The pitch raising and lowering device according to claim 8 or 9, wherein the reference potential is 0V.

11. The pitch raising and lowering device as claimed in claim 7, wherein the bit processing device is a logic circuit for performing XOR operation.

12. The pitch raising and lowering device as claimed in claim 7, wherein the degree of dissimilarity is represented by the number of "1" bits in the result of the XOR step.