CN1119793C - Method for composing characteristic waveform of audio signals - Google Patents

Info

Publication number
CN1119793C
Authority
CN
China
Prior art keywords
waveform
signature
audio signal
signature waveform
wave shapes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 98118362
Other languages
Chinese (zh)
Other versions
CN1245326A (en)
Inventor
张景嵩
温世义
全晨
方国平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inventec Corp
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp
Priority to CN 98118362
Publication of CN1245326A
Application granted
Publication of CN1119793C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Electrophonic Musical Instruments (AREA)

Abstract

The present invention relates to a method for selecting and synthesizing characteristic waveforms of audio signals. The waveform to be processed is first analyzed and representative characteristic waveforms are selected. For storage, only the characteristic waveforms and their associated parameters need to be recorded rather than the waveform of the whole signal, which saves a large amount of memory. For subsequent reconstruction, the characteristic waveforms and their parameters are read out and the waveform is resynthesized by interpolation. The synthesized sound quality approaches that of adaptive differential pulse code modulation (ADPCM), so the method is suited to applications on low-speed central processing units.

Description

Method for synthesizing characteristic waveforms of audio signals
Technical field
The present invention relates to audio signal processing, and in particular to a method for synthesizing characteristic waveforms of audio signals.
Background art
With the development of digital electronics, analog signal waveforms can be converted into digital signals by analog-to-digital conversion so that they can be stored, processed and transmitted, which speeds up the circulation and sharing of electronic data.
When signal waveform data are conventionally captured and recorded, the amplitude of each sampled point is represented with eight or sixteen bits, depending on the required precision. If a waveform segment is sampled at 8K points and each sample is quantized to eight bits, the segment occupies 64K bits. In other words, an audio signal recorded at a sampling rate of 8K samples per second with 8-bit quantization needs 64K bits of storage for every second of signal.
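The storage arithmetic above can be checked with a short C sketch. The function name and parameters are illustrative assumptions, not part of the patent, and "8K" is taken here to mean 8000 samples per second.

```c
#include <assert.h>
#include <stdint.h>

/* Bits of storage needed for uncompressed PCM audio.
 * sample_rate: samples per second; bits_per_sample: 8 or 16;
 * channels: 1 (mono) or 2 (stereo); seconds: duration of the recording.
 * Hypothetical helper for illustration only. */
uint64_t pcm_storage_bits(uint32_t sample_rate, uint32_t bits_per_sample,
                          uint32_t channels, uint32_t seconds)
{
    assert(bits_per_sample == 8 || bits_per_sample == 16);
    return (uint64_t)sample_rate * bits_per_sample * channels * seconds;
}
```

With 8000 samples per second, 8 bits per sample and one channel, one second of signal comes to 64000 bits, which matches the 64K-bit figure in the text.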
Processing audio signals by pulse code modulation (PCM) as above allows real-time processing, but the large volume of data occupies a great deal of storage, which severely limits practical applications. Encoding with adaptive differential pulse code modulation (ADPCM) halves the storage, but for a low-speed CPU (such as a Z80 or an 80386) the algorithm is too complex to run in real time. For low-speed CPU applications, an audio signal processing method that both avoids occupying large amounts of storage and can run in real time has therefore long been sought by those working in this field.
Summary of the invention
The main object of the present invention is therefore to provide a method for synthesizing characteristic waveforms of audio signals that reduces the demand for storage space.
Another object of the present invention is to provide a method for synthesizing characteristic waveforms of audio signals that is suitable for low-speed CPUs that must process audio signals in real time.
To achieve the above objects, the invention provides a method for synthesizing characteristic waveforms of audio signals, comprising the steps of: capturing an audio signal; sampling and quantizing the captured audio signal to form a WAV file; selecting characteristic waveforms, the start and end positions of each selected characteristic waveform preferably being chosen where the amplitude equals zero and where the trend is the same (both rising or both falling) as at the junction with the adjacent waveform, so as to keep the phase consistent; storing the selected first and second characteristic waveforms, which can represent the audio signal, together with the time interval between the two; reading the stored first and second characteristic waveforms; and synthesizing the interpolated waveforms between them by interpolation.
In the method provided by the invention, the first characteristic waveform has period Ma and amplitude Aa[t], the second characteristic waveform has period Mb and amplitude Ab[t], and the time interval between the first and second characteristic waveforms is L. The method synthesizes the interpolated waveforms between the first and second characteristic waveforms by interpolation. The amplitude of each interpolated waveform is:
Ar[t]=(L-k)/L×Ar′[t]+(1+k)/L×Ar′′[t];
The period of each interpolated waveform is:
Mr=Ma-r×(Ma-Mb)/(1+R);
where
R=2L/(Ma+Mb);
Ar′[t]=Aa[(Ma/Mr)×t];
Ar′′[t]=Ab[(Mb/Mr)×t];
r=1,2,...,R;
t=0,1,...,Mr-1; and
k=(M1+M2+...+M(r-1)),(M1+M2+...+M(r-1)+1),...,
(M1+M2+...+M(r-1)+(Mr-1)).
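As a rough illustration of the two formulas R=2L/(Ma+Mb) and Mr=Ma-r×(Ma-Mb)/(1+R), the following C sketch computes the number of interpolated waveforms and the period of each. The function names are hypothetical, and the rounding rules (truncation for R, round-to-nearest for Mr) are assumptions, since the text does not state how fractional values are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Number of interpolated periods R = 2L/(Ma+Mb).
 * Integer (truncating) division is an assumption. */
size_t interp_count(size_t L, size_t Ma, size_t Mb)
{
    assert(Ma > 0 && Mb > 0);
    return (2 * L) / (Ma + Mb);
}

/* Period of the r-th interpolated waveform, r = 1..R:
 * Mr = Ma - r*(Ma-Mb)/(1+R), rounded to the nearest sample (assumption). */
size_t interp_period(size_t r, size_t R, size_t Ma, size_t Mb)
{
    assert(r >= 1 && r <= R);
    double ma = (double)Ma, mb = (double)Mb;
    double m = ma - (double)r * (ma - mb) / (double)(1 + R);
    return (size_t)(m + 0.5);
}
```

For example, with L=400, Ma=100 and Mb=60 this gives R=5 and periods 93, 87, 80, 73 and 67, which step from the period of the first waveform toward that of the second.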
Brief description of the drawings
To make the above and other objects, features and advantages of the invention more readily understood, a preferred embodiment is described in detail below with reference to the accompanying drawings.
Fig. 1 shows a segment of an audio signal waveform;
Fig. 2 shows the characteristic waveforms after selection;
Fig. 3 shows the waveform synthesized by the method of the invention;
Fig. 4 is a flow chart of the method of the invention; and
Fig. 5 is a flow chart of an embodiment of the method of the invention.
Embodiment
The method for synthesizing characteristic waveforms of audio signals provided by the present invention first analyzes the waveform to be processed and selects representative characteristic waveforms. For storage, only these characteristic waveforms and their associated parameters need to be recorded, not the waveform of the whole signal, which saves a large amount of storage space. Because the audio signal has already been sampled and quantized before the characteristic waveforms are selected, the selected characteristic waveforms are discrete values at the sampling rate. For subsequent reconstruction, these characteristic waveforms and their parameters are read out and the waveform is resynthesized by interpolation. The interpolation is not computationally complex, so reconstruction is fast: for example, reconstructing a waveform of 4000K bits of data on an 80486 CPU takes only about five seconds. The method is therefore well suited to low-speed CPU applications. The method is described in detail below.
To discuss synthesis, one must first understand how characteristic waveforms are selected. Audio signals, including speech, music, phonemes and sound effects, share some common properties: they are quasi-periodic within any given time segment, and they are continuous. Based on these two main properties, a segment of the audio signal waveform is examined, representative characteristic waveforms are selected and stored, and the length between each two adjacent characteristic waveforms is stored with them.
To ease subsequent reconstruction and to minimize the noise produced by excessive jumps at the junctions between waveforms in the synthesized audio signal, the start and end positions of each selected characteristic waveform are preferably chosen where the amplitude is zero or close to zero, and where the trend is the same (both rising or both falling) as at the junction with the adjacent waveform, so that the phase stays consistent. Selection can proceed, for example, by selecting characteristic waveforms while synthesizing the signal with the method of the invention (described in detail below) and listening to the result; if the result is unsatisfactory, selection and synthesis are repeated until characteristic waveforms giving the best result are found. Alternatively, the autocorrelation function and the cross-correlation function can be used to compute the period of the signal, and characteristic waveforms can be selected accordingly. If the audio signal is speech, its period is distinct, and representative characteristic waveforms are easy to pick out.
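The text mentions using the autocorrelation function to compute the period of the signal but gives no procedure. The following C sketch shows one common way this could be done: it picks the lag with the largest raw autocorrelation sum inside a search range. The function name, the search range parameters and the lack of normalization are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Estimate the period (in samples) of x[0..n-1] as the lag in
 * [min_lag, max_lag] that maximizes the raw autocorrelation sum.
 * Hypothetical helper; the patent does not specify this procedure. */
size_t estimate_period(const short *x, size_t n, size_t min_lag, size_t max_lag)
{
    assert(x != NULL && min_lag >= 1 && min_lag <= max_lag && max_lag < n);
    size_t best_lag = min_lag;
    double best = -1e300;
    for (size_t lag = min_lag; lag <= max_lag; ++lag) {
        double acc = 0.0;
        /* Sum of products of the signal with a copy of itself shifted by lag. */
        for (size_t i = 0; i + lag < n; ++i)
            acc += (double)x[i] * (double)x[i + lag];
        if (acc > best) {
            best = acc;
            best_lag = lag;
        }
    }
    return best_lag;
}
```

The search range should exclude very small lags, where the autocorrelation is large for any smooth signal, and should not extend to twice the expected period.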
Fig. 1 shows a segment of an audio signal waveform. From this segment, two characteristic waveforms A and B are selected and stored as shown in Fig. 2, and the time length L between the two is stored as well; here L is measured from the end point of characteristic waveform A to the start point of characteristic waveform B. Note again that because the audio signal has been sampled and quantized before selection, the selected characteristic waveforms are discrete values at the sampling rate.
As described above, characteristic waveforms A and B have been selected. Waveform A has period Ma and amplitude Aa[t]; waveform B has period Mb and amplitude Ab[t]; the time interval between waveform A and waveform B is L. The number of waveforms to be interpolated within the interval L is therefore estimated as:
R=2L/(Ma+Mb);
The period Mr of each interpolated waveform is:
Mr=Ma-r×(Ma-Mb)/(1+R), where r=1,2,...,R;
Waveform A is extended periodically according to Mr:
A1′[t]=Aa[(Ma/M1)×t], where t=0,1,...,M1-1;
A2′[t]=Aa[(Ma/M2)×t], where t=0,1,...,M2-1;
Ar′[t]=Aa[(Ma/Mr)×t], where t=0,1,...,Mr-1;
Waveform B is extended periodically according to Mr:
A1′′[t]=Ab[(Mb/M1)×t], where t=0,1,...,M1-1;
A2′′[t]=Ab[(Mb/M2)×t], where t=0,1,...,M2-1;
Ar′′[t]=Ab[(Mb/Mr)×t], where t=0,1,...,Mr-1;
Furthermore, waveform A contributes to each successive synthesized waveform in the proportion (L-k)/L, and waveform B contributes in the proportion (1+k)/L. The amplitude of each reconstructed period is then:
Ar[t]=(L-k)/L×Ar′[t]+(1+k)/L×Ar′′[t];
where
r=1,2,...,R;
t=0,1,...,Mr-1; and
k=(M1+M2+...+M(r-1)),(M1+M2+...+M(r-1)+1),...,
(M1+M2+...+M(r-1)+(Mr-1)).
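The interpolation formulas can be put together as a minimal C sketch. It follows the formulas literally; the function name, the use of double samples, truncation when computing R and the resampling index, and round-to-nearest for Mr are assumptions, because the text does not say how fractional values are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Synthesize the interpolated waveforms between characteristic waveform A
 * (Aa[0..Ma-1]) and characteristic waveform B (Ab[0..Mb-1]), which are
 * separated by L samples. Writes at most cap samples to out and returns
 * the number of samples written (the length L' of the synthesized gap). */
size_t synth_gap(const double *Aa, size_t Ma, const double *Ab, size_t Mb,
                 size_t L, double *out, size_t cap)
{
    assert(Aa && Ab && out && Ma > 0 && Mb > 0 && L > 0);
    size_t R = (2 * L) / (Ma + Mb);  /* R = 2L/(Ma+Mb), truncated (assumption) */
    size_t k = 0;                    /* running sample index within the gap */
    for (size_t r = 1; r <= R; ++r) {
        /* Mr = Ma - r*(Ma-Mb)/(1+R), rounded to nearest (assumption) */
        double m = (double)Ma
                 - (double)r * ((double)Ma - (double)Mb) / (double)(1 + R);
        size_t Mr = (size_t)(m + 0.5);
        for (size_t t = 0; t < Mr && k < cap; ++t, ++k) {
            /* Ar'[t] = Aa[(Ma/Mr)*t] and Ar''[t] = Ab[(Mb/Mr)*t];
             * the index is truncated to an integer (assumption). */
            size_t ia = (Ma * t) / Mr;
            size_t ib = (Mb * t) / Mr;
            double wa = ((double)L - (double)k) / (double)L;  /* (L-k)/L */
            double wb = (1.0 + (double)k) / (double)L;        /* (1+k)/L */
            out[k] = wa * Aa[ia] + wb * Ab[ib];
        }
    }
    return k;
}
```

Note that the two weights as written sum to (L+1)/L rather than exactly 1, so the sketch reproduces a slight gain when both waveforms are identical. Only additions, multiplications and a few divisions per sample are needed, which is consistent with the claim that the method suits low-speed CPUs.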
The waveform synthesized from waveform A and waveform B in this way is shown in Fig. 3. The whole waveform segment shown in Fig. 1 originally had to be stored; with the method of the invention, only waveform A, waveform B and the time interval L between them need to be stored, which saves a significant amount of storage space.
The method of the invention is suited to processing audio signals such as speech signals recorded in WAV or PCM form, so the basic WAV format can be adopted.
The characteristic waveforms of the invention can be stored in a format made up of two blocks, a header area (header block) and a data area (data block), described in detail below:
Header area
The header area contains basic information, including: file size, file type name, format type, number of channels, sampling frequency, average data transfer rate per second, number of bits per PCM sample, and number of characteristic waveforms. The file data structure for the characteristic waveforms can be expressed in C as follows:
typedef struct {
    char RIFF[4];
    long Whfilelen;        /* file size */
    char BWSfmt[8];        /* file type name */
    long version;
    int  FormatTag;        /* format type */
    int  Channels;         /* number of channels */
    long SamplePerSec;     /* sampling frequency */
    long AvgBytesPersec;   /* average data transfer rate per second */
    int  blockalign;
    int  BitPerSample;     /* number of bits per PCM sample */
    char data[4];
    long SjpeWaveNum;      /* number of characteristic waveforms */
} CharWaveHeader;          /* type name not given in the source; assumed here */
and
AvgBytesPersec=Channels×SamplePerSec×(BitPerSample/8);
blockalign=Channels×(BitPerSample/8);
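The two derived header values can be computed as in the following C sketch. The helper type and function names are hypothetical, and fixed-width integer types are used here only to make the arithmetic unambiguous; the source declares the fields as int and long.

```c
#include <assert.h>
#include <stdint.h>

/* Derived header values, following the two formulas for the average
 * byte rate and the block alignment. */
typedef struct {
    uint32_t avg_bytes_per_sec;  /* Channels * SamplePerSec * (BitPerSample/8) */
    uint16_t block_align;        /* Channels * (BitPerSample/8) */
} DerivedRates;                  /* hypothetical helper type, not in the source */

DerivedRates derive_rates(uint16_t channels, uint32_t sample_per_sec,
                          uint16_t bit_per_sample)
{
    assert(bit_per_sample > 0 && bit_per_sample % 8 == 0);
    uint16_t bytes_per_sample = (uint16_t)(bit_per_sample / 8);
    DerivedRates d;
    d.block_align = (uint16_t)(channels * bytes_per_sample);
    d.avg_bytes_per_sec = (uint32_t)channels * sample_per_sec * bytes_per_sample;
    return d;
}
```

For 16-bit stereo at 8000 samples per second this gives a block alignment of 4 bytes and an average rate of 32000 bytes per second.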
Data area
The data area stores the PCM sample data of the characteristic waveforms and their information parameters. For example, the storage format for 8-bit mono PCM data can be:
Field | Information bits and characteristic period | Interval length from the previous characteristic waveform | Sample 1, mono | Sample 2, mono | ......
Bits | 16 | 16 | 8 | 8 | 8
where the information bits consist of three bits and the characteristic period is represented with 13 bits. The storage format for 8-bit stereo PCM data can be:
Field | Information bits and characteristic period | Interval length from the previous characteristic waveform | Sample 1, mono | Sample 2, mono | ......
Bits | 16 | 16 | 8 | 8 | 8
where the information bits consist of three bits and the characteristic period is represented with 13 bits. The storage format for 16-bit mono PCM data can be:
Field | Information bits and characteristic period | Interval length from the previous characteristic waveform | Sample 1, mono, low byte | Sample 2, mono, high byte | Sample 2, mono, low byte | Sample 2, mono, high byte | ......
Bits | 16 | 16 | 8 | 8 | 8 | 8 | 8
where the information bits consist of three bits and the characteristic period is represented with 13 bits. The storage format for 16-bit stereo PCM data can be:
Field | Information bits and characteristic period | Interval length from the previous characteristic waveform | Sample 1, left channel, low byte | Sample 2, left channel, high byte | Sample 2, right channel, low byte | Sample 2, right channel, high byte | ......
Bits | 16 | 16 | 8 | 8 | 8 | 8 | 8
where the information bits consist of three bits and the characteristic period is represented with 13 bits.
The three information bits in each of the above formats distinguish the type of characteristic waveform. For example, if the audio signal to be processed is the pronunciation of an English word, characteristic waveforms can be classified as consonant, vowel, silence and so on. For silence, the 13 bits that normally record the waveform period are combined with the following 16 bits, 29 bits in all, to record the length of the silence, so up to 512M sample points can be recorded; if the silence is longer than this, a further 4 bits can be used to record its length.
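A possible bit layout for the 16-bit information word is sketched below in C. The text fixes only the widths (3 type bits, 13 period bits, and 29 bits for a silence length); placing the type in the top three bits and treating the 13 period bits as the high part of the silence length are assumptions, as are all names.

```c
#include <assert.h>
#include <stdint.h>

enum { TYPE_BITS = 3, PERIOD_BITS = 13 };

/* Pack a 3-bit waveform type and a 13-bit period into one 16-bit word.
 * Placing the type in the top three bits is an assumption. */
uint16_t pack_info(unsigned type, unsigned period)
{
    assert(type < (1u << TYPE_BITS) && period < (1u << PERIOD_BITS));
    return (uint16_t)((type << PERIOD_BITS) | period);
}

/* Extract the 3-bit type and the 13-bit period from a packed word. */
unsigned info_type(uint16_t word)   { return (unsigned)word >> PERIOD_BITS; }
unsigned info_period(uint16_t word) { return word & ((1u << PERIOD_BITS) - 1u); }

/* For silence, the 13 period bits and the following 16-bit word together
 * hold a 29-bit length. Using the 13 bits as the high part is an assumption. */
uint32_t silence_length(uint16_t word, uint16_t next_word)
{
    return ((uint32_t)info_period(word) << 16) | next_word;
}
```

The largest value that fits in 29 bits is 2^29 - 1, which corresponds to the 512M sample points mentioned in the text.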
During synthesis, noise arises if the junctions between interpolated waveforms and characteristic waveforms are not smooth. To avoid this noise, attention should be paid to the choice of starting points when the characteristic waveforms are selected: each starting point should, as far as possible, lie where the amplitude is zero or close to zero. This keeps the waveform junctions smooth, so the sound synthesized by this method is natural.
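One simple way to find candidate starting points is to look for rising zero crossings, as in the C sketch below. The function name and the exact crossing test are assumptions; the text only asks for an amplitude at or near zero and a consistent trend at the junction.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first sample at or after from where the signal
 * crosses zero while rising (x[i-1] < 0 and x[i] >= 0), or n if there is
 * no such sample. Hypothetical helper for illustration. */
size_t next_rising_zero(const short *x, size_t n, size_t from)
{
    assert(x != NULL);
    size_t i = (from == 0) ? 1 : from;  /* x[i-1] must exist */
    for (; i < n; ++i)
        if (x[i - 1] < 0 && x[i] >= 0)
            return i;
    return n;
}
```

Taking both the start and the end of a characteristic waveform at rising crossings gives the same trend at both ends, so adjacent periods join with a consistent phase.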
In the synthesis process described above, interpolation is used to calculate the number of waveforms to be interpolated within the time interval L between the two selected characteristic waveforms, and the period of each interpolated waveform. After reconstruction, however, the time length L′ made up of the interpolated waveforms is smaller than L, the difference lying between 0 and the length of the smaller characteristic period. To keep the synthesized waveform the same length as the original, a further 1 to 2 points can be inserted evenly into the interpolated waveforms so that L′ approaches L. In addition, a low-pass filter can be applied to the audio signal to remove noise caused by unsmooth junctions.
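The length correction can be sketched in C as follows. The sketch only decides how many extra points each interpolated period receives so that the total reaches L; how the inserted points are valued is not specified in the text and is left out. The function name and the even-spreading rule are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Spread the shortfall (L - L') over R interpolated periods.
 * periods[r] holds the period length on entry and the padded length on
 * return. Returns the number of samples added (0 if L' >= L). */
size_t pad_periods(size_t *periods, size_t R, size_t L)
{
    assert(periods != NULL && R > 0);
    size_t total = 0;                   /* L', the synthesized length */
    for (size_t r = 0; r < R; ++r)
        total += periods[r];
    if (total >= L)
        return 0;
    size_t deficit = L - total;
    size_t base = deficit / R;          /* points every period receives */
    size_t extra = deficit % R;         /* first `extra` periods get one more */
    for (size_t r = 0; r < R; ++r)
        periods[r] += base + (r < extra ? 1 : 0);
    return deficit;
}
```

For periods of 90, 80, 70 and 60 samples and L=306, the shortfall of 6 is spread as 2, 2, 1 and 1 extra points, in line with the 1 to 2 points per waveform mentioned in the text.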
Figs. 4 and 5 show, respectively, a flow chart of the method of the invention and a flow chart of an embodiment (TTS: Text to Speech).
Fig. 4 is a flow chart of the method of the invention. First, in step 40, an audio signal recorded on a medium such as magnetic tape is captured; when the method is applied to text-to-speech technology, this audio signal consists of the phonemes derived from pronunciation rules. In step 42, the captured audio signal is sampled and quantized, that is, digitized, to form a file in, for example, WAV format. In step 44, characteristic waveforms are selected. To ease subsequent reconstruction and to minimize the noise produced by excessive jumps at the junctions between waveforms in the synthesized audio signal, the start and end positions of each selected characteristic waveform are preferably chosen where the amplitude is zero or close to zero, and where the trend is the same (both rising or both falling) as at the junction with the adjacent waveform, so that the phase stays consistent. A working environment can be set up in which characteristic waveforms are selected while the signal is synthesized with the method of the invention and the result is listened to; if the result is unsatisfactory, selection and synthesis are repeated until characteristic waveforms giving the best result are found. Alternatively, the autocorrelation and cross-correlation functions can be used to compute the period of the signal and select characteristic waveforms accordingly. If the audio signal is speech, its period is distinct and suitable characteristic waveforms are easy to determine. In step 46, the selected characteristic waveforms and the time length between each two characteristic waveforms are stored. In step 48, the characteristic waveforms and the time interval are read; what is read are the stored first and second characteristic waveforms, which can represent the audio signal. In step 50, the characteristic waveforms are synthesized, and finally, in step 52, the sound is produced.
Fig. 5 is a flow chart of the synthesis when the method of the invention is applied to text-to-speech (TTS) technology. First, in step 50, a word is read, for example a word looked up by the user. In step 52, the phonetic-symbol combination of the word is analyzed, and in step 54 phonemes are selected according to specific rules. Taking the English word "HELLO" as an example, it can be divided according to pronunciation rules into phonemes such as <*h>, <ha>, <al>, <lo> and <o*>, where the symbol * represents silence. In step 56, the selected phonemes are synthesized by the method of the invention; in step 58, the synthesized phonemes are combined into the word; and in step 60, the word is pronounced. The details of steps 50, 52, 54 and 58 have been disclosed in applications such as Nos. 85112444 and 85112445 and are not the focus of the present invention, so they are not repeated here.
In summary, the method of the invention selects representative characteristic waveforms from the audio signal and later reconstructs the signal from them by interpolation. Its compression ratio and reconstruction quality depend on the original audio signal waveform selected. In tests of the method on music and sound effects, for an original audio signal with an 8K sampling rate, 8-bit quantization and a transmission rate of 64 Kbits/sec, the resulting rate is approximately 8 to 32 Kbits/sec. This rate lies between those of adaptive differential pulse code modulation (ADPCM) and vector sum excited linear prediction (VSELP), while the synthesized sound quality is close to that of ADPCM.
Although the invention has been disclosed above by way of a preferred embodiment, this is not intended to limit the invention. Any person skilled in the art may make changes and refinements without departing from the spirit and scope of the invention, and the scope of protection of the invention is therefore defined by the claims.

Claims (4)

1. A method for synthesizing characteristic waveforms of audio signals, characterized in that the method comprises the steps of:
capturing an audio signal;
sampling and quantizing the captured audio signal to form a WAV file;
selecting characteristic waveforms, the start and end positions of each selected characteristic waveform being chosen where the amplitude equals zero and where the trend is the same (both rising or both falling) as at the junction with the adjacent waveform, so as to keep the phase consistent;
storing the selected first characteristic waveform and second characteristic waveform, which can represent the audio signal, and the time interval between the two characteristic waveforms;
reading the stored first characteristic waveform and second characteristic waveform; and
synthesizing the interpolated waveforms between them by interpolation.
2. The method for synthesizing characteristic waveforms of audio signals as claimed in claim 1, wherein the step of selecting characteristic waveforms further comprises using an autocorrelation function to compute the period of the signal in order to select the characteristic waveforms.
3. The method for synthesizing characteristic waveforms of audio signals as claimed in claim 1, wherein the first characteristic waveform has period Ma and amplitude Aa[t], the second characteristic waveform has period Mb and amplitude Ab[t], and the time interval between the first characteristic waveform and the second characteristic waveform is L.
4. The method for synthesizing characteristic waveforms of audio signals as claimed in claim 3, wherein the interpolated waveforms satisfy the following relations:
The amplitude of each interpolated waveform is:
Ar[t]=(L-k)/L×Ar′[t]+(1+k)/L×Ar′′[t];
The period of each interpolated waveform is:
Mr=Ma-r×(Ma-Mb)/(1+R);
where
R=2L/(Ma+Mb);
Ar′[t]=Aa[(Ma/Mr)×t];
Ar′′[t]=Ab[(Mb/Mr)×t];
r=1,2,...,R;
t=0,1,...,Mr-1; and
k=(M1+M2+...+M(r-1)),(M1+M2+...+M(r-1)+1),...,
(M1+M2+...+M(r-1)+(Mr-1)).
CN 98118362 1998-08-17 1998-08-17 Method for composing characteristic waveform of audio signals Expired - Fee Related CN1119793C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 98118362 CN1119793C (en) 1998-08-17 1998-08-17 Method for composing characteristic waveform of audio signals

Publications (2)

Publication Number Publication Date
CN1245326A CN1245326A (en) 2000-02-23
CN1119793C true CN1119793C (en) 2003-08-27

Family

ID=5226034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 98118362 Expired - Fee Related CN1119793C (en) 1998-08-17 1998-08-17 Method for composing characteristic waveform of audio signals

Country Status (1)

Country Link
CN (1) CN1119793C (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100421153C (en) * 2004-10-22 2008-09-24 顾稚敏 Prestore type language recognition system and its method
CN101710488B (en) * 2009-11-20 2011-08-03 安徽科大讯飞信息科技股份有限公司 Method and device for voice synthesis
CN103903610B (en) * 2012-12-27 2017-02-08 北京谊安医疗系统股份有限公司 Method for making synthesis audio data

Also Published As

Publication number Publication date
CN1245326A (en) 2000-02-23

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20030827

Termination date: 20100817