CN1242860A - Sound encoder and sound decoder - Google Patents


Info

Publication number: CN1242860A
Authority: CN (China)
Prior art keywords: vector, sound, source, code, dispersion pattern
Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN 98801556
Other languages: Chinese (zh)
Other versions: CN100367347C (en)
Inventors: 安永和敏, 森井利幸
Current assignee: Godo Kaisha IP Bridge 1 (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee (applicant): Matsushita Electric Industrial Co., Ltd. (松下电器产业株式会社)
Publication of CN1242860A (application); publication of CN100367347C (application granted)
Legal status: Expired - Lifetime

Abstract

An excitation vector generator comprises: a pulse vector generating section having N channels (N >= 1) for generating pulse vectors; a storing section for storing M (M >= 1) kinds of dispersion patterns per channel for the N channels; a selecting section for selectively taking out one dispersion pattern per channel from the storing section; a dispersion section for superimposing, per channel, the extracted dispersion pattern on the generated pulse vector so as to generate N dispersion vectors; and an excitation vector generating section for generating an excitation vector from the N generated dispersion vectors.

Description

Speech coder and speech decoder
Technical field
The present invention relates to a speech coder and a speech decoder used for efficiently coding and decoding speech information.
Background art
Speech coding technologies for efficiently coding and decoding speech information are being developed. A CELP-type speech coder based on such technology is described in "Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates" (M. R. Schroeder and B. S. Atal, Proc. ICASSP '85, pp. 937-940). This speech coder divides the input speech into frames of fixed length, performs linear prediction on each frame, obtains the frame-by-frame prediction residual (excitation signal) from the linear prediction, and codes this prediction residual using an adaptive codebook, which stores past driving excitations, and a noise codebook, which stores a plurality of noise code vectors.
Fig. 1 shows a functional block diagram of a conventional CELP-type speech coder (hereinafter "speech coder").
In this CELP-type speech coder, the linear prediction analysis unit 12 performs linear prediction analysis on the input speech signal 11, yielding linear predictor coefficients, which are parameters representing the spectral envelope characteristic of the speech signal 11. The linear predictor coefficients obtained by the linear prediction analysis unit 12 are quantized in the linear predictor coefficient coding unit 13, and the quantized coefficients are delivered to the linear predictor coefficient decoding unit 14; the quantization index is also output, as the linear prediction code index, to the coding output unit 24. The linear predictor coefficient decoding unit 14 decodes the quantized linear predictor coefficients obtained by the linear predictor coefficient coding unit 13 to obtain the coefficients of the synthesis filter, and outputs these coefficients to the synthesis filter 15.
The adaptive codebook 17 is a codebook that outputs multiple candidate adaptive code vectors, and consists of a buffer storing the driving excitations of several past frames. An adaptive code vector is a time-series vector representing the periodic component of the input speech.
The noise codebook 18 is a codebook storing multiple candidate noise code vectors, their number corresponding to the number of bits allocated. A noise code vector is a time-series vector representing the aperiodic component of the input speech.
The adaptive code gain weighting unit 19 and the noise code gain weighting unit 20 multiply the candidate vectors output from the adaptive codebook 17 and the noise codebook 18 by the adaptive code gain and noise code gain read from the weighting codebook 21, respectively, and output the results to the adder 22.
The weighting codebook is a memory storing multiple weighting values to be multiplied with candidate adaptive code vectors and multiple weighting values to be multiplied with candidate noise code vectors, their number corresponding to the number of bits allocated.
The adder 22 adds the candidate adaptive code vector and candidate noise code vector weighted by the adaptive code gain weighting unit 19 and the noise code gain weighting unit 20, respectively, producing a candidate driving excitation vector, which it outputs to the synthesis filter 15.
The synthesis filter 15 is an all-pole filter whose coefficients are the synthesis filter coefficients obtained by the linear predictor coefficient decoding unit 14. When a candidate driving excitation vector is input from the adder 22, the synthesis filter 15 outputs the corresponding candidate synthetic speech vector.
The distortion computation unit 16 computes the distortion between the output of the synthesis filter 15 (the candidate synthetic speech vector) and the input speech 11, and outputs the distortion value to the code index determining unit 23. The code index determining unit 23 determines, for the three codebooks (adaptive codebook, noise codebook, and weighting codebook), the three code indices (adaptive code index, noise code index, and weighting code index) that minimize the distortion computed by the distortion computation unit 16, and outputs the three determined indices to the coding output unit 24. The coding output unit 24 assembles the linear prediction code index obtained by the linear predictor coefficient coding unit 13 and the adaptive code index, noise code index, and weighting code index determined by the code index determining unit 23, and outputs them to the transmission line.
Fig. 2 shows a functional block diagram of a conventional CELP-type speech decoder (hereinafter "speech decoder") that decodes the signal coded by the above coder. In this speech decoder, the coding input unit 31 receives the code sent from the speech coder (Fig. 1), decomposes it into the linear prediction code index, adaptive code index, noise code index, and weighting code index, and outputs the decomposed indices to the linear predictor coefficient decoding unit 32, adaptive codebook 33, noise codebook 34, and weighting codebook 35, respectively.
The linear predictor coefficient decoding unit 32 then decodes the linear prediction code index obtained by the coding input unit 31 into the synthesis filter coefficients and outputs them to the synthesis filter 39. The adaptive code vector is read from the position of the adaptive codebook corresponding to the adaptive code index, the noise code vector corresponding to the noise code index is read from the noise codebook, and the adaptive code gain and noise code gain corresponding to the weighting code index are read from the weighting codebook. The adaptive code vector weighting unit 36 multiplies the adaptive code vector by the adaptive code gain and delivers it to the adder 38; likewise, the noise code vector weighting unit 37 multiplies the noise code vector by the noise code gain and delivers it to the adder 38.
The adder 38 adds the above two code vectors to produce the driving excitation vector, delivers this driving excitation to the adaptive codebook 33 to update its buffer, and also delivers it to the synthesis filter 39 to drive the filter. The synthesis filter 39, driven by the driving excitation vector obtained from the adder 38, reproduces synthetic speech using the output of the linear predictor coefficient decoding unit 32.
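The decoder flow just described — gain-weighted addition of the two code vectors followed by all-pole synthesis filtering — can be sketched as follows. This is a minimal toy illustration, not the patent's implementation; the function name, the one-tap predictor, and all numbers are invented for the example.

```python
def decode_frame(adaptive_vec, noise_vec, ga, gc, lpc):
    """Form the driving excitation ga*P + gc*C and pass it through the
    all-pole synthesis filter 1/A(z) with coefficients lpc = [a1..ap]."""
    excitation = [ga * p + gc * c for p, c in zip(adaptive_vec, noise_vec)]
    speech = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s -= a * speech[n - k]   # all-pole recursion on past outputs
        speech.append(s)
    return excitation, speech

# toy adaptive/noise vectors, gains, and a single predictor coefficient
exc, syn = decode_frame([1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, -1.0],
                        ga=0.8, gc=0.5, lpc=[-0.9])
```

In a real decoder, `exc` would also be written back into the adaptive codebook buffer, exactly as the adder 38 feeds the adaptive codebook 33 above.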
The distortion computation unit 16 of a CELP-type speech coder generally computes the distortion E using the following formula (1):

    E = ||V - (ga·H·P + gc·H·C)||^2                    (1)

V: input speech signal (vector)
H: impulse response convolution matrix of the synthesis filter, where h is the impulse response (vector) of the synthesis filter and L is the frame length
P: adaptive code vector
C: noise code vector
ga: adaptive code gain
gc: noise code gain
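Formula (1) can be checked numerically. In the sketch below (toy numbers of our own; `convolve_trunc` plays the role of multiplying by H, assuming H is the usual lower-triangular convolution matrix of the impulse response h truncated to the frame length):

```python
def convolve_trunc(h, x):
    # y = H·x: convolution of x with impulse response h, truncated to frame length
    L = len(x)
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(L)]

def distortion(V, h, P, C, ga, gc):
    # E = ||V - (ga·H·P + gc·H·C)||^2, i.e. formula (1)
    HP, HC = convolve_trunc(h, P), convolve_trunc(h, C)
    return sum((v - (ga * p + gc * c)) ** 2 for v, p, c in zip(V, HP, HC))

# With V chosen equal to the synthesized vector, the distortion is zero:
E = distortion(V=[1.0, 1.5, 0.5], h=[1.0, 0.5],
               P=[1.0, 0.0, 0.0], C=[0.0, 1.0, 0.0], ga=1.0, gc=1.0)
```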
Here, to minimize the distortion E of formula (1), the distortion would have to be computed in a closed loop over all combinations of adaptive code index, noise code index, and weighting code index in order to determine each index.
However, a closed-loop search over formula (1) requires an excessive amount of computation, so in general the adaptive code index is determined first by vector quantization using the adaptive codebook, the noise code index next by vector quantization using the noise codebook, and the weighting code index last by vector quantization using the weighting codebook. The vector quantization processing using the noise codebook is now described in more detail for this case.
When the adaptive code index and adaptive code gain have been determined in advance or provisionally, the distortion evaluation expression of formula (1) becomes the following formula (2):

    Ec = ||X - gc·H·C||^2                              (2)

where the vector X in formula (2) is the noise-excitation information (the target vector for determining the noise code index), obtained by the following formula (3) using the predetermined or provisionally determined adaptive code index and adaptive code gain:

    X = V - ga·H·P                                     (3)

ga: adaptive code gain
V: speech signal (vector)
H: impulse response convolution matrix of the synthesis filter
P: adaptive code vector
When the noise code gain gc is determined after the noise code index, it is generally known that, since gc in formula (2) may be assumed to take an arbitrary value, the processing of determining the noise code vector index that minimizes formula (2) (the vector quantization of the noise-excitation information) can be replaced by processing that determines the noise code vector index maximizing the fractional expression of the following formula (4):

    (X^T·H·c)^2 / ||H·c||^2                            (4)
That is, when the adaptive code index and adaptive code gain are determined in advance or provisionally, the vector quantization of the noise-excitation information becomes the processing of determining the candidate noise code vector index that maximizes the fractional expression of formula (4) computed by the distortion computation unit 16.
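The equivalence used here — minimizing formula (2) over a free gain gc is the same as maximizing the ratio of formula (4) — can be demonstrated on toy data. In this sketch the candidates are assumed to already be filtered vectors H·c, and all names and numbers are ours:

```python
def ratio(X, Hc):
    # formula (4): (X^T·Hc)^2 / ||Hc||^2
    num = sum(x * y for x, y in zip(X, Hc)) ** 2
    den = sum(y * y for y in Hc)
    return num / den

def min_ec(X, Hc):
    # formula (2) with the optimal gain gc* = (X^T·Hc)/||Hc||^2
    gc = sum(x * y for x, y in zip(X, Hc)) / sum(y * y for y in Hc)
    return sum((x - gc * y) ** 2 for x, y in zip(X, Hc))

X = [1.0, 2.0, 0.5]
cands = [[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.0, 1.0, 1.0]]
best_by_ratio = max(range(3), key=lambda i: ratio(X, cands[i]))
best_by_ec = min(range(3), key=lambda i: min_ec(X, cands[i]))
# both criteria select the same candidate index
```

The maximization form is cheaper because it avoids recomputing the optimal gain and the residual for every candidate.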
Early CELP-type coders/decoders stored in memory, as the noise codebook, data obtained from random number sequences, in a number corresponding to the allocated bits. This poses two problems: a very large memory capacity is needed, and the amount of computation for evaluating the distortion of formula (4) for every candidate noise code vector is enormous.
One method of solving this problem is the CELP-type speech coder/decoder that employs an algebraic excitation vector generating unit, which produces excitation vectors algebraically, as described in "8 kbit/s ACELP Coding of Speech with 10 ms Speech-Frame: A Candidate for CCITT Standardization" (R. Salami, C. Laflamme, and J-P. Adoul, Proc. ICASSP '94, pp. II-97 to II-100, 1994).
However, in a CELP-type speech coder/decoder whose noise codebook uses such an algebraic excitation generating unit, the noise-excitation information obtained by formula (3) (the target vector for determining the noise code index) is always represented by an approximation with a small number of pulses, which limits the achievable improvement in speech quality. Indeed, examining the elements of actual noise-excitation information X of formula (3) shows that it is almost never composed of only a few pulses, which illustrates this limitation.
Disclosure of the invention
An object of the present invention is to provide a novel excitation vector generator capable of producing excitation vectors whose shapes have high statistical similarity to the shapes obtained when actual speech signals are analyzed.
Another object of the present invention is to provide a CELP-type speech coder/decoder, a speech signal communication system, and a speech signal recording system that, by using the above excitation vector generator as the noise codebook, obtain synthetic speech of higher quality than when an algebraic excitation generating unit is used as the noise codebook.
A first aspect of the present invention is an excitation vector generator comprising: a pulse vector generating unit having N (N >= 1) channels, each producing a pulse vector with a signed unit pulse placed at some element position on the vector axis; a dispersion pattern selecting unit that stores M (M >= 1) kinds of dispersion patterns for each of the N channels and selects one dispersion pattern per channel from the stored M kinds; a pulse vector dispersion unit that, for each channel, performs the superimposing operation of the pulse vector output from the pulse vector generating unit and the dispersion pattern selected by the dispersion pattern selecting unit, producing N dispersion vectors; and a dispersion vector adder that adds the N dispersion vectors produced by the pulse vector dispersion unit to produce the excitation vector. By giving the pulse vector generating unit the function of producing the N (N >= 1) pulse vectors algebraically, and having the dispersion pattern selecting unit store in advance dispersion patterns obtained by learning the shapes (characteristics) of actual speech vectors beforehand, excitation vectors whose shapes are closer to actual excitation shapes than those of conventional algebraic excitation generating units can be produced.
A second aspect of the present invention is a CELP-type speech coder/decoder that employs the above excitation vector generator in its noise codebook. Compared with conventional speech coders/decoders using an algebraic excitation generating unit in the noise codebook, it can produce excitation vectors closer to the true shapes; therefore a speech coder/decoder, a speech signal communication system, and a speech signal recording system capable of outputting synthetic speech of higher quality can be obtained.
Summary of drawings
Fig. 1 is a functional block diagram of a conventional CELP-type speech coder.
Fig. 2 is a functional block diagram of a conventional CELP-type speech decoder.
Fig. 3 is a functional block diagram of the excitation vector generator according to the 1st embodiment of the present invention.
Fig. 4 is a functional block diagram of the CELP-type speech coder according to the 2nd embodiment of the present invention.
Fig. 5 is a functional block diagram of the CELP-type speech decoder according to the 2nd embodiment of the present invention.
Fig. 6 is a functional block diagram of the CELP-type speech coder according to the 3rd embodiment of the present invention.
Fig. 7 is a functional block diagram of the CELP-type speech coder according to the 4th embodiment of the present invention.
Fig. 8 is a functional block diagram of the CELP-type speech coder according to the 5th embodiment of the present invention.
Fig. 9 is a block diagram of the vector quantization function in the 5th embodiment.
Fig. 10 is an explanatory diagram of the target extraction algorithm in the 5th embodiment.
Fig. 11 is a functional block diagram of predictive quantization in the 5th embodiment.
Fig. 12 is a functional block diagram of predictive quantization in the 6th embodiment.
Fig. 13 is a functional block diagram of the CELP-type speech coder in the 7th embodiment.
Fig. 14 is a functional block diagram of the distortion computation unit in the 7th embodiment.
Best mode for carrying out the invention
Embodiments of the present invention are described below with reference to the drawings.
(1st embodiment)
Fig. 3 shows a functional block diagram of the excitation vector generator according to this embodiment of the present invention. This excitation vector generator comprises: a pulse vector generating unit 101 having a plurality of channels; a dispersion pattern selecting unit 102 having dispersion pattern storage units and switches; a pulse vector dispersion unit 103 that disperses the pulse vectors; and a dispersion vector adder 104 that adds the dispersed pulse vectors of the plurality of channels.
The pulse vector generating unit 101 has N channels (this embodiment describes the case N = 3), each of which produces a vector having a signed unit pulse placed at some element position on the vector axis (hereinafter, a pulse vector).
The dispersion pattern selecting unit 102 has storage units M1-M3 and switches SW1-SW3; the former store M kinds of dispersion patterns for each channel (this embodiment describes the case M = 2), and the latter select one dispersion pattern from the M kinds stored in each of the storage units M1-M3.
The pulse vector dispersion unit 103 performs, for each channel, the superimposing operation of the pulse vector output from the pulse vector generating unit 101 and the dispersion pattern output from the dispersion pattern selecting unit 102, producing N dispersion vectors.
The dispersion vector adder 104 adds the N dispersion vectors produced by the pulse vector dispersion unit 103 to generate the excitation vector 105.
In this embodiment, the pulse vector generating unit 101 is described for the case where it produces N = 3 pulse vectors algebraically according to the rule given in Table 1 below.
Table 1
(Table 1 is reproduced in the original document as an image and is not shown here; it lists the candidate pulse positions and polarities per channel.)
The operation of the excitation vector generator thus constituted is now described. The dispersion pattern selecting unit 102 selects one of the two dispersion patterns stored for each channel and outputs it to the pulse vector dispersion unit 103; a specific index number is assigned to each combination of selected dispersion patterns (the total number of combinations is M^N = 8).
The pulse vector generating unit 101 then algebraically generates as many pulse vectors as there are channels (three in this embodiment) according to the rule given in Table 1.
The pulse vector dispersion unit 103 superimposes, by formula (5), the dispersion patterns selected by the dispersion pattern selecting unit 102 on the pulses generated by the pulse vector generating unit 101, producing a dispersion vector for each channel:

    ci(n) = Σ_{k=0}^{L-1} wij(n-k)·di(k)               (5)

where n = 0 to L-1, and:
L: dispersion vector length
i: channel number
j: dispersion pattern number (j = 1 to M)
ci: dispersion vector of channel i
wij: j-th dispersion pattern of channel i; the vector length of wij(m) is 2L-1 (m = -(L-1) to L-1), and of the 2L-1 elements, Lij elements can be set while the other elements are zero
di: pulse vector of channel i, with di = ±δ(n - pi), n = 0 to L-1
pi: candidate pulse position of channel i
The dispersion vector adder 104 adds, by formula (6), the three dispersion vectors produced by the pulse vector dispersion unit 103 to produce the excitation vector 105:

    c(n) = Σ_{i=1}^{N} ci(n)                           (6)

c: excitation vector
ci: dispersion vector
i: channel number (i = 1 to N)
n: vector element number (n = 0 to L-1, where L is the excitation vector length)
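Formulas (5) and (6) can be sketched concretely. When di is a single signed unit pulse, the convolution of formula (5) reduces to a shifted, signed copy of the pattern; the sketch below keeps only the causal taps of wij for brevity, and the patterns, positions, and polarities are invented toy values:

```python
L = 8  # excitation vector length

def disperse(pattern, pos, sign):
    """Formula (5) with di(k) = ±δ(k - pos): place a signed copy of the
    dispersion pattern starting at the pulse position (causal taps only)."""
    ci = [0.0] * L
    for k in range(L):
        idx = k - pos                       # index into wij
        if 0 <= idx < len(pattern):
            ci[k] += sign * pattern[idx]
    return ci

patterns = [[1.0, 0.5], [1.0, -0.3], [0.8]]   # one selected pattern per channel
pulses = [(0, +1), (3, -1), (5, +1)]          # (position pi, polarity) per channel
channels = [disperse(w, p, s) for w, (p, s) in zip(patterns, pulses)]
excitation = [sum(ch[n] for ch in channels) for n in range(L)]  # formula (6)
```

Changing any selected pattern, pulse position, or polarity yields a different excitation vector, which is what gives the generator its variety.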
The excitation vector generator thus constituted can produce a variety of excitation vectors by varying the combination of dispersion patterns selected by the dispersion pattern selecting unit 102 and the pulse positions and polarities of the pulse vectors produced by the pulse vector generating unit 101.
In such an excitation vector generator, index numbers can be allocated in advance, one-to-one, to the two kinds of information: the combination of dispersion patterns selected by the dispersion pattern selecting unit 102, and the combination (pulse positions and pulse polarities) of the pulse vectors produced by the pulse vector generating unit 101. The dispersion pattern selecting unit 102 can also learn from actual excitation information in advance and store the dispersion patterns obtained as the result of that learning.
If the above excitation vector generator is used in the excitation information generating section of a speech coder/decoder, the noise-excitation information can be transmitted by sending two index numbers: the index of the dispersion pattern combination selected by the dispersion pattern selecting unit, and the index of the pulse vector combination (predeterminable pulse positions and polarities) produced by the pulse vector generating unit.
Moreover, when the excitation vector generating unit constituted as above is used, excitation vectors whose shapes (characteristics) are more similar to actual excitation information can be produced than when pulse excitations generated algebraically are used.
In this embodiment, the case where the dispersion pattern selecting unit 102 stores two kinds of dispersion patterns per channel has been described, but the same actions and effects are obtained when a number of dispersion patterns other than two is allocated to each channel.
Likewise, the pulse vector generating unit 101 has been described as consisting of three channels with the pulse generation rule of Table 1, but the same actions and effects are also obtained with a different number of channels, or with pulse generation rules other than that of Table 1.
In addition, a speech signal communication system or speech signal recording system incorporating the above excitation vector generator or speech coder/decoder obtains the actions and effects that the above excitation vector generator possesses.
(2nd embodiment)
Fig. 4 shows a functional block diagram of the CELP-type speech coder according to this embodiment, and Fig. 5 shows that of the CELP-type speech decoder according to this embodiment.
The CELP-type speech coder of this embodiment uses the excitation vector generator described in the 1st embodiment in the noise codebook of the CELP-type speech coder of Fig. 1; the CELP-type speech decoder of this embodiment uses the same excitation vector generator in the noise codebook of the CELP-type speech decoder of Fig. 2. All processing other than the vector quantization of the noise-excitation information is therefore identical to the devices of Fig. 1 and Fig. 2, and this embodiment is described with that vector quantization processing at its center. As in the 1st embodiment, the number of channels is N = 3, the number of dispersion patterns per channel is M = 2, and the pulse vectors are generated according to Table 1.
The noise-excitation vector quantization processing in the coder of Fig. 4 is the processing of determining the two index numbers (the dispersion pattern combination index and the pulse position/polarity combination index) that maximize the reference value of formula (4).
When the excitation vector generator of Fig. 3 is used as the noise codebook, the dispersion pattern combination index (8 kinds) and the pulse vector combination index (16384 kinds when polarity is taken into account) are determined in a closed loop.
First, the dispersion pattern selecting unit 215 selects one dispersion pattern per channel from the two kinds it stores, and outputs it to the pulse vector dispersion unit 217. Then the pulse vector generating unit 216 algebraically produces as many pulse vectors as there are channels (three in this embodiment) according to the rule of Table 1, and outputs them to the pulse vector dispersion unit 217.
The pulse vector dispersion unit 217 produces a dispersion vector for each channel by the superimposing operation of formula (5) on the dispersion patterns selected by the dispersion pattern selecting unit 215 and the pulse vectors generated by the pulse vector generating unit 216.
The dispersion vector adder 218 adds the dispersion vectors obtained by the pulse vector dispersion unit 217 to generate an excitation vector (which becomes a candidate noise code vector).
The distortion computation unit 206 then computes the value of formula (4) using the candidate noise code vector obtained by the dispersion vector adder 218. This computation is carried out for all combinations of pulse vectors produced under the rule of Table 1, and the dispersion pattern combination index and pulse vector combination index (the combination of pulse positions and polarities) at which the value of formula (4) is maximum, together with that maximum value, are output to the code index determining unit 213.
Next, the dispersion pattern selecting unit 215 selects, from the stored dispersion patterns, a combination different from the one just used. For the newly selected dispersion pattern combination, the value of formula (4) is computed as above for all pulse vector combinations produced by the pulse vector generating unit 216 under the rule of Table 1, and again the dispersion pattern combination index, pulse vector combination index, and maximum value at which formula (4) is maximum are output to the code index determining unit 213.
The above processing is repeated for all combinations selectable from the dispersion patterns stored in the dispersion pattern selecting unit 215 (eight combinations in total in this embodiment).
The code index determining unit 213 compares all eight maximum values computed by the distortion computation unit 206, selects the largest among them, determines the two index numbers (dispersion pattern combination index and pulse vector combination index) that produced that maximum, and outputs them to the coding output unit 214 as the noise code index.
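The closed-loop search just described can be sketched end to end. Everything below is a scaled-down illustration under stated assumptions — two channels, two patterns and two positions each, and no synthesis filtering (the target X is assumed to already be in the filtered domain); none of the names come from the patent:

```python
import itertools

L = 6  # toy excitation length

def disperse(pattern, pos, sign):
    # signed copy of the pattern placed at the pulse position (causal taps)
    ci = [0.0] * L
    for k in range(len(pattern)):
        if pos + k < L:
            ci[pos + k] = sign * pattern[k]
    return ci

def ratio(X, c):
    # formula-(4)-style selection criterion on the (assumed filtered) candidate
    den = sum(v * v for v in c)
    return (sum(x * v for x, v in zip(X, c)) ** 2) / den if den else 0.0

pattern_bank = [[[1.0], [1.0, 0.5]],     # channel 1: M = 2 patterns
                [[1.0], [0.6, -0.6]]]    # channel 2: M = 2 patterns
positions = [[0, 2], [1, 3]]             # candidate pulse positions per channel

X = [1.0, 0.5, 0.0, 0.6, -0.6, 0.0]      # toy target vector
best_score, best_code = -1.0, None
for pats in itertools.product(*pattern_bank):              # pattern combinations
    for poss in itertools.product(*positions):             # position combinations
        for signs in itertools.product((1, -1), repeat=2): # polarity combinations
            chans = [disperse(w, p, s) for w, p, s in zip(pats, poss, signs)]
            c = [sum(col) for col in zip(*chans)]          # formula (6)
            sc = ratio(X, c)
            if sc > best_score:
                best_score, best_code = sc, (pats, poss, signs)
```

At real scale the same triple loop runs over 8 pattern combinations and 16384 pulse position/polarity combinations, which is why the patent reports only the per-combination maxima to the code index determining unit.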
On the other hand, in the speech decoder of Fig. 5, the coding input unit 301 receives the code sent from the speech coder (Fig. 4), decomposes it into the linear prediction code index, adaptive code index, noise code index (composed of the two indices: dispersion pattern combination index and pulse vector combination index), and weighting code index, and outputs the decomposed indices to the linear predictor coefficient decoding unit 302, adaptive codebook 303, noise codebook 304, and weighting codebook 305, respectively.
Of the noise code index, the dispersion pattern combination index is output to the dispersion pattern selecting unit 311, and the pulse vector combination index to the pulse vector generating unit 312.
The linear predictor coefficient decoding unit 302 then decodes the linear prediction code index to obtain the synthesis filter coefficients and outputs them to the synthesis filter 309. In the adaptive codebook 303, the adaptive code vector is read from the position corresponding to the adaptive code index.
In the noise codebook 304, the dispersion pattern selecting unit 311 reads out, for each channel, the dispersion pattern corresponding to the dispersion pattern combination index and outputs it to the pulse vector dispersion unit 313; the pulse vector generating unit 312 produces as many pulse vectors as there are channels, corresponding to the pulse vector combination index, and outputs them to the pulse vector dispersion unit 313; the pulse vector dispersion unit 313 produces dispersion vectors by the superimposing operation of formula (5) on the dispersion patterns received from the dispersion pattern selecting unit 311 and the pulse vectors received from the pulse vector generating unit 312, and outputs them to the dispersion vector adder 314. The dispersion vector adder 314 adds the per-channel dispersion vectors produced by the pulse vector dispersion unit 313 to produce the noise code vector.
The adaptive code gain and noise code gain corresponding to the weighting code index are read from the weighting codebook 305; the adaptive code vector weighting unit 306 multiplies the adaptive code vector by the adaptive code gain and, likewise, the noise code vector weighting unit 307 multiplies the noise code vector by the noise code gain, delivering both to the adder 308.
The adder 308 adds the two gain-weighted code vectors to generate the driving excitation vector, and outputs this driving excitation vector to the adaptive codebook 303 to update its buffer, and also to the synthesis filter 309 to drive the synthesis filter.
Driven by the driving excitation vector obtained from the adder 308, the synthesis filter 309 reproduces the synthetic speech 310. The adaptive codebook 303 updates its buffer with the driving excitation vector received from the adder 308.
But, dispersal pattern bank select bit among Fig. 4 and Fig. 5 is taken as is used as the distortion computation reference type of the source of sound vector gained formula (7) of substitution formula (6) record among the C in the formula (2) as cost function, and study in advance, after making the value of this cost function less, the dispersal pattern of study gained is stored by each passage.
Pass through aforesaid operations, can generate the similar source of sound vector of shape of shape and actual noise source information (vector X in the formula (4)), thereby with the noise code book in adopt the CELP voice coder/demoder of algebraically source of sound vector generation unit to compare, can obtain the high synthetic speech of quality. Ec = | | X - gcH Σ i = 1 N Ci | | 2 = Σ n = 0 L - 1 ( X ( n ) - gcH Σ i = 1 N Ci ( n ) ) 2 = Σ n = 0 L - 1 ( X ( n ) - gcH Σ i = 1 N Σ k = 0 L - 1 Wij ( n - k ) di ( k ) ) 2 . . . . . ( 7 )
X: target vector for determining the noise code number
gc: noise code gain
H: impulse response convolution matrix of the synthesis filter
C: noise code vector
i: channel number (i = 1 to N)
j: dispersal pattern number (j = 1 to M)
ci: diffusion vector of channel i
wij: j-th dispersal pattern of channel i
di: pulse vector of channel i
L: excitation vector length (n = 0 to L−1)
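Under the same notation, formula (7) can be read numerically as follows: the candidate excitation (the channel diffusion vectors summed) is passed through the synthesis filter, scaled by the noise code gain, and compared with the target. The impulse response and vectors below are toy data, and the filter H is applied as a simple truncated convolution; this is a sketch, not the patent's implementation.

```python
# Toy numeric reading of formula (7): sum the channel diffusion vectors, apply
# the synthesis filter as a truncated convolution with impulse response h,
# scale by the noise code gain gc, and accumulate the squared error against X.

def synth(h, v):
    """Convolve excitation v with impulse response h, truncated to len(v)."""
    return [sum(h[k] * v[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(v))]

def distortion_ec(X, gc, h, channel_vectors):
    c = [sum(ci[n] for ci in channel_vectors) for n in range(len(X))]
    s = synth(h, c)
    return sum((x - gc * y) ** 2 for x, y in zip(X, s))

X = [1.0, 0.5, 0.0, -0.2]
h = [1.0, 0.3]
channels = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.5, 0.0]]
Ec = distortion_ec(X, 1.0, h, channels)
```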
This example described the case where each channel stores in advance M dispersal patterns obtained by prior learning so that the cost value of formula (7) becomes small. In practice, however, not all M dispersal patterns need be obtained by learning; as long as each channel stores in advance at least one learned dispersal pattern, the effect of improving the synthetic speech quality is still obtained.
The case described in this example determines by closed-loop search the combination number that maximizes the reference value of formula (4), over all combinations of the dispersal patterns stored in the dispersal pattern selector and all combinations of the candidate pulse positions generated by the pulse vector generation unit. The same effects are also obtained, however, when a preselection is made using parameters determined before the noise code number (the ideal gain of the adaptive code vector, etc.), or when an open-loop search is used.
In addition, by configuring a speech signal communication system or speech signal recording system with the above speech coder/decoder, the effects of the excitation vector generator described in the 1st example are obtained.
(the 3rd example)
Fig. 6 shows the functional block diagram of the CELP speech coder of this example. This example uses the excitation vector generator of the above 1st example in the noise codebook of a CELP speech coder, and preselects among the dispersal patterns stored in the dispersal pattern selector using the ideal adaptive code gain value obtained before the noise codebook search. Apart from the periphery of the noise codebook, everything is identical to the CELP speech coder of Fig. 4; therefore this example describes the noise excitation information vector quantization in the CELP speech coder of Fig. 6.
This CELP speech coder has adaptive codebook 407, adaptive code gain weighting unit 409, noise codebook 408 composed of the excitation vector generator described in example 1, noise code gain weighting unit 410, synthesis filter 405, distortion calculation unit 406, code determination unit 413, dispersal pattern selector 415, pulse vector generation unit 416, pulse vector diffusion unit 417, diffusion vector adder 418 and adaptive gain judgment unit 419.
In this example, however, at least one of the M dispersal patterns (M ≥ 2) stored in the above dispersal pattern selector 415 is a pattern obtained beforehand by learning so that the quantization distortion produced when the noise excitation information vector is quantized becomes small.
For ease of explanation, this example sets the number of channels N of the pulse vector generation unit to 3 and the number M of dispersal patterns stored per channel in the dispersal pattern selector to 2, and describes the following case: one of the M dispersal patterns (M = 2) is the pattern obtained by the above learning, and the other is a random vector sequence (hereinafter called a random pattern) generated by a random number generator. Incidentally, the dispersal pattern obtained by learning, like W11 in Fig. 3, is clearly a comparatively short, pulse-like dispersal pattern.
In the CELP speech coder of Fig. 6, the processing that determines the adaptive code number precedes the noise excitation information vector quantization. Therefore, at the moment the noise excitation information vector quantization is performed, the adaptive codebook vector number (adaptive code number) and the (tentatively determined) ideal adaptive code gain can be referred to. In this example, this ideal adaptive code gain value is used for the preselection of the diffusion pulses.
Specifically, immediately after the adaptive codebook search finishes, code determination unit 413 outputs the ideal adaptive code gain value it holds to distortion calculation unit 406. Distortion calculation unit 406 outputs the adaptive code gain received from code determination unit 413 to adaptive gain judgment unit 419.
Adaptive gain judgment unit 419 compares the ideal adaptive gain value received from distortion calculation unit 406 with a preset threshold. Then, according to the result of this magnitude comparison, adaptive gain judgment unit 419 sends a preselection control signal to dispersal pattern selector 415. When the adaptive code gain is large, the control signal indicates selecting the pattern learned beforehand so that the quantization distortion produced in quantizing the noise excitation information vector becomes small; when the adaptive code gain is small, it indicates preselecting a dispersal pattern different from the one obtained by learning.
As a result, dispersal pattern selector 415 can preselect among the M dispersal patterns (M = 2) stored per channel according to the magnitude of the adaptive gain, which greatly reduces the number of dispersal pattern combinations. Consequently, the distortion need not be calculated for every combination of dispersal patterns, and the vector quantization of the noise excitation information can be carried out efficiently with a small amount of computation.
Moreover, the shape of the noise code vector is pulse-like when the adaptive gain value is large (strongly voiced) and random when the adaptive gain value is small (weakly voiced). Noise code vectors of suitable shape can thus be used for the voiced and unvoiced regions of the speech signal respectively, so the quality of the synthetic speech can be improved.
For ease of explanation, this example was limited to the case where the number of channels N of the pulse vector generation unit is 3 and the number M of diffusion pulse kinds stored per channel in the dispersal pattern selector is 2. The same effects are obtained, however, even when the number of channels of the pulse vector generation unit and the number of dispersal patterns per channel in the dispersal pattern selector differ from the above.
Also for ease of explanation, this example assumed that of the M dispersal patterns (M = 2) stored per channel, one is the pattern obtained by the above learning and the other is a random pattern. As long as each channel stores in advance at least one dispersal pattern obtained by learning, the same effects can be expected even outside this particular case.
This example described the case where the magnitude information of the adaptive code gain is used as the means of preselecting dispersal patterns; if parameters expressing the short-time features of the speech signal are used in combination with the adaptive gain magnitude information, further effects can be expected.
In addition, by configuring a speech signal communication system or speech signal recording system with the above speech coder, the effects of the excitation vector generator described in example 1 are obtained.
Moreover, this example described a method of preselecting dispersal patterns using the ideal adaptive excitation gain of the currently processed frame, which can be referred to at the moment of noise excitation information quantization. The same structure and the same effect are also obtained if, instead of the ideal adaptive excitation gain of the present frame, the decoded adaptive excitation gain obtained for the immediately preceding frame is used.
(the 4th example)
Fig. 7 is the functional block diagram of the CELP speech coder of this example. This example uses the excitation vector generator of the 1st example in the noise codebook of the CELP speech coder, and preselects among the dispersal patterns stored in the dispersal pattern selector using information available at the moment the noise excitation information vector is quantized. The preselection criterion is characterized by the magnitude of the coding distortion (expressed as an S/N ratio) produced when the adaptive code number is determined.
Apart from the periphery of the noise codebook, everything is identical to the CELP speech coder of Fig. 4; therefore this example describes the vector quantization of the noise excitation information in detail.
As shown in Fig. 7, the CELP speech coder of this example has adaptive codebook 507, adaptive code gain weighting unit 509, noise codebook 508 composed of the excitation vector generator described in the 1st example, noise code gain weighting unit 510, synthesis filter 505, distortion calculation unit 506, code determination unit 513, dispersal pattern selector 515, pulse vector generation unit 516, pulse vector diffusion unit 517, diffusion vector adder 518 and distortion power judgment unit 519.
In this example, however, at least one of the M dispersal patterns (M ≥ 2) stored in the above dispersal pattern selector 515 is a random pattern.
For ease of explanation, this example takes the number of channels N of the pulse vector generation unit to be 3 and the number M of dispersal patterns stored per channel in the dispersal pattern selector to be 2, and assumes that of the M dispersal patterns (M = 2) one is a random pattern and the other is a pattern obtained beforehand by learning so that the quantization distortion produced in quantizing the noise excitation information vector becomes small.
In the CELP speech coder of Fig. 7, the processing that determines the adaptive code number precedes the noise excitation information vector quantization. Therefore, at the moment the noise excitation vector quantization is performed, the adaptive codebook vector number (adaptive code number), the (tentatively determined) ideal adaptive code gain, and the target vector for the adaptive codebook search can be referred to. This example uses the adaptive codebook coding distortion (expressed as an S/N ratio), which can be calculated from these three pieces of information, to preselect the dispersal patterns.
Specifically, immediately after the adaptive codebook search finishes, code determination unit 513 outputs the values of the adaptive code number and the adaptive code gain (ideal gain) it holds to distortion calculation unit 506. Using the adaptive code number and adaptive code gain received from code determination unit 513, together with the target vector of the adaptive codebook search, distortion calculation unit 506 calculates the coding distortion (S/N ratio) produced by determining the adaptive code number, and outputs the calculated S/N ratio to distortion power judgment unit 519.
Distortion power judgment unit 519 first compares the S/N ratio received from distortion calculation unit 506 with a preset threshold. Then, according to the result of this magnitude comparison, distortion power judgment unit 519 sends a preselection control signal to dispersal pattern selector 515. When the S/N ratio is large, the control signal indicates selecting the pattern learned beforehand so that the coding distortion produced in encoding the target vector of the noise codebook search becomes small; when the S/N ratio is small, it indicates selecting the random pattern.
As a result, dispersal pattern selector 515 preselects only one of the M dispersal patterns (M = 2) stored per channel, which greatly reduces the combinations of dispersal patterns. The distortion therefore need not be calculated for every combination of dispersal patterns, and the noise code number can be determined efficiently with a small amount of computation. Moreover, the shape of the noise code vector is pulse-like when the S/N ratio is large and random when it is small. The shape of the noise code vector can thus be made to change with the short-time features of the speech signal, so the quality of the synthetic speech can be improved.
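For illustration, the criterion of this example can be sketched as computing the S/N ratio left after the gain-scaled adaptive contribution is subtracted from the target, then thresholding it; the threshold and all signal values below are assumptions, not values from the patent.

```python
# Illustrative computation of this example's criterion: the S/N ratio of the
# adaptive-codebook stage (target energy over the residual energy left after
# subtracting the gain-scaled adaptive contribution), thresholded to preselect
# a dispersal pattern.
import math

def adaptive_snr_db(target, adaptive_contrib, gain):
    sig = sum(x * x for x in target)
    err = sum((x - gain * y) ** 2 for x, y in zip(target, adaptive_contrib))
    return 10.0 * math.log10(sig / err)

def preselect(snr_db, threshold_db=6.0):  # hypothetical threshold
    return "learned" if snr_db >= threshold_db else "random"

snr = adaptive_snr_db([1.0, 1.0], [1.0, 0.9], 1.0)
```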
For ease of explanation, this example was limited to the case where the number of channels N of the pulse vector generation unit is 3 and the number M of diffusion pulse kinds stored per channel in the dispersal pattern selector is 2. The same effects are obtained, however, even when the number of channels of the pulse vector generation unit and the kinds of dispersal patterns per channel differ from the above.
Again for ease of explanation, this example assumed that of the M dispersal patterns (M = 2) stored per channel, one is the pattern obtained by the above learning and the other is a random pattern. As long as each channel stores in advance at least one random pattern, the same effects can be expected even outside this particular case.
Although this example uses only the magnitude information of the coding distortion (expressed as an S/N ratio) produced by determining the adaptive code number as the means of preselecting dispersal patterns, further effects can be expected if information expressing the short-time features of the speech signal more accurately is used in combination.
In addition, by configuring a speech signal communication system or speech signal recording system with the above speech coder, the effects of the excitation vector generator described in the 1st example are obtained.
(The 5th example)
Fig. 8 shows the functional block diagram of the CELP speech coder of the 5th example of the present invention. In this CELP speech coder, LPC analysis unit 600 performs autocorrelation analysis and LPC analysis on input speech data 601 to obtain LPC coefficients. The obtained LPC coefficients are encoded to obtain an LPC code, and the obtained LPC code is decoded to obtain decoded LPC coefficients.
Next, excitation generation unit 602 takes out the excitation samples stored in adaptive codebook 603 and noise codebook 604 (called the adaptive code vector (or adaptive excitation) and the noise code vector (or noise excitation)) and sends each of them to LPC synthesis unit 605.
LPC synthesis unit 605 filters the two excitations obtained from excitation generation unit 602 with the decoded LPC coefficients obtained by LPC analysis unit 600, thereby obtaining two synthetic speeches.
Comparison unit 606 analyzes the relation between the two synthetic speeches obtained by LPC synthesis unit 605 and input speech 601, finds the optimum values (optimum gains) for the two synthetic speeches, adds the synthetic speeches after adjusting their powers with the optimum gains to obtain a total synthetic speech, and calculates the distance between this total synthetic speech and the input speech.
Further, for all excitation samples of adaptive codebook 603 and noise codebook 604, the distance between input speech 601 and each of the many synthetic speeches obtained by driving excitation generation unit 602 and LPC synthesis unit 605 is calculated, and the index of the excitation sample giving the smallest of the resulting distances is found.
The obtained optimum gains, the excitation sample indexes, and the two excitations corresponding to those indexes are sent to parameter coding unit 607. Parameter coding unit 607 encodes the optimum gains to obtain a gain code, and sends it to transmission line 608 together with the LPC code and the excitation sample indexes.
An actual excitation signal is generated from the two excitations corresponding to the gain code and the indexes and is stored in adaptive codebook 603, while the old excitation sample is discarded at the same time.
Moreover, LPC synthesis unit 605 usually uses in combination a perceptual weighting filter employing linear prediction coefficients, a high-frequency emphasis filter, and long-term prediction coefficients (obtained by long-term prediction analysis of the input speech). The excitation search of the adaptive codebook and noise codebook is generally carried out in intervals (called subframes) into which the analysis interval is further divided.
Next, the vector quantization of the LPC coefficients in LPC analysis unit 600 of this example is described in detail.
Fig. 9 shows a functional block diagram of the vector quantization algorithm carried out in LPC analysis unit 600. The vector quantization block of Fig. 9 comprises target extraction unit 702, quantization unit 703, distortion calculation unit 704, comparison unit 705, decoded vector storage unit 707 and vector smoothing unit 708.
Target extraction unit 702 calculates the quantization target from input vector 701. The method of extracting the target is now described in detail.
The input vector in this example consists of two kinds of vectors: the parameter vector obtained by analyzing the frame to be encoded, and the parameter vector similarly obtained by analyzing a future frame. Target extraction unit 702 calculates the quantization target using the above input vector and the previous-frame decoded vector stored in decoded vector storage unit 707. Formula (8) shows an example of the calculation method:
X(i) = {S_t(i) + p·(d(i) + S_{t+1}(i))/2} / (1 + p)    (8)
X(i): target vector
i: vector element number
S_t(i), S_{t+1}(i): input vectors
t: time (frame number)
p: weighting coefficient (fixed)
d(i): previous-frame decoded vector
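A direct element-wise implementation of formula (8) might look as follows; the vectors and the value of p are toy data (the text later recommends 0.5 < p < 1.0).

```python
# Element-wise implementation of formula (8): pull the present-frame parameter
# vector toward the midpoint of the previous decoded vector and the
# future-frame parameter vector, with fixed weight p. Toy values only.

def make_target(s_t, s_next, d_prev, p=0.7):
    return [(s_t[i] + p * (d_prev[i] + s_next[i]) / 2.0) / (1.0 + p)
            for i in range(len(s_t))]

X = make_target([1.0, 0.0], [0.5, 0.5], [0.5, -0.5], p=1.0)
```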
The idea behind the above target extraction method is as follows. In typical vector quantization, the parameter vector S_t(i) of the present frame is itself used as the target X(i), and matching is performed with formula (9):
En = Σ_{i=0}^{I−1} (X(i) − Cn(i))²    (9)
En: distance from the n-th code vector
X(i): quantization target
Cn(i): code vector
n: code vector number
i: vector element number
I: vector length
In this kind of vector quantization, therefore, the coding distortion is directly tied to degradation of the sound quality. Even when countermeasures such as predictive vector quantization are taken, a certain degree of coding distortion is unavoidable in ultra-low-bit-rate coding, and this becomes a serious problem.
Therefore, this example focuses on the midpoint of the preceding and following decoded vectors as a direction in which errors are perceptually hard to detect, and derives the decoded vector toward that point, thereby realizing a perceptual improvement. This exploits the property that, when the interpolation characteristic of the parameter vectors is good, a degradation of temporal continuity is hard to hear. This situation is explained below with reference to Fig. 10, which depicts the vector space.
First, let the decoded vector of the previous frame be d(i) and the following parameter vector be S_{t+1}(i) (the future decoded vector would actually be preferable, but it cannot be encoded within the present frame, so the parameter vector substitutes for it). Although code vector Cn(i):(1) is closer to the parameter vector S_t(i) than code vector Cn(i):(2), Cn(i):(2) lies almost on the line connecting d(i) and S_{t+1}(i) and is therefore less likely to produce audible degradation than Cn(i):(1). Exploiting this property, if the target X(i) is taken as a vector moved from S_t(i) to some degree toward the midpoint of d(i) and S_{t+1}(i), the decoded vector is guided in a direction where the perceptual distortion is small.
In this example, this movement of the target is realized by introducing the evaluation of the following formula (10):
E = Σ_i (X(i) − S_t(i))² + p·Σ_i {X(i) − (d(i) + S_{t+1}(i))/2}²    (10)
X(i): quantization target vector
i: vector element number
S_t(i), S_{t+1}(i): input vectors
t: time (frame number)
p: weighting coefficient (fixed)
d(i): previous-frame decoded vector
The first half of formula (10) is the ordinary vector quantization evaluation measure, and the latter half is the perceptual weighting component. To quantize under this evaluation measure, the formula is differentiated with respect to each X(i) and the result is set to 0, which yields formula (8).
The weighting coefficient p is a positive constant: when its value is 0 the quantization is identical to ordinary vector quantization, and as it approaches infinity the target moves completely to the midpoint. If p is very large, the target departs greatly from the parameter vector S_t(i) of the present frame, and the perceived clarity declines. Listening experiments on decoded speech signals confirmed that good performance is obtained for 0.5 < p < 1.0.
Quantization unit 703 quantizes the quantization target obtained by target extraction unit 702 to obtain the code index, also finds the decoded vector, and sends both to distortion calculation unit 704.
This example uses predictive vector quantization as the quantization method, which is now described.
Fig. 11 shows a functional block diagram of predictive vector quantization. Predictive vector quantization is an algorithm that predicts the target using vectors obtained by past coding and decoding (synthesized vectors) and vector-quantizes the prediction error.
Vector codebook 800, which stores a number of samples (code vectors) of the prediction error vectors, is generated in advance. It is usually generated by applying the LBG algorithm (IEEE Transactions on Communications, Vol. COM-28, No. 1, pp. 84-95, January 1980) to a large number of vectors obtained by analyzing a large amount of speech data.
Prediction unit 802 performs prediction on quantization target vector 801. The prediction is carried out using the past synthesized vectors stored in state storage unit 803, and the resulting prediction error vector is sent to distance calculation unit 804. Here, as the form of the prediction, a first-order prediction with a fixed coefficient is taken as an example. The prediction error vector for this prediction is calculated by the following formula (11):
Y(i) = X(i) − β·D(i)    (11)
Y(i): prediction error vector
X(i): quantization target
β: prediction coefficient (scalar)
D(i): synthesized vector of the previous frame
i: vector element number
The value of the prediction coefficient β in the above formula is generally 0 < β < 1.
Distance calculation unit 804 calculates the distance between the prediction error vector obtained by prediction unit 802 and each code vector stored in vector codebook 800. Formula (12) shows the distance calculation:
En = Σ_{i=0}^{I−1} (Y(i) − Cn(i))²    (12)
En: distance from the n-th code vector
Y(i): prediction error vector
Cn(i): code vector
n: code vector number
i: vector element number
I: vector length
Search unit 805 compares the distances to the code vectors and outputs the number of the code vector with the minimum distance as code index 806. That is, it controls vector codebook 800 and distance calculation unit 804 to find, among all the code vectors stored in vector codebook 800, the number of the code vector with the minimum distance, and takes that number as code index 806.
Then, from the final code index, vector decoding is carried out using the code vector obtained from vector codebook 800 and the past decoded vector stored in state storage unit 803, and the content of state storage unit 803 is updated with the resulting synthesized vector. The vector decoded here is thus used for the prediction at the next encoding.
Decoding in the above example of the prediction form (first order, fixed coefficient) uses the following formula (13):
Z(i) = CN(i) + β·D(i)    (13)
Z(i): decoded vector (used as D(i) at the next encoding)
N: code index
CN(i): code vector
β: prediction coefficient (scalar)
D(i): synthesized vector of the previous frame
i: vector element number
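Formulas (11) to (13) together describe one encode/decode round trip of first-order, fixed-coefficient predictive VQ; a minimal sketch follows, under an assumed β and a tiny illustrative codebook (both are assumptions, not values from the patent).

```python
# One round trip of first-order, fixed-coefficient predictive VQ: subtract
# beta times the previous synthesized vector (formula (11)), nearest-neighbour
# search (formula (12)), then decode and update the state (formula (13)).

BETA = 0.5  # assumed prediction coefficient, 0 < BETA < 1

def pvq_encode(x, d_prev, codebook):
    err = [xi - BETA * di for xi, di in zip(x, d_prev)]           # formula (11)
    dists = [sum((e - c) ** 2 for e, c in zip(err, cv)) for cv in codebook]
    return dists.index(min(dists))                                # formula (12)

def pvq_decode(n, d_prev, codebook):
    return [c + BETA * di for c, di in zip(codebook[n], d_prev)]  # formula (13)

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
state = [0.4, 0.4]
idx = pvq_encode([1.1, 0.2], state, codebook)
state = pvq_decode(idx, state, codebook)  # becomes D(i) for the next frame
```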
The decoder, on the other hand, decodes by finding the code vector from the transmitted code index. The decoder is provided in advance with the same vector codebook and state storage unit as the coder, and decodes with an algorithm identical to the decoding function of the search unit in the above coding algorithm. The above is the vector quantization carried out in quantization unit 703.
Distortion calculation unit 704 calculates the perceptually weighted coding distortion from the decoded vector obtained by quantization unit 703, input vector 701, and the previous-frame decoded vector stored in decoded vector storage unit 707. The calculation is given by the following formula (14):
Ew = Σ_i (V(i) − S_t(i))² + p·Σ_i {V(i) − (d(i) + S_{t+1}(i))/2}²    (14)
Ew: weighted coding distortion
S_t(i), S_{t+1}(i): input vectors
t: time (frame number)
i: vector element number
V(i): decoded vector
p: weighting coefficient (fixed)
d(i): previous-frame decoded vector
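Formula (14) can be implemented element-wise as below; the decoded vector, the inputs and p are toy values chosen for illustration.

```python
# Element-wise implementation of formula (14): ordinary squared error plus a
# p-weighted penalty for deviating from the midpoint of the previous decoded
# vector and the future parameter vector.

def weighted_distortion(v, s_t, s_next, d_prev, p=0.7):
    ew = 0.0
    for i in range(len(v)):
        mid = (d_prev[i] + s_next[i]) / 2.0
        ew += (v[i] - s_t[i]) ** 2 + p * (v[i] - mid) ** 2
    return ew

Ew = weighted_distortion([0.5], [0.6], [0.4], [0.4], p=1.0)
```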
In formula (14), the weighting coefficient p is the same as the coefficient of the target calculation formula used in target extraction unit 702. The weighted coding distortion value, the decoded vector and the code index are sent to comparison unit 705.
Comparison unit 705 sends the code index received from distortion calculation unit 704 to transmission line 608, and updates the content of decoded vector storage unit 707 with the decoded vector received from distortion calculation unit 704.
According to the above example, target extraction unit 702 modifies the target vector from S_t(i) to a vector positioned to some degree toward the midpoint of d(i) and S_{t+1}(i), so that a weighted search can be performed without perceptible degradation.
So far, the application of the present invention to low-bit-rate speech coding technology used in portable phones and the like has been described, but the present invention is not limited to speech signal coding: it can also be used in music coders, and in image coders in which the interpolation characteristic of the parameter vectors is good.
In the above algorithm, the LPC coding in the LPC analysis unit normally transforms the coefficients into parameter vectors convenient for coding, such as the commonly used LSPs (line spectrum pairs), and performs vector quantization (VQ) using the Euclidean distance or a weighted Euclidean distance.
In this example, target extraction unit 702 is controlled by comparison unit 705: it sends input vector 701 to vector smoothing unit 708, receives the input vector modified in vector smoothing unit 708, and extracts the target again.
At that point, comparison unit 705 compares the weighted coding distortion value sent from distortion calculation unit 704 with an internally prepared reference value. The processing divides into two cases according to this comparison.
If the distortion is below the reference value, the code index sent from distortion calculation unit 704 is sent to the transmission line, and the content of decoded vector storage unit 707 is updated with the decoded vector sent from distortion calculation unit 704; this update is performed by overwriting the content of decoded vector storage unit 707 with the obtained decoded vector. Processing then moves on to the parameter coding of the next frame.
If, conversely, the distortion is at or above the reference value, vector smoothing unit 708 is controlled to modify the input vector, and target extraction unit 702, quantization unit 703 and distortion calculation unit 704 are operated once more to carry out re-encoding.
The encoding is repeated until the distortion falls below the reference value in comparison unit 705. Since, however, it may never fall below the reference value no matter how many times the encoding is repeated, comparison unit 705 contains an internal counter: it counts the number of times the distortion is judged at or above the reference value, and when this count exceeds a certain number it stops the repetition, carries out the same processing as in the below-reference case, and clears the counter.
Vector smoothing unit 708, under the control of comparison unit 705, modifies the present-frame parameter vector S_t(i), which is one of the input vectors, using the input vector obtained from target extraction unit 702 and the previous-frame decoded vector obtained from decoded vector storage unit 707, according to the following formula (15), and sends the modified input vector to target extraction unit 702:
S_t(i) ← (1 − q)·S_t(i) + q·(d(i) + S_{t+1}(i))/2    (15)
Here q is the smoothing coefficient, which expresses how far the present-frame parameter vector is moved toward the midpoint of the previous-frame decoded vector and the future-frame parameter vector. Coding experiments confirmed that good performance is obtained when 0.2 < q < 0.4 and the upper limit of the internal repetition count of comparison unit 705 is 5 to 8.
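The retry loop around formula (15) (smooth, re-encode, compare against the reference value, bounded by the repetition counter) can be sketched as follows; the stand-in distortion function, reference value and iteration limit are assumptions for illustration, not the patent's quantize-and-measure step.

```python
# Sketch of the retry loop: while the measured distortion stays at or above the
# reference value, smooth the present-frame vector toward the midpoint
# (formula (15)) and re-encode, bounded by a repetition counter.

def smooth(s_t, s_next, d_prev, q=0.3):  # formula (15); 0.2 < q < 0.4 per text
    return [(1 - q) * s_t[i] + q * (d_prev[i] + s_next[i]) / 2.0
            for i in range(len(s_t))]

def encode_with_smoothing(s_t, s_next, d_prev, distortion, ref=0.1, max_iter=6):
    for _ in range(max_iter):
        if distortion(s_t) <= ref:
            break
        s_t = smooth(s_t, s_next, d_prev)
    return s_t

# Stand-in for the actual quantize-and-measure step: distance from 0.5.
dist = lambda v: abs(v[0] - 0.5)
out = encode_with_smoothing([1.0], [0.5], [0.5], dist)
```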
Although this example uses predictive vector quantization in quantization unit 703, the above smoothing makes it likely that the weighted coding distortion obtained by distortion calculation unit 704 becomes small, because the smoothing brings the quantization target closer to the previous-frame decoded vector. With the repeated encoding controlled by comparison unit 705, therefore, the probability that the distortion falls below the reference value in the comparison of comparison unit 705 increases.
The decoder is provided in advance with a decoding unit corresponding to the quantization unit of the coder, and decodes from the code index sent over the transmission line.
This example was also applied to the LSP parameter quantization (quantization unit: predictive VQ) appearing in CELP coding, and coding and decoding experiments on speech signals were carried out. The results confirmed not only a perceptual improvement of the sound quality but also an improvement of the objective value (S/N ratio). This is the effect of suppressing the coding distortion of predictive VQ, even when the spectrum changes abruptly, by the repeated encoding with vector smoothing. Conventional predictive VQ has the drawback that, since the prediction is based on past synthesized vectors, the spectral distortion becomes larger, not smaller, at points of abrupt spectral change such as speech onsets. With this example, however, smoothing is performed while the distortion is large, until the distortion becomes small; although the target then departs somewhat from the actual parameter vector, the coding distortion becomes smaller, and the overall degradation at speech signal decoding is reduced. According to this example, therefore, not only is the sound quality perceptually improved, but the objective value is improved as well.
In this example, by virtue of the comparison unit and the vector smoothing unit, when the vector quantization distortion is large its direction of degradation can be steered toward directions that are less perceptible to the ear; and when the quantization unit uses predictive vector quantization, carrying out smoothing plus repeated coding until the coding distortion becomes small also improves the objective value.
So far, the application of the present invention to low-bit-rate speech coding technology used in portable phones and the like has been described, but the present invention is not limited to speech signal coding: it can also be used in music coders, and in image coders in which the interpolation characteristic of the parameter vectors is good.
(the 6th example)
The following describes the CELP type speech coder of the sixth example of the present invention. This example is identical in structure to the fifth example above, except that the quantization algorithm of the quantifying unit uses multistage predictive vector quantization; the noise codebook uses the excitation vector generator of the first example. The quantization algorithm of the quantifying unit is now described in detail.
Figure 12 shows the functional block diagram of the quantifying unit. In multistage vector quantization, the target vector is first quantized; the resulting code word is decoded with the codebook, the difference between the decoded vector and the original target (called the coding distortion vector) is computed, and this coding distortion vector is then quantized in turn.
Vector codebook 899 and vector codebook 900, which hold a plurality of representative samples (code vectors) of prediction error vectors, are generated and stored in advance. They are generated from a large number of training prediction error vectors with the same codebook design algorithm as typical multistage vector quantization: from many vectors obtained by analyzing a large amount of speech data, the codebooks are usually generated with the LBG algorithm (IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-28, NO. 1, pp. 84-95, JANUARY 1980). The training set for vector codebook 899 is a set of many quantization targets, while the training set for vector codebook 900 is the set of coding distortion vectors produced when those quantization targets are encoded with codebook 899.
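The relation between the two training sets can be sketched as follows: codebook 900 is trained on the residuals left after encoding the targets with codebook 899. A few Lloyd iterations stand in here for the LBG design cited above; all names and sizes are illustrative assumptions.

```python
# Hedged sketch: two-stage codebook training, with Lloyd iterations as a
# stand-in for the full LBG (splitting) design.
import random

def nearest(v, book):
    # Index of the codebook entry closest to v in squared distance.
    return min(range(len(book)),
               key=lambda n: sum((a - b) ** 2 for a, b in zip(v, book[n])))

def lloyd(train, size, iters=10, seed=0):
    rng = random.Random(seed)
    book = [list(v) for v in rng.sample(train, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in train:
            cells[nearest(v, book)].append(v)
        for n, cell in enumerate(cells):
            if cell:  # keep the old centroid for empty cells
                book[n] = [sum(x) / len(cell) for x in zip(*cell)]
    return book

def train_two_stage(targets, size1, size2):
    book1 = lloyd(targets, size1)                      # plays the role of 899
    residuals = [[a - b for a, b in zip(v, book1[nearest(v, book1)])]
                 for v in targets]                     # coding distortion vectors
    book2 = lloyd(residuals, size2)                    # plays the role of 900
    return book1, book2
```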
First, prediction unit 902 performs prediction on quantization target vector 901. The prediction uses past decoded vectors stored in state storage unit 903, and the resulting prediction error vector is sent to distance calculation units 904 and 905.
In this example, prediction with a fixed coefficient and a prediction order of 1 is taken as the prediction form. For this prediction, the prediction error vector is given by formula (16).

Y(i) = X(i) − β·D(i)    (16)

Y(i): prediction error vector
X(i): quantization target
β: prediction coefficient (scalar)
D(i): decoded vector of the previous frame
i: vector element number

In the above formula, the value of the prediction coefficient β is typically 0<β<1.
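Formula (16) is a direct element-wise computation; a minimal sketch (the value of beta is chosen arbitrarily within (0, 1), and the function name is an assumption):

```python
# Direct transcription of formula (16): first-order, fixed-coefficient
# prediction error.

def prediction_error(x, d, beta=0.5):
    # Y(i) = X(i) - beta * D(i)
    return [xi - beta * di for xi, di in zip(x, d)]
```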
Distance calculation unit 904 calculates the distance between the prediction error vector obtained by prediction unit 902 and each code vector A stored in vector codebook 899. Formula (17) gives the distance calculation.

En = Σ_{i=0…I−1} (Y(i) − C1n(i))²    (17)

En: distance to code vector A number n
Y(i): prediction error vector
C1n(i): code vector A
n: number of code vector A
i: vector element number
I: vector length
Retrieval unit 906 compares the distances to the code vectors A and takes the number of the code vector A with minimum distance as the code of code vector A. That is, it controls vector codebook 899 and distance calculation unit 904 to find, among all stored code vectors, the number of the code vector A with minimum distance, and takes that number as the code of code vector A. It then sends the code of code vector A, together with the decoded vector A obtained from vector codebook 899 by reference to that code, to distance calculation unit 905, and also sends the code of code vector A to the transmission line and to retrieval unit 907.
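The first-stage search over formula (17) amounts to a nearest-neighbour scan; a minimal sketch with assumed names:

```python
# Sketch of the first-stage search: formula (17) distance against every
# code vector A, returning the minimizing index as the stage-1 code.

def search_stage1(y, codebook_a):
    # En = sum_i (Y(i) - C1n(i))^2, minimized over n
    def dist(n):
        return sum((yi - ci) ** 2 for yi, ci in zip(y, codebook_a[n]))
    n_best = min(range(len(codebook_a)), key=dist)
    return n_best, codebook_a[n_best]
```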
Distance calculation unit 905 obtains the coding distortion vector from the prediction error vector and the decoded vector A received from retrieval unit 906; it also refers to the code of code vector A received from retrieval unit 906 to obtain the corresponding amplitude from amplitude storage unit 908, calculates the distance between the coding distortion vector and each code vector B stored in vector codebook 900 multiplied by that amplitude, and sends the distances to retrieval unit 907. Formula (18) gives the distance calculation.

Z(i) = Y(i) − C1N(i)
Em = Σ_{i=0…I−1} (Z(i) − aN·C2m(i))²    (18)

Z(i): coding distortion vector
Y(i): prediction error vector
C1N(i): decoded vector A
N: code of code vector A
Em: distance to code vector B number m
aN: amplitude corresponding to the code of code vector A
C2m(i): code vector B
m: number of code vector B
i: vector element number
I: vector length
Retrieval unit 907 compares the distances to the code vectors B and takes the number of the code vector B with minimum distance as the code of code vector B. That is, it controls vector codebook 900 and distance calculation unit 905 to find the number of the code vector B with minimum distance among all code vectors B stored in vector codebook 900, and takes that number as the code of code vector B. The codes of code vector A and code vector B are then combined to form code 909.
Retrieval unit 907 also decodes the vector from the codes of code vectors A and B, using the decoded vectors A and B obtained from vector codebooks 899 and 900, the amplitude obtained from amplitude storage unit 908, and the past decoded vector stored in state storage unit 903, and updates the contents of state storage unit 903 with the resulting vector. (The vector decoded here is therefore used for prediction at the next encoding.) Decoding in this example (prediction order 1, fixed coefficient) follows formula (19).

Z(i) = C1N(i) + aN·C2M(i) + β·D(i)    (19)

Z(i): decoded vector (used as D(i) at the next encoding)
N: code of code vector A
M: code of code vector B
C1N(i): decoded vector A
C2M(i): decoded vector B
aN: amplitude corresponding to the code of code vector A
β: prediction coefficient (scalar)
D(i): decoded vector of the previous frame
i: vector element number
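The decoding of formula (19) can be transcribed directly; the function name and the value of beta are illustrative:

```python
# Transcription of formula (19): the decoder reconstructs the vector from
# decoded vector A, the amplitude-scaled decoded vector B, and the fixed-
# coefficient prediction from the previous frame's result.

def decode_two_stage(c1_N, c2_M, a_N, d_prev, beta=0.5):
    # Z(i) = C1N(i) + aN * C2M(i) + beta * D(i)
    return [c1 + a_N * c2 + beta * d
            for c1, c2, d in zip(c1_N, c2_M, d_prev)]
```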
The amplitudes stored in amplitude storage unit 908 are set in advance, as follows. A large amount of speech data is encoded, and for each code of the first-stage code vector the total coding distortion of formula (20) is computed; the amplitudes are then trained so as to minimize this distortion.

EN = Σ_t Σ_{i=0…I−1} (Yt(i) − C1N(i) − aN·C2mt(i))²    (20)

EN: total coding distortion when the code of code vector A is N
N: code of code vector A
t: times at which the code of code vector A was N
Yt(i): prediction error vector at time t
C1N(i): decoded vector A
aN: amplitude corresponding to the code of code vector A
C2mt(i): code vector B selected at time t
mt: number of the code vector B at time t
i: vector element number
I: vector length
That is, after encoding, each amplitude is updated so that the derivative of the distortion of formula (20) with respect to that amplitude becomes 0; amplitude training is carried out in this way. The encoding and training are then repeated to obtain the optimum amplitudes.
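Setting the derivative of formula (20) with respect to aN to zero yields the closed-form update aN = Σ_t Σ_i (Yt(i) − C1N(i))·C2mt(i) / Σ_t Σ_i C2mt(i)². A sketch under assumed data layouts (the function name and argument convention are not from the source):

```python
# Hedged sketch of one amplitude update from formula (20) with dEN/daN = 0,
# recomputing the amplitude for a single first-stage code N.

def update_amplitude(residuals, stage2_vectors):
    # residuals:      list of (Yt - C1N) vectors, one per time t with code N
    # stage2_vectors: the code vector B chosen at each such time t
    num = sum(r_i * c_i
              for r, c in zip(residuals, stage2_vectors)
              for r_i, c_i in zip(r, c))
    den = sum(c_i ** 2 for c in stage2_vectors for c_i in c)
    return num / den if den else 0.0
```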
On the other hand, the decoder obtains the code vectors from the codes sent over the transmission line and performs decoding. The decoder has the same vector codebooks (for code vectors A and B), amplitude storage unit and state storage unit as the encoder, and decodes with the same algorithm as the decoding function of the retrieval unit (for code vector B) in the encoding algorithm described above.
Therefore, in this embodiment, the amplitude storage unit and the distance calculation unit allow the second-stage code vectors to adapt to the first stage, so that the coding distortion can be made small with a small amount of computation.
So far, the application of the present invention to low bit rate speech coding techniques such as those used in portable telephones has been described; however, the present invention is applicable not only to speech coding but also to the quantization of parameter vectors with good interpolation properties in music coders, image coders and the like.
(the 7th example)
The following describes the CELP type speech coder of the seventh example of the present invention. This example is a coder that can reduce the amount of computation required for code search when an ACELP type noise codebook is used.
Figure 13 shows the functional block diagram of the CELP type speech coder of this example. In this coder, filter coefficient analysis unit 1002 performs linear prediction analysis on input speech signal 1001 to obtain synthesis filter coefficients, and outputs them to filter coefficient quantifying unit 1003. Filter coefficient quantifying unit 1003 quantizes the input filter coefficients and outputs the result to synthesis filter 1004.
Synthesis filter 1004 is built from the filter coefficients supplied by filter coefficient quantifying unit 1003 and is driven by excitation signal 1011. Excitation signal 1011 is obtained by adding adaptive vector 1006, output from adaptive codebook 1005 and multiplied by adaptive gain 1007, to noise vector 1009, output from noise codebook 1008 and multiplied by noise gain 1010.
Here, adaptive codebook 1005 is a codebook storing, for each pitch period, a plurality of adaptive vectors extracted from past excitation signals to the synthesis filter, and noise codebook 1008 is a codebook storing a plurality of noise vectors. Noise codebook 1008 can use the excitation vector generator of the first example above.
Distortion calculation unit 1013 calculates the distortion between synthesized speech signal 1012, output by synthesis filter 1004 driven by excitation signal 1011, and input speech signal 1001, and carries out the code search. The code search is the process of determining the numbers of the adaptive vector 1006 and the noise vector 1009 that minimize the distortion calculated by distortion calculation unit 1013, and at the same time calculating the optimum values of the adaptive gain 1007 and noise gain 1010 by which the respective output vectors are multiplied.
Code output unit 1014 outputs the result of encoding the filter coefficient quantized values obtained from filter coefficient quantifying unit 1003, the numbers of the adaptive vector 1006 and noise vector 1009 selected in distortion calculation unit 1013, and the corresponding adaptive gain 1007 and noise gain 1010. The information output from code output unit 1014 is transmitted or stored.
The code search in distortion calculation unit 1013 usually first searches the adaptive codebook component of the excitation signal, and then searches the noise codebook component.
Orthogonal search, described below, is used for the search of this noise component.
In the orthogonal search, the noise vector c that maximizes the search reference value Eort (= Nort/Dort) of formula (21) is determined.

Eort (= Nort/Dort) = [{(p^t·H^t·H·p)·x − (x^t·H·p)·H·p}^t·H·c]² / [(c^t·H^t·H·c)·(p^t·H^t·H·p) − (p^t·H^t·H·c)²]    (21)

Nort: numerator term of Eort
Dort: denominator term of Eort
p: the adaptive vector already determined
H: synthesis filter coefficient matrix
H^t: transpose of H
x: target signal (input speech signal minus the zero-input response of the synthesis filter)
c: noise vector
The orthogonal search is a search method in which the candidate noise vectors are each orthogonalized to the previously determined adaptive vector, and the one with minimum distortion is selected from the orthogonalized noise vectors. Compared with non-orthogonal search, it is characterized by higher accuracy in determining the noise vector, and can therefore improve the quality of the synthesized speech signal.
In the ACELP scheme, a noise vector consists of only a small number of pulses with polarity. Exploiting this, the numerator term (Nort) of the search reference value of formula (21) can be transformed into formula (22), which eliminates most of the numerator computation.
Nort = {a0·ψ(l0) + a1·ψ(l1) + … + a(n−1)·ψ(l(n−1))}²    (22)

ai: polarity of the i-th pulse (+1/−1)
li: position of the i-th pulse
n: number of pulses
ψ: {(p^t·H^t·H·p)·x − (x^t·H·p)·H·p}^t·H
If the vector ψ of formula (22) is computed in advance as preprocessing and expanded into an array, the numerator term of formula (21) can be computed simply by adding, with their signs, the n elements of the array ψ at the pulse positions and squaring the result.
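The numerator shortcut of formula (22) then reduces to a signed gather and one squaring; a minimal sketch (array and function names assumed):

```python
# Sketch of the formula (22) numerator: once psi has been precomputed and
# expanded into an array, Nort is a signed gather over the pulse positions
# followed by one squaring.

def numerator_acelp(psi, positions, polarities):
    # Nort = { sum_i a_i * psi(l_i) }^2
    s = sum(a * psi[l] for a, l in zip(polarities, positions))
    return s * s
```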
The distortion calculation unit 1013 that reduces the amount of computation of the denominator term is described in detail below.
Figure 14 shows the functional block diagram of distortion calculation unit 1013. The speech coder of this example has the structure of Figure 13, with adaptive vector 1006 and noise vector 1009 input to distortion calculation unit 1013.
In Figure 14, the following three processes are carried out as preprocessing for the distortion calculation on the input noise vectors.
(1) Calculation of the first matrix (N): the power (p^t·H^t·H·p) of the vector obtained by synthesizing the adaptive vector with the synthesis filter and the autocorrelation matrix (H^t·H) of the synthesis filter coefficients are calculated, and each element of the autocorrelation matrix is multiplied by the power, giving matrix N (= (p^t·H^t·H·p)·H^t·H).
(2) Calculation of the second matrix (M): the vector obtained by synthesizing the adaptive vector with the synthesis filter is synthesized again in time-reversed order, and matrix M is calculated as the outer product of the resulting signal (r = (p^t·H^t·H)^t) with itself.
(3) Generation of the third matrix (L): matrix M calculated in (2) is subtracted from matrix N calculated in (1), giving matrix L.
The denominator term (Dort) of formula (21) can then be expanded as formula (23).

Dort = (c^t·H^t·H·c)·(p^t·H^t·H·p) − (p^t·H^t·H·c)²    (23)
     = c^t·N·c − (r^t·c)²
     = c^t·N·c − (r^t·c)^t·(r^t·c)
     = c^t·N·c − c^t·r·r^t·c
     = c^t·N·c − c^t·M·c
     = c^t·(N − M)·c
     = c^t·L·c

N: (p^t·H^t·H·p)·H^t·H ← preprocessing (1) above
r: (p^t·H^t·H)^t ← preprocessing (2) above
M: r·r^t ← preprocessing (2) above
L: N − M ← preprocessing (3) above
c: noise vector
Thus, by replacing the computation of the denominator term (Dort) of the search reference value (Eort) of formula (21) with formula (23), the noise codebook component can be determined with less computation.
The denominator term is calculated from the matrix L obtained by the above preprocessing and noise vector 1009.
Here, for simplicity, the denominator computation based on formula (23) is illustrated for the case where the sampling frequency of the input speech signal is 8000 Hz, the unit time width (frame length) of the algebraic noise codebook search is 10 ms, and the noise vectors are generated by algebraic combination of 5 unit pulses (+1/−1) per 10 ms.
If the 5 unit pulses constituting a noise vector are each taken from one of the position groups 0 to 4 defined in Table 2, a candidate noise vector can be written as formula (24).
c = a0·δ(k − l0) + a1·δ(k − l1) + … + a4·δ(k − l4)    (24)
(k = 0, 1, …, 79)

ai: polarity (+1/−1) of the pulse of group i
li: position of the pulse of group i
Table 2

Group number | Polarity | Candidate pulse positions
0            | ±1       | 0, 10, 20, 30, …, 60, 70
1            | ±1       | 2, 12, 22, 32, …, 62, 72
2            | ±1       | 6, 16, 26, 36, …, 66, 76
3            | ±1       | 4, 14, 24, 34, …, 64, 74
4            | ±1       | 8, 18, 28, 38, …, 68, 78
In this case, the denominator term (Dort) of formula (23) can be obtained with formula (25).

Dort = Σ_{i=0…4} Σ_{j=0…4} ai·aj·L(li, lj)    (25)

ai: polarity of the pulse of group i
li: position of the pulse of group i
L(li, lj): element of matrix L at row li, column lj
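The denominator path of formulas (23) and (25) can be sketched as follows; the toy inputs and function names are assumptions, not the embodiment's exact interfaces:

```python
# Hedged sketch of the denominator shortcut: build L = N - M once per frame
# (formula (23)), then evaluate Dort for each algebraic noise vector by
# formula (25) with a handful of table look-ups.

def build_L(HtH, p_power, r):
    # N = (p^t H^t H p) * H^t H,  M = r r^t,  L = N - M
    size = len(r)
    return [[p_power * HtH[i][j] - r[i] * r[j] for j in range(size)]
            for i in range(size)]

def denominator_acelp(L_mat, positions, polarities):
    # Dort = sum_i sum_j a_i * a_j * L(l_i, l_j)
    return sum(ai * aj * L_mat[li][lj]
               for ai, li in zip(polarities, positions)
               for aj, lj in zip(polarities, positions))
```

For 5 pulses this costs 25 look-ups per candidate instead of a full quadratic form over the 80-sample frame.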
From the above, when an ACELP type noise codebook is used, the numerator term (Nort) of the code search reference value of formula (21) can be computed with formula (22), and the denominator term (Dort) with formula (25). Therefore, instead of evaluating formula (21) as it stands, the numerator and denominator terms are computed separately with formulas (22) and (25), which greatly reduces the amount of computation for the code search.
The example described above concerns noise codebook search without preselection. However, the present invention can also be applied, with the same effect, to the case where preselection retains the noise vectors giving large values of formula (22), formula (21) is then computed for the preselected candidates, and the noise vector maximizing that value is selected.

Claims (32)

1. An excitation vector generator for generating an excitation vector, characterized by comprising:
pulse vector generating means having N channels (N ≥ 1) for generating pulse vectors;
storing means for storing M kinds (M ≥ 1) of dispersion patterns per channel, corresponding to the N channels;
selecting means for selectively taking out a dispersion pattern from said storing means for each channel;
dispersing means for performing, for each channel, a superimposing calculation of the taken-out dispersion pattern and the generated pulse vector, thereby generating N dispersion vectors;
excitation vector generating means for generating an excitation vector from the N generated dispersion vectors.
2. The excitation vector generator as claimed in claim 1, characterized in that said pulse vector generating means generates the N pulse vectors algebraically.
3. The excitation vector generator as claimed in claim 1, characterized in that said dispersing means generates the dispersion vector for each channel by the superimposing calculation of the following formula:

ci(n) = Σ_{k=0…L−1} wij(n − k)·di(k)

where n: 0 to L−1
L: dispersion vector length
i: channel number
j: dispersion pattern number (j = 1 to M)
ci: dispersion vector of channel i
wij: j-th dispersion pattern of channel i
di: pulse vector of channel i, di = ±δ(n − pi), n = 0 to L−1
pi: candidate pulse position of channel i
4. The excitation vector generator as claimed in claim 1, characterized in that said excitation vector generating means generates one excitation vector from the N dispersion vectors according to the following formula:

c(n) = Σ_{i=1…N} ci(n)

c: excitation vector
ci: dispersion vector
i: channel number (i = 1 to N)
n: vector element number (n = 0 to L−1, where L is the excitation vector length)
5. A CELP type speech coder for encoding speech information, characterized by comprising:
a noise codebook comprising the excitation vector generator of claim 1, used for vector quantization of noise excitation information;
a synthesis filter which generates synthesized speech using the excitation vector output by said excitation vector generator as a noise code vector;
a distortion calculator which calculates the quantization distortion between the generated synthesized speech and the input speech;
means for switching the combination of the pulse positions and pulse polarities of the pulses constituting the pulse vectors and the dispersion patterns;
means for determining the combination of pulse positions, pulse polarities and dispersion patterns that minimizes the quantization distortion calculated by said distortion calculator, and for producing a noise code number.
6. The CELP type speech coder as claimed in claim 5, characterized in that dispersion patterns trained in advance so as to reduce the quantization distortion produced when the noise excitation information is vector quantized are stored in the storing means in said excitation vector generator.
7. The CELP type speech coder as claimed in claim 6, characterized in that each channel of the storing means in said excitation vector generator stores at least one kind of dispersion pattern obtained by training.
8. The CELP type speech coder as claimed in claim 7, characterized in that a dispersion pattern obtained by training is selected when the ideal adaptive code gain value calculated at the time of vector quantization of the adaptive excitation information is greater than a preset threshold.
9. The CELP type speech coder as claimed in claim 7, characterized in that a dispersion pattern obtained by training is selected when the decoded adaptive excitation gain value of the previous frame is greater than a preset threshold.
10. The CELP type speech coder as claimed in claim 5, characterized in that, in each channel of the storing means in said excitation vector generator, at least one kind of dispersion pattern is a random pattern formed from a random number vector sequence.
11. The CELP type speech coder as claimed in claim 5, characterized in that, in each channel of the storing means in said excitation vector generator, at least one kind of dispersion pattern is a pattern trained in advance so as to reduce the quantization distortion produced when the noise excitation information is vector quantized, and at least one kind is a random pattern.
12. The CELP type speech coder as claimed in claim 11, characterized in that a dispersion vector of the random pattern is selected when the coding distortion power produced when the adaptive code is determined is greater than a preset threshold.
13. The CELP type speech coder as claimed in claim 5, characterized in that, out of the M^N possible combinations of dispersion patterns, the combination of dispersion patterns selected for the channels is specified by the combination number that minimizes the quantization distortion produced when the noise excitation information is vector quantized.
14. The CELP type speech coder as claimed in claim 13, characterized in that combinations of dispersion patterns are preselected using speech parameters obtained in advance, and the combination of dispersion patterns selected for the channels is specified, among the preselected combinations, by the combination number that minimizes the quantization distortion produced when the noise excitation information is vector quantized.
15. The CELP type speech coder as claimed in claim 14, characterized in that the combination of preselected dispersion patterns is switched according to the result of analyzing the speech interval.
16. The CELP type speech coder as claimed in claim 5, characterized by comprising:
target extracting means for calculating a quantization target vector using the parameter vector obtained by analyzing the frame to be encoded, the parameter vector obtained from the frame following the frame to be encoded, and the decoded vector of the frame preceding the frame to be encoded;
vector quantizing means for encoding the calculated quantization target vector to obtain the code of the frame to be encoded.
17. The CELP type speech coder as claimed in claim 16, characterized in that said target extracting means calculates the quantization target vector according to the following formula:

X(i) = {St(i) + p·(d(i) + St+1(i))/2} / (1 + p)

X(i): quantization target vector
i: vector element number
St(i), St+1(i): parameter vectors
t: time (frame number)
p: weighting coefficient (fixed)
d(i): decoded vector of the previous frame
18. The CELP type speech coder as claimed in claim 16, characterized by comprising:
means for decoding the code of the frame to be encoded to generate a decoded vector;
a second distortion calculator which calculates the coding distortion from said decoded vector and the parameters of said frame to be encoded;
vector smoothing means for performing smoothing on the parameter vector of the frame to be encoded that is supplied to said target extracting means, when said coding distortion is equal to or greater than a reference value.
19. The CELP type speech coder as claimed in claim 18, characterized in that said second distortion calculator calculates the perceptually weighted coding distortion according to the following formula:

Ew = Σ_i [(V(i) − St(i))² + p·{V(i) − (d(i) + St+1(i))/2}²]

Ew: perceptually weighted coding distortion
St(i), St+1(i): parameter vectors
t: time (frame number)
i: vector element number
V(i): decoded vector
p: weighting coefficient
d(i): decoded vector of the previous frame
20. The CELP type speech coder as claimed in claim 16, characterized in that said vector quantizing means comprises:
a plurality of codebooks, provided one per stage of multistage vector quantization, each storing a plurality of code vectors;
means for calculating the distance between the quantization target vector, or its prediction error vector, and each code vector stored in the first-stage codebook, to obtain the first-stage code;
an amplitude storage unit which stores amplitudes, expressed as scalars, corresponding to the code vectors stored in the first-stage codebook;
multiplying means for taking out from said amplitude storage unit, before the second-stage encoding, the amplitude belonging to the first-stage code, and multiplying the code vectors stored in the second-stage codebook by it;
means for calculating the distance, using the decoded vector obtained by decoding the first-stage code, against the code vectors of the second codebook after multiplication by the amplitude, to obtain the second-stage code.
21. The CELP type speech coder as claimed in claim 5, characterized in that said CELP type speech coder further has an adaptive codebook storing adaptive vectors expressing the pitch component of the input speech, and
said distortion calculator comprises:
means for calculating the power of the signal obtained by synthesizing said adaptive vector with said synthesis filter and the autocorrelation matrix of the filter coefficients of said synthesis filter, and for calculating a first matrix by multiplying each element of said autocorrelation matrix by said power,
means for synthesizing, in time-reversed order, the signal obtained by synthesizing said adaptive vector with said synthesis filter, and for calculating a second matrix as the outer product of the time-reversed synthesized signal with itself, and
means for taking the difference between said first matrix and said second matrix to generate a third matrix;
said distortion calculator calculating the distortion using said third matrix.
22. A CELP type speech decoder for decoding speech information, characterized by comprising:
a noise codebook comprising the excitation vector generator of claim 1, which selects dispersion patterns and generates pulse vectors according to the dispersion pattern combination number and the pulse vector combination number specified by the noise code number;
a synthesis filter which generates synthesized speech using the excitation vector output by said excitation vector generator as a noise code vector.
23. The CELP type speech decoder as claimed in claim 22, characterized in that dispersion patterns trained in advance so as to reduce the quantization distortion produced when the noise excitation information is vector quantized are stored in the storing means in said excitation vector generator.
24. The CELP type speech decoder as claimed in claim 23, characterized in that each channel of the storing means in said excitation vector generator stores at least one kind of dispersion pattern obtained by training.
25. The CELP type speech decoder as claimed in claim 22, characterized in that, in each channel of the storing means in said excitation vector generator, at least one kind of dispersion pattern is a random pattern formed from a random number vector sequence.
26. The CELP type speech decoder as claimed in claim 22, characterized in that, in each channel of the storing means in said excitation vector generator, at least one kind of dispersion pattern is a pattern trained in advance so as to reduce the quantization distortion produced when the noise excitation information is vector quantized, and at least one kind is a random pattern.
27. An excitation vector generation method for generating an excitation vector, characterized by comprising:
a step of generating pulse vectors for N channels (N ≥ 1);
a step of selectively taking out, for each channel, a dispersion pattern from storing means which stores M kinds (M ≥ 1) of dispersion patterns per channel corresponding to the N channels;
a step of performing, for each channel, a superimposing calculation of the taken-out dispersion pattern and the generated pulse vector, to generate N dispersion vectors;
a step of generating an excitation vector from the N generated dispersion vectors.
28. A speech coding method for encoding speech information in the CELP mode, characterized by comprising:
a step of generating, with the excitation vector generator of claim 1, noise code vectors used for vector quantization of noise excitation information;
a step of generating synthesized speech using the excitation vector output by said excitation vector generator as a noise code vector;
a step of calculating the quantization distortion between the generated synthesized speech and the input speech;
a step of switching the combination of the pulse positions and pulse polarities of the pulses constituting the pulse vectors and the dispersion patterns;
a step of determining the combination of pulse positions, pulse polarities and dispersion patterns that minimizes the quantization distortion.
29. A speech decoding method for decoding speech information encoded in the CELP scheme, characterized by comprising:
a step of generating a noise code vector using the excitation vector generator as claimed in claim 1;
a step of generating synthesized speech using the excitation vector output from said excitation vector generator as the noise code vector.
30. A vector quantization method for quantizing an input vector, characterized by comprising:
a step of calculating a quantization target vector from an input vector composed of a plurality of temporally consecutive vectors and from past decoded vectors;
a step of encoding said quantization target vector to obtain its code and, at the same time, decoding said code to obtain a decoded vector;
a step of calculating distortion from the obtained decoded vector and said input vector;
a step of determining the code that minimizes said distortion;
a step of storing decoded vectors;
a step of updating the stored decoded vectors using the decoded vector corresponding to the finally determined code.
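The steps of claim 30 can be sketched as a predictive vector quantizer. The one-tap predictor coefficient, the residual codebook, and the two-dimensional example vectors below are all illustrative assumptions, not values from the patent:

```python
import numpy as np

class PredictiveVQ:
    """Predictive VQ: quantize against a prediction from the past decoded vector."""

    def __init__(self, codebook, alpha=0.5):
        self.codebook = np.asarray(codebook, dtype=float)  # illustrative residual codebook
        self.alpha = alpha                                  # assumed predictor coefficient
        self.memory = np.zeros(self.codebook.shape[1])      # stored past decoded vector

    def quantize(self, x):
        x = np.asarray(x, dtype=float)
        target = x - self.alpha * self.memory               # quantization target vector
        # Distortion of each decoded candidate vs. the input equals the
        # distortion of each codeword vs. the target.
        dist = np.sum((self.codebook - target) ** 2, axis=1)
        code = int(np.argmin(dist))                         # code minimizing distortion
        decoded = self.alpha * self.memory + self.codebook[code]
        self.memory = decoded                               # update with final decoded vector
        return code, decoded

vq = PredictiveVQ([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
code1, dec1 = vq.quantize([1.0, 1.0])
code2, dec2 = vq.quantize([1.5, 1.5])
```

Because the memory is updated only with the decoded vector of the finally selected code, encoder and decoder stay in sync as long as the codes are transmitted without error.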
31. A communication apparatus, characterized by comprising the CELP speech encoder as claimed in claim 5.
32. A communication apparatus, characterized by comprising the CELP speech decoder as claimed in claim 22.
CNB988015560A 1997-02-13 1998-10-22 Sound encoder and sound decoder Expired - Lifetime CN100367347C (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
JP9028941A JPH10228491A (en) 1997-02-13 1997-02-13 Logic verification device
JP289412/1997 1997-10-22
JP289412/97 1997-10-22
JP295130/1997 1997-10-28
JP295130/97 1997-10-28
JP85717/98 1998-03-31
JP85717/1998 1998-03-31

Related Child Applications (9)

Application Number Title Priority Date Filing Date
CNB2005100062028A Division CN100349208C (en) 1997-10-22 1998-10-22 Speech coder and speech decoder
CN2006100048275A Division CN1808569B (en) 1997-10-22 1998-10-22 Voice encoding device,orthogonalization search method, and celp based speech coding method
CN200710307317XA Division CN101202046B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder
CN2007103073184A Division CN101202047B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder
CN2007103073150A Division CN101202044B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder
CN2007101529987A Division CN101174413B (en) 1997-10-22 1998-10-22 Sound signal encoder and sound signal decoder
CN2007101529972A Division CN101174412B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder
CN2007103073381A Division CN101221764B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder
CN2007103073165A Division CN101202045B (en) 1997-10-22 1998-10-22 Sound encoder and sound decoder

Publications (2)

Publication Number Publication Date
CN1242860A true CN1242860A (en) 2000-01-26
CN100367347C CN100367347C (en) 2008-02-06

Family

ID=12262442

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB988015560A Expired - Lifetime CN100367347C (en) 1997-02-13 1998-10-22 Sound encoder and sound decoder

Country Status (2)

Country Link
JP (1) JPH10228491A (en)
CN (1) CN100367347C (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7761822B2 (en) * 2007-03-19 2010-07-20 Fujitsu Limited File information generating method, file information generating apparatus, and storage medium storing file information generation program
JP5229119B2 (en) 2009-06-10 2013-07-03 富士通株式会社 Model generation program, method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2956068B2 (en) * 1989-04-21 1999-10-04 日本電気株式会社 Audio encoding / decoding system
JP2946525B2 (en) * 1989-04-25 1999-09-06 日本電気株式会社 Audio coding method
JP3024455B2 (en) * 1992-09-29 2000-03-21 三菱電機株式会社 Audio encoding device and audio decoding device
JP3223943B2 (en) * 1994-06-16 2001-10-29 日本電信電話株式会社 Vector code decoding method
JPH09160536A (en) * 1995-12-08 1997-06-20 Columbia Onkyo Kogyo Kk Keyed instrument

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101847414A (en) * 2003-12-19 2010-09-29 摩托罗拉公司 The method and apparatus that is used for voice coding
CN101847414B (en) * 2003-12-19 2016-08-17 谷歌技术控股有限责任公司 Method and apparatus for voice coding
CN102598124A (en) * 2009-10-30 2012-07-18 松下电器产业株式会社 Encoder, decoder and methods thereof
CN102598124B (en) * 2009-10-30 2013-08-28 松下电器产业株式会社 Encoder, decoder and methods thereof
US8849655B2 (en) 2009-10-30 2014-09-30 Panasonic Intellectual Property Corporation Of America Encoder, decoder and methods thereof
CN116781180A (en) * 2023-06-05 2023-09-19 广州市高科通信技术股份有限公司 PCM channel capacity expansion method and capacity expansion system
CN116781180B (en) * 2023-06-05 2023-11-10 广州市高科通信技术股份有限公司 PCM channel capacity expansion method and capacity expansion system
CN117577121A (en) * 2024-01-17 2024-02-20 清华大学 Diffusion model-based audio encoding and decoding method and device, storage medium and equipment
CN117577121B (en) * 2024-01-17 2024-04-05 清华大学 Diffusion model-based audio encoding and decoding method and device, storage medium and equipment

Also Published As

Publication number Publication date
JPH10228491A (en) 1998-08-25
CN100367347C (en) 2008-02-06

Similar Documents

Publication Publication Date Title
CN1632864A (en) Speech coder and speech decoder
CN1296888C (en) Voice encoder and voice encoding method
CN1223994C (en) Sound source vector generator, voice encoder, and voice decoder
CN1286086C (en) Variable speed sounder
CN1156822C (en) Audio signal coding and decoding method and audio signal coder and decoder
CN1131507C (en) Audio signal encoding device, decoding device and audio signal encoding-decoding device
CN1163870C (en) Voice encoder, voice decoder, recording medium on which program for realizing voice encoding/decoding is recorded and mobile communication apparatus
CN1160703C (en) Speech encoding method and apparatus, and sound signal encoding method and apparatus
CN1245706C (en) Multimode speech encoder
CN1265355C (en) Sound source vector generator and device encoder/decoder
CN1338096A (en) Adaptive windows for analysis-by-synthesis CELP-type speech coding
CN1898724A (en) Voice/musical sound encoding device and voice/musical sound encoding method
CN1808569A (en) Voice encoding device,orthogonalization search, and celp based speech coding
CN1216367C (en) Data processing device
CN1242860A (en) Sound encoder and sound decoder
CN1877698A (en) Excitation vector generator, speech coder and speech decoder
CN1873779A (en) Speech coding/decoding apparatus and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1025417

Country of ref document: HK

ASS Succession or assignment of patent right

Owner name: INTELLECTUAL PROPERTY BRIDGE NO. 1 CO., LTD.

Free format text: FORMER OWNER: MATSUSHITA ELECTRIC INDUSTRIAL CO, LTD.

Effective date: 20140606

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20140606

Address after: Tokyo, Japan

Patentee after: GODO KAISHA IP BRIDGE 1

Address before: Kadoma City, Osaka, Japan

Patentee before: Matsushita Electric Industrial Co., Ltd.

CX01 Expiry of patent term

Granted publication date: 20080206

CX01 Expiry of patent term