WO2009059513A1 - Encoding method, encoder, and computer-readable medium - Google Patents
- Publication number
- WO2009059513A1 PCT/CN2008/072371
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- search
- type
- codebook
- pulses
- code book
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0013—Codebook search algorithms
Definitions
- the present invention relates to vector coding techniques, and more particularly to an encoding method, an encoder, and a computer readable medium.
- the residual signal is typically quantized using a fixed codebook search.
- A commonly used fixed codebook is the algebraic codebook.
- Algebraic codebooks focus on the pulse positions of the target signal. The pulse amplitude defaults to 1, so only the sign and position of each pulse need to be quantized; different amplitudes can be represented by superimposing multiple pulses at the same position.
- A key point in quantization coding with an algebraic codebook is searching for the position of each pulse of the best algebraic codebook corresponding to the target signal.
- depth-first tree search (Depth-First Tree Search Procedure)
- The number of pulses to be searched varies with the code rate; assume it is N. Without further restrictions, searching for N pulses among 64 positions is computationally too complex. The pulse positions of the algebraic codebook are therefore constrained: the 64 positions are divided into M tracks. A typical track division is shown in Table 1.
- T3: 3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63
- "T0" to "T3" are the 4 tracks.
- "Positions" are the numbers of the locations included on each track.
- 64 positions are divided into 4 tracks, each track has 16 positions, and the pulse positions of the 4 tracks are interlaced to maximize the combination of various pulse positions.
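The interleaved track layout described above can be sketched in Python. The helper name `build_tracks` is hypothetical, and the layout of T0 to T2 is an assumption extrapolated from the T3 row and from the statement that the four tracks are interlaced.

```python
# Sketch only: assumes track Tk holds positions k, k + M, k + 2M, ...
def build_tracks(num_positions=64, num_tracks=4):
    """Return one list of positions per track, interleaved across tracks."""
    return [list(range(k, num_positions, num_tracks)) for k in range(num_tracks)]

tracks = build_tracks()
for k, positions in enumerate(tracks):
    print(f"T{k}:", positions)
```

Under this assumption every position belongs to exactly one track, so a pulse position also identifies its track as `position % 4`.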
- Taking N = 4 as an example, one pulse is searched on each track; other cases can be handled analogously.
- the first level search is performed on T0-T1, T2-T3.
- P1 searches among the 16 positions of the track T1; the optimum positions of P0 and P1 are determined from the searched 4x16 kinds of position combinations according to the set evaluation criteria (for example, the cost function Qk).
- the positions of P2 and P3 are then searched on T2-T3, where P2 searches at 8 of the 16 positions of track T2, which are determined by the extrema of the known reference signal on the track.
- P3 searches through 16 positions of the track T3, and finally determines the best positions of P2 and P3 to complete the search of this level.
- the second level search is performed on T1-T2, T3-T0, and the process is similar to the first level search.
- the third level search is also performed on T2-T3, T0-T1, and the fourth level search is performed on T3-T0, T1-T2.
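The search effort of this procedure can be tallied with a short Python sketch. It assumes that every level evaluates the same number of position combinations as the first level described above (4x16 for the first track pair and 8x16 for the second); the function name is hypothetical.

```python
# Sketch only: counts evaluated position combinations, assuming all four
# levels have the same cost as the first level described in the text.
def depth_first_search_count(levels=4, first_pair=(4, 16), second_pair=(8, 16)):
    per_level = first_pair[0] * first_pair[1] + second_pair[0] * second_pair[1]
    return levels * per_level

print(depth_first_search_count())
```

With these assumptions each level costs 64 + 128 = 192 evaluations, which agrees with the total of 768 searches quoted for the depth-first tree search later in this description.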
- The codebook structure used is the same as in algorithm 1 above, and one pulse must again be searched on each of the four tracks; the pulses searched on T0 to T3 are P0 to P3, respectively.
- The largest of the four maximum Qk values obtained in the above process is taken as the global optimum, and the corresponding codebook is used as the best codebook of the current round of search, assumed here to be {20, 21, 42, 7}.
- The codebook search algorithms used in existing coding techniques struggle to deliver both low computational complexity and good performance.
- Although the depth-first tree search algorithm obtains good speech quality at various code rates, it requires more searches and has higher computational complexity.
- Although the global pulse replacement method has low computational complexity, it easily falls into a local maximum and its performance is unstable: quality is good for some signals and poor for others.
- An encoding method comprising: acquiring a characteristic parameter of an input signal; determining a type of the input signal according to the characteristic parameter; obtaining a vector to be quantized according to the characteristic parameter; and performing a codebook search on the vector to be quantized using the codebook search algorithm that corresponds to the determined type of the input signal.
- An encoder includes: a feature parameter acquiring unit, configured to acquire a feature parameter of an input signal; a signal type determining unit, configured to determine a type of the input signal according to the feature parameter; a vector generating unit, configured to generate a vector to be quantized according to the feature parameter; and a determining unit, configured to select, according to the type of the input signal determined by the signal type determining unit, a corresponding codebook search algorithm to perform a codebook search on the vector to be quantized.
- A computer readable storage medium comprising computer program code which, when executed by a computer unit, causes the computer unit to: obtain a characteristic parameter of an input signal; determine a type of the input signal based on the characteristic parameter; obtain a vector to be quantized according to the characteristic parameter; and, according to the determined type of the input signal, perform a codebook search on the vector to be quantized using a corresponding codebook search algorithm.
- The above encoding method or apparatus selects different codebook search algorithms according to different input signal types. Because the search algorithm can be matched to the characteristics of the input signal, signal types that yield satisfactory results with simple calculation can be paired with a search algorithm that suits them and has low computational complexity, achieving good performance with fewer system resources; at the same time, signal types that require more complex calculation can be processed by higher-quality search algorithms, ensuring the quality of the coding.
- FIG. 1 is a schematic diagram of a conventional depth-first tree search method
- FIG. 2 is a schematic flow chart of an embodiment of an encoding method of the present invention.
- FIG. 3 is a schematic diagram showing the logical structure of an embodiment of an encoder of the present invention.
- FIG. 4 is a schematic flow chart of a first embodiment of a codebook search algorithm according to the present invention.
- FIG. 5 is a schematic flow chart of a second embodiment of a codebook search algorithm according to the present invention.
- FIG. 6 is a schematic flowchart of a third embodiment of a codebook search algorithm according to the present invention.
- FIG. 7 is a schematic flow chart of a fourth embodiment of a codebook search algorithm according to the present invention.
- FIG. 8 is a schematic flow chart of Embodiment 5 of the codebook search algorithm of the present invention.
- Embodiments of the present invention provide an encoding method for selecting different codebook search algorithms according to different input signal types.
- the embodiment of the invention also provides a corresponding encoder.
- The method and apparatus of the embodiments of the present invention are described in detail below.
- an embodiment of the encoding method of the present invention includes the steps of:
- Step 1. Obtain the characteristic parameters of the input signal.
- the input signal encoded in this embodiment may be an adaptively filtered residual signal based on the CELP model, and similar other speech or tone signals suitable for vector quantization coding.
- the so-called characteristic parameter is data used to describe the characteristics of a certain aspect of the input signal.
- the feature parameters are usually analyzed and extracted in units of frames, and the frame size can be selected according to application needs and signal characteristics.
- The selectable characteristic parameters include, but are not limited to, linear prediction coefficients (LPC: Linear Prediction Coefficient), linear prediction cepstrum coefficients (LPCC), pitch period parameters, frame energy, average zero-crossing rate, and the like.
- Step 2. Determine the type of the input signal according to the characteristic parameters of the input signal.
- the input signal can be classified based on different judgment methods, for example, by using different characteristic parameters or combinations of characteristic parameters.
- the basis of the judgment, or the setting of the different feature parameter thresholds in the judgment, etc. is not limited in this embodiment, and may be set according to the actual application.
- a feasible classification method is to determine the specific feature parameters and the classification criteria of the classification based on the characteristics of the candidate search algorithm.
- For example, characteristic parameters that reflect the periodicity of the input signal can serve as the classification basis: input signals are divided into a type having a periodic characteristic and a type having a white noise characteristic, a lower-complexity search algorithm is used for signals with a periodic characteristic, and a higher-complexity search algorithm is used for signals with a white noise characteristic.
- the input signal is divided into four different frame types, namely, an unvoiced frame, a voiced frame, a general frame, and a transition frame, wherein the voiced frame and the transition frame can also be combined into one type.
- the unvoiced frame and the general frame belong to a type having a white noise characteristic
- the voiced frame and the transition frame belong to a type having a periodic feature.
- Pitch period parameters, such as the average magnitude difference function (AMDF: Average Magnitude Difference Function), can be used to evaluate the periodicity of the input signal and so make an initial distinction between the type with periodic features and the type with white noise characteristics.
- The average zero-crossing rate can also be used, alone or as an additional criterion; the average zero-crossing rate of periodic signals is lower than that of white noise signals;
- the frame energy can be used to determine the unvoiced frame and the general frame.
- The frame energy of the unvoiced frame is lower than that of the general frame, so a threshold can be set to decide between them;
- The AMDF can be analyzed further to distinguish voiced frames from transition frames, or subdivided ranges of the average zero-crossing rate can be used for the same purpose; if voiced frames and transition frames are merged into one type, no such subdivision is needed.
- The above class division and decision methods are only examples.
- Appropriate feature parameters and decision sequences can be selected according to application requirements and signal characteristics. For example, classification can first be performed according to frame energy, and parameters describing the signal structure can then be used for further subdivision.
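A minimal Python sketch of such a decision sequence follows. All function names, the lag range, and both thresholds are hypothetical values chosen for illustration; the text does not specify them. The sketch uses an AMDF-based periodicity measure first and frame energy second, and it merges voiced and transition frames into one periodic type.

```python
import math
import random

def frame_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def amdf(frame, lag):
    """Average magnitude difference between the frame and its lagged copy."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def periodicity(frame, min_lag=20, max_lag=120):
    """Deepest AMDF valley divided by the mean AMDF; small means periodic."""
    values = [amdf(frame, lag) for lag in range(min_lag, max_lag + 1)]
    mean = sum(values) / len(values)
    return min(values) / mean if mean > 0 else 1.0

def classify_frame(frame, periodic_thresh=0.3, energy_thresh=0.01):
    # Thresholds are illustrative assumptions, not values from the text.
    if periodicity(frame) < periodic_thresh:
        return "periodic"      # voiced and/or transition frame
    if frame_energy(frame) < energy_thresh:
        return "unvoiced"      # white-noise-like, low energy
    return "general"           # white-noise-like, higher energy

rng = random.Random(0)
sine = [math.sin(2 * math.pi * i / 40) for i in range(256)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(256)]
quiet_noise = [0.01 * x for x in noise]
print(classify_frame(sine), classify_frame(noise), classify_frame(quiet_noise))
```

The average zero-crossing rate could be added as a further criterion in the same way; it is left out here to keep the sketch short.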
- Step 3. Generate a vector to be quantized according to the characteristic parameter of the input signal.
- Step 3 can be carried out with existing methods. There is no required ordering between step 3 and step 2; they may be executed one after the other or in parallel.
- Step 4. According to the determined type of the input signal, select the corresponding codebook search algorithm to perform a codebook search on the vector to be quantized.
- a codebook search algorithm suitable for its characteristics can be configured for various types of signals.
- a codebook search algorithm with higher complexity and better performance is used for the unvoiced frame signal, such as a random codebook search algorithm or a depth-first tree search algorithm described in the background art;
- a codebook search algorithm with higher complexity and better performance is used for the general frame, such as the depth-first tree search algorithm described in the background art;
- A less complex codebook search algorithm is used for the voiced frame and/or the transition frame signal, for example a codebook search algorithm based on pulse position replacement, which may specifically be the global pulse replacement algorithm described in the background art; if the voiced frame and the transition frame are treated as two different signal types, different codebook search algorithms can also be configured for each.
- Once the codebook search algorithm has been determined, the codebook search is performed with it.
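Step 4 amounts to a lookup from signal type to search routine, which can be sketched in Python as below. The three search functions are hypothetical stubs that only report which algorithm was chosen; real implementations would return the selected codebook.

```python
# Sketch only: stubs stand in for the actual search algorithms.
def random_codebook_search(vector):
    return ("random_codebook", list(vector))

def depth_first_tree_search(vector):
    return ("depth_first_tree", list(vector))

def pulse_replacement_search(vector):
    return ("pulse_replacement", list(vector))

# Mapping follows the example configuration in the text.
SEARCH_BY_TYPE = {
    "unvoiced": random_codebook_search,
    "general": depth_first_tree_search,
    "voiced": pulse_replacement_search,
    "transition": pulse_replacement_search,
}

def codebook_search(signal_type, vector):
    """Dispatch the vector to be quantized to the algorithm for its type."""
    return SEARCH_BY_TYPE[signal_type](vector)

print(codebook_search("voiced", [0.2, -0.1, 0.4]))
```

Keeping the mapping in a table makes it easy to reconfigure, for instance to give voiced and transition frames separate algorithms.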
- An embodiment of the encoder of the present invention, described in detail with reference to FIG. 3, includes:
- the feature parameter obtaining unit 101 is configured to acquire a feature parameter of the input signal.
- the signal type determining unit 102 is configured to determine the type of the input signal according to the feature parameter provided by the feature parameter acquiring unit 101.
- the vector generating unit 103 is configured to generate a vector to be quantized according to the feature parameter provided by the feature parameter acquiring unit 101.
- At least two codebook search units are included (this embodiment takes a plurality of codebook search units 1 to n as an example, jointly labeled 104 in FIG. 3), and each codebook search unit provides a different codebook search algorithm (for example, codebook search unit 1 provides a depth-first tree search algorithm and codebook search unit 2 provides a codebook search algorithm based on pulse position replacement).
- The determining unit 105 is configured to select, according to the type of the input signal determined by the signal type determining unit 102, a corresponding codebook search algorithm (in this embodiment, by selecting among the codebook search units 104) to perform a codebook search on the vector to be quantized generated by the vector generating unit 103.
- For example, if the decision unit 105 determines that the input signal is of the type having a periodic feature, codebook search unit 2 is selected to perform the codebook search; if the decision unit 105 determines that the input signal is of the type having a white noise characteristic, codebook search unit 1 is selected to perform the codebook search.
- The at least two codebook search units are optional. In that case, the determining unit is configured to select the corresponding codebook search algorithm according to the type of the input signal determined by the signal type determining unit and to perform a codebook search on the vector to be quantized.
- the type of the input signal determined by the signal type determining unit 102 may include a type having a periodic feature and a type having a white noise characteristic;
- The codebook search unit 104 may include a first type codebook search unit and a second type codebook search unit, wherein the codebook search algorithm provided by the first type codebook search unit has lower computational complexity than the codebook search algorithm provided by the second type codebook search unit; the function of the decision unit 105 is specifically to select the first type codebook search unit for the type having the periodic feature and the second type codebook search unit for the type having the white noise feature.
- The type having the white noise characteristic determined by the signal type determining unit 102 may be subdivided into an unvoiced frame and a general frame; the determined type having periodic features may include a voiced frame and/or a transition frame;
- The second type codebook search unit in the codebook search unit 104 may include a random codebook search unit, configured to provide a random codebook search algorithm, and a depth-first search unit, configured to provide a depth-first tree search algorithm; the first type codebook search unit in the codebook search unit 104 may include a pulse replacement search unit, configured to provide a codebook search algorithm based on pulse position replacement;
- The function of the decision unit 105 is specifically to select the depth-first search unit for the general frame and/or the unvoiced frame, and to select the pulse replacement search unit for the voiced frame and/or the transition frame.
- The encoding method and apparatus embodiments described above select different codebook search algorithms based on different input signal types. Because the search algorithm can be matched to the structural characteristics of the input signal, signal types that yield satisfactory results with simple calculation can be paired with a search algorithm that suits them and has low computational complexity, achieving good performance with fewer system resources; at the same time, signal types that require more complex calculation can be processed by higher-quality search algorithms, ensuring the quality of the coding.
- a codebook search algorithm based on pulse position replacement is presented, which can be used as a codebook search algorithm with lower complexity and higher performance in the coding technique of the present invention.
- Codebook search algorithm embodiment 1, referring to FIG. 4, includes the steps:
- A1 Acquire a basic codebook, where the basic codebook includes position information of N pulses on M tracks, and N and M are positive integers.
- The basic codebook referred to here is the initial codebook that serves as the basis for a round of search. Generally, before the algebraic codebook pulse position search, the distribution of the number of pulses to be searched on the respective tracks has already been determined from information such as the code rate.
- obtaining the base code book is to obtain the initial position of each pulse on each track.
- the initial position of the pulse can be determined in various ways, and the embodiment of the code search algorithm is not limited. For example, you can:
- an optional reference signal is a "pulse position maximum likelihood function" (also called a pulse amplitude selection signal), and the function can be expressed as:
- d(i) is each dimensional component of the vector signal d determined by the target signal to be quantized, and can generally be expressed as the convolution of the target signal with the impulse response of the pre-filtered weighted synthesis filter; r_LTP(i) is each dimensional component of the long-term prediction residual signal r; E_d is the energy of the signal d; E_r is the energy of the signal r; a is a scaling factor that controls the degree of dependence on the reference signal d(i) and can take different values for different code rates.
- The values of b(i) at the 64 positions can be calculated, and the position where b(i) takes its largest value on each of the tracks T0 to T3 is selected as the initial position of the pulse on that track.
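The selection of initial positions can be sketched in Python as below. The expression for b(i) is not given in this text; the form used here, b(i) = sqrt(E_d / E_r) * r_LTP(i) + a * d(i), is an assumption that matches the symbols defined above and the pulse amplitude selection signal commonly used in ACELP-style codecs. Function names and the interleaved track layout are likewise hypothetical.

```python
import math

def reference_signal(d, r_ltp, a=1.0):
    """Assumed form: b(i) = sqrt(E_d / E_r) * r_LTP(i) + a * d(i)."""
    e_d = sum(x * x for x in d)
    e_r = sum(x * x for x in r_ltp)
    scale = math.sqrt(e_d / e_r) if e_r > 0 else 0.0
    return [scale * r + a * x for r, x in zip(r_ltp, d)]

def initial_positions(b, num_tracks=4):
    """Pick, on each interleaved track, the position where b(i) is largest."""
    positions = []
    for k in range(num_tracks):
        track = range(k, len(b), num_tracks)
        positions.append(max(track, key=lambda i: b[i]))
    return positions

# Toy reference signal with one clear peak per track.
b = [0.0] * 64
b[20], b[21], b[42], b[7] = 5.0, 3.0, 2.0, 1.0
print(initial_positions(b))
```

The sketch follows the text in taking the largest value of b(i); an implementation that also quantizes pulse signs might compare magnitudes instead.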
- A2 Select n pulses as search pulses, the n pulses being part of the N pulses and n being a positive integer less than N. Specifically: select n search pulses from Ns pulses, where the Ns pulses are all or part of the N pulses, Ns is a positive integer less than or equal to N, and n is a positive integer less than Ns; keep fixed the positions of the pulses in the basic codebook other than the n search pulses, and replace the positions of the n search pulses with other positions on their respective tracks to obtain search codebooks.
- the pulse that can be selected as the search pulse can be all N pulses, or just a part thereof, and a set of "pulses that can be selected as search pulses" is hereinafter referred to as "Ns set".
- The n search pulses can be selected from the Ns pulses in various ways, and this codebook search algorithm embodiment does not limit the choice. For example:
- when n is greater than or equal to 2, combinations of search pulses can be selected at random, such as:
- the search pulses are P0, P1, P2;
- the search pulses are P0, P2, P3; or P0, P1, P3;
- the search pulses are P1, P2, P3.
- the corresponding position in the base code book is replaced with other positions on the track in which it is located, and the search code book is obtained.
- The replacement positions on the searched track may be all positions on the track, or only positions within a selectable range, for example positions on the searched track chosen according to the values of a known reference signal.
- Step A3. The search process of step A2 is performed K times as one round, K being a positive integer greater than or equal to 2, wherein two or more search pulses are selected in at least one search process, and the search pulses selected in each search are not all the same.
- the number K of the loop execution of step A2 may be an upper limit value that is specifically set, and it is considered that a round of search is completed after performing K search processes.
- The embodiment of the present invention may also leave K unlimited, that is, set no upper limit, and determine the completion of a round of search by a search termination condition; for example, a round of search is judged complete when the selected search pulses have traversed the Ns set.
- The two can also be combined: if the upper limit of K is reached while the search termination condition is not yet fulfilled, the round of search is still considered complete.
- the specific rules may be set according to the actual application, and the embodiment of the code search algorithm is not limited.
- This codebook search algorithm embodiment requires that at least one of the K search processes be performed on two or more pulses; the selected search pulses may be distributed on the same track or on different tracks.
- A4 Select the best codebook of the current round from the basic codebook and the search codebook according to the set evaluation criteria.
- the process of comparing and evaluating the search code book and the base code book can be performed in synchronization with the process of the step A2 search.
- For example, a "preferred codebook" can be set and initialized to the basic codebook; then, each time a search codebook is obtained, it is compared with the current preferred codebook, and if the search codebook is judged better, it replaces the current preferred codebook; once all K search processes are completed, the resulting preferred codebook is the best codebook of the current round. Note that each search process still starts from the basic codebook, while the object of comparative evaluation is the preferred codebook.
- It is also possible to compare the results of the K search processes collectively. For example, the preferred codebook obtained in each search process can be saved, and the K preferred codebooks are then compared together to select the best codebook of the current round.
- the criteria for comparing and evaluating the search code book and the basic code book may be determined according to the application situation, and the code search algorithm embodiment is not limited.
- For example, the cost function (Qk), which is usually used to measure the quality of an algebraic codebook, can be used for comparison. A larger Qk value is generally taken to mean a better codebook, so the codebook with the larger Qk value can be selected as the preferred codebook.
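One round of steps A2 to A4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the codebook is a list with one position per pulse, `cost` is a caller-supplied stand-in for Qk, all names are hypothetical, and the replacement positions cover the whole track (including the current position, which is harmless).

```python
from itertools import product

def search_round(base, tracks, track_of, combos, cost):
    """Return (best codebook, best cost) after one round of K searches.

    base     : list of positions, one per pulse (the basic codebook)
    tracks   : list of position lists, one per track
    track_of : track index of each pulse
    combos   : K tuples of pulse indices, the search pulses of each search
    cost     : function mapping a codebook to a score (larger is better)
    """
    best, best_q = list(base), cost(base)        # preferred codebook
    for combo in combos:
        for repl in product(*(tracks[track_of[p]] for p in combo)):
            cand = list(base)                    # each search starts from base
            for p, pos in zip(combo, repl):
                cand[p] = pos
            q = cost(cand)
            if q > best_q:
                best, best_q = cand, q
    return best, best_q

tracks = [list(range(k, 64, 4)) for k in range(4)]
target = [20, 21, 42, 7]
toy_cost = lambda cb: -sum(abs(p - t) for p, t in zip(cb, target))
best, best_q = search_round([0, 1, 2, 3], tracks, [0, 1, 2, 3],
                            [(0, 1), (2, 3)], toy_cost)
print(best, best_q)
```

Because every search starts from the basic codebook, the toy run improves only one pulse pair per round; carrying the result into a further round is what lets the remaining pulses move as well.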
- the second embodiment of the codebook search algorithm provides a specific search pulse selection method based on the first embodiment of the codebook search algorithm. Referring to FIG. 5, the steps include:
- Step B1. Acquire a basic codebook, where the basic codebook includes position information of N pulses on M tracks, and N and M are positive integers. This step can be performed by referring to step A1 in the first embodiment of the codebook search algorithm.
- Step B2. Select n search pulses from the Ns pulses, where Ns has the same meaning as in the first embodiment of the codebook search algorithm, and n is a value greater than or equal to 2 that remains unchanged during the current round of search;
- the selected n search pulses are one of all C(Ns, n) possible combinations (the number of ways to choose n pulses from Ns), and no combination is selected twice.
- For example, with Ns = 4 and n = 2, selecting 2 search pulses from the Ns set gives C(4, 2) = 6 combinations: P0, P1; P0, P2; P0, P3; P1, P2; P1, P3; P2, P3.
- the selection can be made from these 6 combinations randomly or sequentially; in order to make the selection non-repetition each time, the selection can be sequentially performed according to the change rule of the combination, or all the combinations can be saved or all the combinations can be numbered, and the selected combination is selected. (or number) deleted.
- Step B3. The search process of step B2 is performed K times as one round, 2 ≤ K ≤ C(Ns, n), wherein two or more search pulses are selected in at least one search process, and the search pulses selected in each search are not all the same.
- Since the value of n is fixed and no combination of search pulses is selected twice, at most C(Ns, n) searches are needed to traverse all possible combinations of the Ns set.
- It is also possible to limit the upper limit of K to less than C(Ns, n), in which case not all possible combinations are traversed, but the selected search pulses may still traverse the Ns set.
- This step can be performed by referring to step A4 in the first embodiment of the codebook search algorithm.
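The non-repeating selection of search pulse combinations can be sketched with Python's standard `itertools.combinations`, which yields each combination exactly once in a fixed order. The helper name and the pulse labels are illustrative.

```python
from itertools import combinations
from math import comb

def search_pulse_combinations(ns_set, n):
    """All ways to choose n search pulses from the Ns set, without repeats."""
    return list(combinations(ns_set, n))

combos = search_pulse_combinations(["P0", "P1", "P2", "P3"], 2)
print(len(combos), combos)
```

Taking only the first K entries of the list corresponds to limiting K below the number of possible combinations.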
- the third embodiment of the codebook search algorithm provides a method for performing cyclic multi-round execution based on the first and second embodiments of the codebook search algorithm. Referring to FIG. 6, the method includes the following steps:
- The basic codebook includes position information of N pulses on M tracks, and N and M are positive integers.
- This step can be performed by referring to step A1 in the first embodiment of the codebook search algorithm.
- step C3. Determine whether the number of rounds of the search G reaches the set upper limit of the G value, and if yes, execute step C5, otherwise, execute step C4.
- step C4 Replace the original base code book with the best code book as a new base code book, and return to step C2 to continue searching for a new round of the best code book.
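The multi-round loop can be sketched in Python as below. `run_round` is a hypothetical callable that performs one round of search on a basic codebook and returns that round's best codebook; the upper limit of G is passed as `max_rounds`.

```python
def multi_round_search(base, run_round, max_rounds):
    """Feed each round's best codebook back in as the next basic codebook."""
    current = list(base)
    for _ in range(max_rounds):        # stop once G reaches its upper limit
        current = run_round(current)   # best codebook of this round
    return current

# Toy round: moves every pulse one step; stands in for a real search round.
toy_round = lambda codebook: [p + 1 for p in codebook]
print(multi_round_search([0, 0], toy_round, 3))
```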
- the fourth embodiment of the codebook search algorithm provides another method for performing multiple rounds of execution on the basis of the first and second embodiments of the codebook search algorithm. Referring to FIG. 7, the steps are as follows:
- D1 Acquire a basic code book, where the basic code book includes position information of N pulses on M tracks, and N and M are positive integers.
- This step can be performed by referring to step A1 in the first embodiment of the codebook search algorithm.
- This step can be performed by referring to steps A2 to A4 in the first embodiment of the codebook search algorithm, or by referring to steps B2 to B4 in the second embodiment of the codebook search algorithm.
- Ns = N can be set in the first round of search.
- Step D3. Determine whether the number of search rounds G has reached the set upper limit of G, or determine whether the Ns set of the next round is empty. If yes, execute step D5; otherwise execute step D4.
- The Ns set of each round can be determined according to the search result of the previous round; for the specific determination method, see step D4. If the Ns set is empty, the search can be considered complete; alternatively, the search can be ended on reaching the set upper limit of G even when the Ns set is not empty.
- Ns pulses return to step D2 to continue searching for the new round of the best codebook.
- the fifth embodiment of the codebook search algorithm provides a method for obtaining an initial basic codebook based on the foregoing embodiments of the codebook search algorithm. Referring to FIG. 8, the method includes the following steps:
- the total number of pulses N to be searched and the number of pulses distributed on each track are determined.
- E2 Determine a centralized search range for each track according to a plurality of extreme values of the known reference signals on the respective tracks, the centralized search range including at least one position on the track.
- The reference signal can be the pulse position maximum likelihood function b(i): the values of b(i) at all pulse positions are calculated, and the several positions with the largest values of b(i) on each track are selected as the centralized search range of that track. The centralized search ranges of the tracks may contain the same or different numbers of positions.
- the centralized search range of the basic codebook is:
- Since the centralized search range is usually small, a full search can be performed within it to obtain a better basic codebook.
- A total of 4x4x4x4 = 256 searches is required to obtain the basic codebook.
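The centralized search can be sketched in Python as follows. The function names, the interleaved track layout, and the toy reference signal and cost function are assumptions for illustration; `cost` stands in for the evaluation criterion such as Qk.

```python
from itertools import product

def centralized_ranges(b, num_tracks=4, per_track=4):
    """On each interleaved track, keep the positions with the largest b(i)."""
    ranges = []
    for k in range(num_tracks):
        track = range(k, len(b), num_tracks)
        ranges.append(sorted(track, key=lambda i: b[i], reverse=True)[:per_track])
    return ranges

def full_search(ranges, cost):
    """Evaluate every combination of one position per track."""
    candidates = list(product(*ranges))
    best = max(candidates, key=cost)
    return list(best), len(candidates)

b = [float(i) for i in range(64)]           # toy reference signal
ranges = centralized_ranges(b)
base, evaluated = full_search(ranges, lambda cb: -sum(cb))
print(ranges[0], base, evaluated)
```

With four candidate positions on each of four tracks the full search evaluates 4 x 4 x 4 x 4 = 256 combinations, matching the count above.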
- This step can be performed by referring to steps A2 to A4 in the first embodiment of the codebook search algorithm, or by referring to steps B2 to B4 in the second embodiment of the codebook search algorithm.
- The initial codebook is searched from the centralized search range of four positions in each track, and is assumed to be {32, 33, 2,
- the quality of the speech obtained by the method is comparable.
- The above method requires 560 searches, considerably fewer than the 768 searches of the depth-first tree search method.
- The codebook search algorithm embodiments provided by the present invention select the best codebook by performing replacement searches on different pulse combinations, with at least one search performed on a plurality of pulses. Because the best codebook is selected from the replacements of several different combinations, the number of searches is reduced while the search remains as global as possible; and because at least one search is performed on a plurality of pulses, the influence of the correlation between pulses on the search result is taken into account, which further safeguards the quality of the search results.
- if the selection method of the search pulses is optimized, the search process becomes more effective; and if the possible combinations of the search pulses are traversed further, the globality of the search is strengthened and the quality of the search results improves.
- if the multi-round search method is used to obtain the final optimal codebook, the quality of the search result can be further improved.
- if the range of the Ns set in the next round of search is narrowed according to the search result of the previous round, the amount of calculation can be effectively reduced. If the initial base codebook is additionally obtained by the centralized search method, a higher-quality base codebook is obtained, and the quality of the search result improves further.
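A rough sketch of a multi-round replacement search in the spirit of the paragraphs above is given here. It is an illustration under stated assumptions, not the procedure of the embodiments: the function `replacement_search`, the fixed `group_size`, the toy `score`, and the track layout are hypothetical, and the narrowing of the Ns set between rounds is omitted for brevity. What it does show is the idea of replacing several pulses jointly while the others stay fixed, and repeating rounds until nothing improves.

```python
from itertools import combinations, product

def replacement_search(base, tracks, score, group_size=2, max_rounds=3):
    """Greedy multi-round search: jointly replace `group_size` pulses at a time.

    base      -- starting codebook, one position per pulse
    tracks[p] -- candidate positions for pulse p
    score     -- function mapping a list of positions to a value to maximize
    """
    best = list(base)
    best_score = score(best)
    for _ in range(max_rounds):
        improved = False
        for group in combinations(range(len(best)), group_size):
            # Try every joint replacement for the pulses in `group`.
            for new_positions in product(*(tracks[p] for p in group)):
                candidate = list(best)
                for p, pos in zip(group, new_positions):
                    candidate[p] = pos
                s = score(candidate)
                if s > best_score:
                    best, best_score, improved = candidate, s, True
        if not improved:
            break  # a whole round without gain: stop early
    return best, best_score

# Toy setup (assumed): 4 pulses, one per track, 4 candidate positions each.
tracks = [[t, t + 4, t + 8, t + 12] for t in range(4)]
target = [8, 5, 14, 3]

def score(positions):
    # Placeholder criterion: closeness to a known target combination.
    return -sum(abs(p - q) for p, q in zip(positions, target))

best, best_score = replacement_search([0, 1, 2, 3], tracks, score)
print(best, best_score)
```

Starting from a base codebook obtained by a centralized search, rather than the arbitrary start used here, would correspond to the combination suggested in the text.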
- the following is an experimental evaluation of the encoding method and encoder embodiments of the present invention applied to a classification-based encoder. The original encoder classifies signals into unvoiced, general, voiced, and transition classes, but searches all types of input signal using a single fixed codebook search algorithm.
- the method of the present invention uses a random codebook search algorithm for unvoiced frames, the depth-first search method for general frames, and, for voiced and transition frames, the method used in the calculation example of the codebook search algorithm of the present invention.
- 1. The weighted segmental SNR of the coding method of the embodiment of the invention is about 0.0245 higher than that of the method of the original encoder;
- 2. The algorithm complexity of the coding method of the embodiment of the present invention, measured in million operations per second (MOPS), is about 0.3185 MOPS lower than that of the original encoder;
- 3. The PESQ (Perceptual Evaluation of Speech Quality) score of the coding method of the embodiment of the present invention differs on average by 0.00127 Mean Opinion Score (MOS) points, which is about three parts in ten thousand, so there is almost no difference.
- compared with the method in the original encoder, the coding method of the embodiment of the present invention has certain advantages in reducing complexity and improving system performance.
- the program includes the following steps: acquiring characteristic parameters of an input signal; determining a type of the input signal according to the characteristic parameters; obtaining a vector to be quantized according to the characteristic parameters; and, according to the determined type of the input signal, performing a codebook search on the vector to be quantized by using a corresponding codebook search algorithm, and the program may store
- the storage medium may include: a ROM, a RAM, a magnetic disk or an optical disk, and the like.
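The program steps listed above (acquire characteristic parameters, determine the signal type, obtain the vector to be quantized, search with the algorithm matched to the type) can be sketched as a simple dispatch. Everything concrete here is an assumption for illustration: the feature names `voicing`, `is_onset`, and `target`, the thresholds in `classify`, and the placeholder search functions are hypothetical. Only the mapping of signal type to search algorithm follows the evaluation described in the text.

```python
from enum import Enum

class SignalType(Enum):
    UNVOICED = "unvoiced"
    GENERAL = "general"
    VOICED = "voiced"
    TRANSITION = "transition"

def classify(features):
    """Hypothetical classifier; feature names and thresholds are illustrative only."""
    if features["is_onset"]:
        return SignalType.TRANSITION
    if features["voicing"] < 0.3:
        return SignalType.UNVOICED
    if features["voicing"] > 0.7:
        return SignalType.VOICED
    return SignalType.GENERAL

def vector_to_quantize(features):
    """Placeholder: a real encoder derives the target vector from its analysis stage."""
    return features["target"]

# Placeholder searchers: each only reports which algorithm would run.
def random_codebook_search(target):
    return "random_codebook_search"

def depth_first_search(target):
    return "depth_first_search"

def pulse_replacement_search(target):
    return "pulse_replacement_search"

# Type-to-algorithm mapping as described in the evaluation.
SEARCH_BY_TYPE = {
    SignalType.UNVOICED: random_codebook_search,
    SignalType.GENERAL: depth_first_search,
    SignalType.VOICED: pulse_replacement_search,
    SignalType.TRANSITION: pulse_replacement_search,
}

def encode_frame(features):
    signal_type = classify(features)          # determine the input signal type
    target = vector_to_quantize(features)     # obtain the vector to be quantized
    return signal_type, SEARCH_BY_TYPE[signal_type](target)

frame = {"voicing": 0.9, "is_onset": False, "target": [0.0] * 64}
print(encode_frame(frame))
```

Keeping the mapping in a table makes it easy to swap in a different search algorithm for one class without touching the others.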
Landscapes
- Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Control Of Stepping Motors (AREA)
- Error Detection And Correction (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009539594A JP5532304B2 (ja) | 2007-11-05 | 2008-09-16 | 符号化方法、符号化器、および、コンピュータ読み取り可能な媒体 |
AT08800868T ATE533147T1 (de) | 2007-11-05 | 2008-09-16 | Kodierungsverfahren, kodierer und computerlesbares medium |
EP08800868A EP2110808B1 (en) | 2007-11-05 | 2008-09-16 | A coding method, an encoder and a computer readable medium |
US12/481,060 US8600739B2 (en) | 2007-11-05 | 2009-06-09 | Coding method, encoder, and computer readable medium that uses one of multiple codebooks based on a type of input signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200710165784A CN100578619C (zh) | 2007-11-05 | 2007-11-05 | 编码方法和编码器 |
CN200710165784.3 | 2007-11-05 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/481,060 Continuation US8600739B2 (en) | 2007-11-05 | 2009-06-09 | Coding method, encoder, and computer readable medium that uses one of multiple codebooks based on a type of input signal |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009059513A1 (fr) | 2009-05-14 |
Family
ID=40113736
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2008/072371 WO2009059513A1 (fr) | 2007-11-05 | 2008-09-16 | Procédé de codage, codeur et support lisible par ordinateur |
Country Status (7)
Country | Link |
---|---|
US (1) | US8600739B2 (zh) |
EP (1) | EP2110808B1 (zh) |
JP (2) | JP5532304B2 (zh) |
KR (1) | KR101211922B1 (zh) |
CN (1) | CN100578619C (zh) |
AT (1) | ATE533147T1 (zh) |
WO (1) | WO2009059513A1 (zh) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070136054A1 (en) * | 2005-12-08 | 2007-06-14 | Hyun Woo Kim | Apparatus and method of searching for fixed codebook in speech codecs based on CELP |
DK2827327T3 (da) | 2007-04-29 | 2020-10-12 | Huawei Tech Co Ltd | Fremgangsmåde til excitationsimpulskodning |
CN100578619C (zh) | 2007-11-05 | 2010-01-06 | 华为技术有限公司 | 编码方法和编码器 |
CN101577551A (zh) | 2009-05-27 | 2009-11-11 | 华为技术有限公司 | 一种生成格型矢量量化码书的方法及装置 |
CN102243876B (zh) * | 2010-05-12 | 2013-08-07 | 华为技术有限公司 | 预测残差信号的量化编码方法及装置 |
CN102299760B (zh) | 2010-06-24 | 2014-03-12 | 华为技术有限公司 | 脉冲编解码方法及脉冲编解码器 |
CN104254886B (zh) * | 2011-12-21 | 2018-08-14 | 华为技术有限公司 | 自适应编码浊音语音的基音周期 |
CN103377653B (zh) * | 2012-04-20 | 2016-03-16 | 展讯通信(上海)有限公司 | 语音编码中代数码表的搜索方法及装置,语音编码方法 |
RU2638734C2 (ru) | 2013-10-18 | 2017-12-15 | Фраунхофер-Гезелльшафт Цур Фердерунг Дер Ангевандтен Форшунг Е.Ф. | Кодирование спектральных коэффициентов спектра аудиосигнала |
FR3013496A1 (fr) * | 2013-11-15 | 2015-05-22 | Orange | Transition d'un codage/decodage par transformee vers un codage/decodage predictif |
FR3024581A1 (fr) * | 2014-07-29 | 2016-02-05 | Orange | Determination d'un budget de codage d'une trame de transition lpd/fd |
CN105355194A (zh) * | 2015-10-22 | 2016-02-24 | 百度在线网络技术(北京)有限公司 | 语音合成方法和装置 |
US10878831B2 (en) | 2017-01-12 | 2020-12-29 | Qualcomm Incorporated | Characteristic-based speech codebook selection |
CN108417206A (zh) * | 2018-02-27 | 2018-08-17 | 四川云淞源科技有限公司 | 基于大数据的信息高速处理方法 |
CN117882095A (zh) * | 2021-06-29 | 2024-04-12 | 西门子股份公司 | 方案推荐方法、设备、系统和存储介质 |
CN117789740B (zh) * | 2024-02-23 | 2024-04-19 | 腾讯科技(深圳)有限公司 | 音频数据处理方法、装置、介质、设备及程序产品 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0753841A2 (en) * | 1990-11-02 | 1997-01-15 | Nec Corporation | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits |
JPH09265300A (ja) * | 1996-03-29 | 1997-10-07 | Sony Corp | 音声処理装置および音声処理方法 |
US6631347B1 (en) * | 2002-05-08 | 2003-10-07 | Samsung Electronics Co., Ltd. | Vector quantization and decoding apparatus for speech signals and method thereof |
CN1760975A (zh) | 2005-10-31 | 2006-04-19 | 连展科技(天津)有限公司 | 增强的amr编码器快速固定码本搜索方法 |
CN1766988A (zh) * | 2005-10-31 | 2006-05-03 | 连展科技(天津)有限公司 | 一种新型的快速固定码本搜索方法 |
Family Cites Families (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5202953A (en) | 1987-04-08 | 1993-04-13 | Nec Corporation | Multi-pulse type coding system with correlation calculation by backward-filtering operation for multi-pulse searching |
CA2010830C (en) | 1990-02-23 | 1996-06-25 | Jean-Pierre Adoul | Dynamic codebook for efficient speech coding based on algebraic codes |
US5754976A (en) | 1990-02-23 | 1998-05-19 | Universite De Sherbrooke | Algebraic codebook with signal-selected pulse amplitude/position combinations for fast coding of speech |
US5701392A (en) | 1990-02-23 | 1997-12-23 | Universite De Sherbrooke | Depth-first algebraic-codebook search for fast coding of speech |
US5187745A (en) | 1991-06-27 | 1993-02-16 | Motorola, Inc. | Efficient codebook search for CELP vocoders |
CA2141181A1 (en) | 1994-09-21 | 1996-03-22 | Kimberly-Clark Worldwide, Inc. | Wet-resilient webs |
JPH08179796A (ja) | 1994-12-21 | 1996-07-12 | Sony Corp | 音声符号化方法 |
US5822724A (en) | 1995-06-14 | 1998-10-13 | Nahumi; Dror | Optimized pulse location in codebook searching techniques for speech processing |
US6393391B1 (en) | 1998-04-15 | 2002-05-21 | Nec Corporation | Speech coder for high quality at low bit rates |
JP3144284B2 (ja) * | 1995-11-27 | 2001-03-12 | 日本電気株式会社 | 音声符号化装置 |
JP3299099B2 (ja) * | 1995-12-26 | 2002-07-08 | 日本電気株式会社 | 音声符号化装置 |
US6480822B2 (en) | 1998-08-24 | 2002-11-12 | Conexant Systems, Inc. | Low complexity random codebook structure |
JP3180786B2 (ja) | 1998-11-27 | 2001-06-25 | 日本電気株式会社 | 音声符号化方法及び音声符号化装置 |
JP4173940B2 (ja) * | 1999-03-05 | 2008-10-29 | 松下電器産業株式会社 | 音声符号化装置及び音声符号化方法 |
EP1221694B1 (en) | 1999-09-14 | 2006-07-19 | Fujitsu Limited | Voice encoder/decoder |
US6510407B1 (en) | 1999-10-19 | 2003-01-21 | Atmel Corporation | Method and apparatus for variable rate coding of speech |
CA2327041A1 (en) | 2000-11-22 | 2002-05-22 | Voiceage Corporation | A method for indexing pulse positions and signs in algebraic codebooks for efficient coding of wideband signals |
US7065338B2 (en) * | 2000-11-27 | 2006-06-20 | Nippon Telegraph And Telephone Corporation | Method, device and program for coding and decoding acoustic parameter, and method, device and program for coding and decoding sound |
KR100464369B1 (ko) * | 2001-05-23 | 2005-01-03 | 삼성전자주식회사 | 음성 부호화 시스템의 여기 코드북 탐색 방법 |
JP2002349429A (ja) | 2001-05-28 | 2002-12-04 | Toyota Industries Corp | 可変容量型圧縮機及びその製造方法 |
DE10140507A1 (de) | 2001-08-17 | 2003-02-27 | Philips Corp Intellectual Pty | Verfahren für die algebraische Codebook-Suche eines Sprachsignalkodierers |
US7363218B2 (en) | 2002-10-25 | 2008-04-22 | Dilithium Networks Pty. Ltd. | Method and apparatus for fast CELP parameter mapping |
KR100463559B1 (ko) | 2002-11-11 | 2004-12-29 | 한국전자통신연구원 | 대수 코드북을 이용하는 켈프 보코더의 코드북 검색방법 |
KR100463418B1 (ko) * | 2002-11-11 | 2004-12-23 | 한국전자통신연구원 | Celp 음성 부호화기에서 사용되는 가변적인 고정코드북 검색방법 및 장치 |
KR100463419B1 (ko) | 2002-11-11 | 2004-12-23 | 한국전자통신연구원 | 적은 복잡도를 가진 고정 코드북 검색방법 및 장치 |
US7249014B2 (en) | 2003-03-13 | 2007-07-24 | Intel Corporation | Apparatus, methods and articles incorporating a fast algebraic codebook search technique |
KR100556831B1 (ko) | 2003-03-25 | 2006-03-10 | 한국전자통신연구원 | 전역 펄스 교체를 통한 고정 코드북 검색 방법 |
CN1240050C (zh) | 2003-12-03 | 2006-02-01 | 北京首信股份有限公司 | 一种用于语音编码的固定码本快速搜索方法 |
CN1760905A (zh) | 2004-10-16 | 2006-04-19 | 鸿富锦精密工业(深圳)有限公司 | 电子竞标系统及方法 |
KR100795727B1 (ko) | 2005-12-08 | 2008-01-21 | 한국전자통신연구원 | Celp기반의 음성 코더에서 고정 코드북 검색 장치 및방법 |
US20070136054A1 (en) * | 2005-12-08 | 2007-06-14 | Hyun Woo Kim | Apparatus and method of searching for fixed codebook in speech codecs based on CELP |
CN100578619C (zh) | 2007-11-05 | 2010-01-06 | 华为技术有限公司 | 编码方法和编码器 |
JP5242231B2 (ja) | 2008-04-24 | 2013-07-24 | 三菱電機株式会社 | 電位生成回路および液晶表示装置 |
2007
- 2007-11-05 CN CN200710165784A patent/CN100578619C/zh active Active
2008
- 2008-09-16 WO PCT/CN2008/072371 patent/WO2009059513A1/zh active Application Filing
- 2008-09-16 KR KR1020097012209A patent/KR101211922B1/ko active IP Right Grant
- 2008-09-16 AT AT08800868T patent/ATE533147T1/de active
- 2008-09-16 EP EP08800868A patent/EP2110808B1/en active Active
- 2008-09-16 JP JP2009539594A patent/JP5532304B2/ja active Active
2009
- 2009-06-09 US US12/481,060 patent/US8600739B2/en active Active
2013
- 2013-02-04 JP JP2013019667A patent/JP2013122612A/ja active Pending
Also Published As
Publication number | Publication date |
---|---|
JP5532304B2 (ja) | 2014-06-25 |
ATE533147T1 (de) | 2011-11-15 |
KR20090086102A (ko) | 2009-08-10 |
US8600739B2 (en) | 2013-12-03 |
JP2013122612A (ja) | 2013-06-20 |
EP2110808B1 (en) | 2011-11-09 |
US20090248406A1 (en) | 2009-10-01 |
KR101211922B1 (ko) | 2012-12-13 |
CN101303857A (zh) | 2008-11-12 |
EP2110808A1 (en) | 2009-10-21 |
CN100578619C (zh) | 2010-01-06 |
JP2010511901A (ja) | 2010-04-15 |
EP2110808A4 (en) | 2010-01-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009059513A1 (fr) | Procédé de codage, codeur et support lisible par ordinateur | |
JP5264913B2 (ja) | 話声およびオーディオの符号化における、代数符号帳の高速検索のための方法および装置 | |
KR101406113B1 (ko) | 스피치 신호에서 천이 프레임을 코딩하기 위한 방법 및 장치 | |
Giacobello et al. | Sparse linear prediction and its applications to speech processing | |
KR100795727B1 (ko) | Celp기반의 음성 코더에서 고정 코드북 검색 장치 및방법 | |
US6385576B2 (en) | Speech encoding/decoding method using reduced subframe pulse positions having density related to pitch | |
JP5177561B2 (ja) | 認識器重み学習装置および音声認識装置、ならびに、システム | |
KR100556831B1 (ko) | 전역 펄스 교체를 통한 고정 코드북 검색 방법 | |
JP6170172B2 (ja) | 符号化モード決定方法及び該装置、オーディオ符号化方法及び該装置、並びにオーディオ復号化方法及び該装置 | |
JP4970046B2 (ja) | ディジタル信号圧縮のためのコーディングのために用いられるマルチパルス・ディクショナリのインデクス間のトランスコーディング | |
WO2009006819A1 (fr) | Procédé de recherche de livre de code fixe, système de recherche et support lisible par ordinateur | |
CN1271925A (zh) | 用于码激励线性预测语音编码的整形的固定码簿搜索 | |
JP4696418B2 (ja) | 情報検出装置及び方法 | |
Profeta et al. | End-to-end learning for musical instruments classification | |
JP3471889B2 (ja) | 音声符号化方法及び装置 | |
Wang et al. | Tone recognition for continuous mandarin speech with limited training data using selected context‐dependent hidden markov models | |
Li et al. | Accelerating Transducers through Adjacent Token Merging | |
Malard et al. | Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences | |
Hong | Low Latency Streaming Speech Selection | |
Tulensalo | Learning neural discrete representations for speech | |
Purohit et al. | ASR Free End-to-End Spoken Language Understanding using Transformers | |
Jelinek et al. | Excitation Construction for the Robust Low Bit Rate CELP Speech Coder | |
Tsai et al. | Efficient coding translation of GSM and G. 729 speech coders across mobile and IP networks | |
TW200910329A (en) | Stochastic codebook search algorithm with complexity scalability for speech coders | |
Chollet et al. | Excitation Construction for the Robust Low Bit Rate CELP Speech Coder Milan Jelinek¹ Geneviève Baudoin2 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase | Ref document number: 2009539594; Country of ref document: JP; Kind code of ref document: A |
WWE | Wipo information: entry into national phase | Ref document number: 2008800868; Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 1020097012209; Country of ref document: KR |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08800868; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |