WO2021249205A1 - Method and device for decoding acoustic wave signals - Google Patents

Method and device for decoding acoustic wave signals

Info

Publication number
WO2021249205A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
quantized
unit
amplitude
sound wave
Prior art date
Application number
PCT/CN2021/096642
Other languages
English (en)
French (fr)
Inventor
唐鸿
Original Assignee
北京声连网信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京声连网信息科技有限公司
Publication of WO2021249205A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Definitions

  • The present invention relates to the technical field of communication coding, and in particular to a method and device for decoding acoustic wave signals.
  • Audio quantization compression is an audio compression technology that uses audio quantization processing. Quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number of (or fewer) discrete values, that is, converting the sampled analog signal into a digital signal by rounding.
  • Audio compression applies appropriate digital signal processing technology to the original digital audio signal stream (PCM encoding) to reduce (compress) its bit rate with no loss of useful information, or with negligible loss. It is also called compression coding. After passing through a codec system, the audio signal may pick up a large amount of noise and a certain amount of distortion.
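The rounding step in the definition of quantization can be illustrated with a minimal sketch. It assumes uniform quantization onto evenly spaced levels; the function name `quantize` and its parameters are hypothetical and are not taken from the patent.

```python
def quantize(samples, levels, lo=-1.0, hi=1.0):
    """Round each sample to the nearest of `levels` evenly spaced values in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    out = []
    for s in samples:
        clipped = min(max(s, lo), hi)          # keep the sample inside the range
        index = round((clipped - lo) / step)   # nearest level, by rounding
        out.append(lo + index * step)
    return out


# Five levels in [-1, 1] are -1.0, -0.5, 0.0, 0.5, 1.0.
print(quantize([0.1, -0.9, 0.49], levels=5))
```

Fewer levels mean fewer bits per sample and a larger rounding error, which is the trade-off that quantization-based compression exploits.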
  • the sound wave signal is a communication signal or identification signal superimposed on the sound wave or audio.
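  • The existing sound wave decoding technologies are: (1) decoding the original digital audio signal stream directly to obtain the sound wave signal; (2) restoring the compressed audio data stream to the original digital audio signal stream, applying a Fourier transform to that stream, and then decoding the transformed audio signal to obtain the sound wave signal.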
  • the main technical problem to be solved by the present invention is to provide a method and device for decoding sound wave information, which can significantly improve the calculation speed of sound wave decoding of an audio quantized compressed data stream by an interpreted language.
  • a technical solution adopted by the present invention is to provide a sound wave signal decoding method.
  • Performing inverse discrete cosine transform processing on each item of the energy data to obtain the amplitude data specifically includes: applying formula (1) to each item of the energy data to obtain the amplitude data.
  • Selecting the amplitude data of m consecutive time sequences from the amplitude data and performing unit decoding specifically includes: according to the amplitude data, selecting the k frequencies with the largest amplitude in each time sequence to form a frequency set, and finding the zero-based position of that frequency set in a preset sequence of l k-element frequency sets to determine the value of the corresponding unit data d_j, where 0 ≤ d_j ≤ l, and k and l are both preset natural numbers.
  • The n*x quantized values obtained form the quantized data.
  • The inverse discrete cosine transform processing module is used to perform inverse discrete cosine transform processing on each item of the energy data to obtain the amplitude data.
  • Specifically, the inverse discrete cosine transform processing module applies formula (1) to each item of the energy data to obtain the amplitude data.
  • The decoding module includes a unit decoding sub-module, which is used to select the amplitude data of m consecutive time sequences from the amplitude data and perform unit decoding to obtain the unit data d_1, d_2, d_3, ..., d_m corresponding to the m time sequences, where i ≥ 0 and i is the absolute index of the i-th of the m consecutive time sequences.
  • The synthesis and decoding sub-module is used to synthesize and decode the unit data d_1, d_2, d_3, ..., d_m to obtain the sound wave data.
  • The unit decoding sub-module is used to select, according to the amplitude data, the k frequencies with the largest amplitude in each time sequence to form a frequency set, and to find the zero-based position of that frequency set in a preset sequence of l k-element frequency sets to determine the value of the corresponding unit data d_j, where 0 ≤ d_j ≤ l, and k and l are both preset natural numbers.
  • F_y is the encoding constant of the energy data, as predefined by the audio quantized compressed data stream.
  • The method and device for decoding sound wave signals provided by the embodiments of the present invention use the signal frequencies of the sound wave signal to determine which data in the audio compressed data stream relate to the sound wave signal, and perform quantization restoration only on the selected data, which yields a locally restored block of acoustic energy data.
  • By performing inverse discrete cosine transform processing on this acoustic energy data block, the reordering, anti-aliasing, windowed synthesis filtering, phase correction, and polyphase synthesis filtering operations that normally follow quantization restoration are omitted, which reduces the amount of calculation. Further, the data obtained by the inverse discrete cosine transform processing can be used directly for sound wave decoding, which eliminates the Fourier transform usually used in sound wave decoding. The calculation steps and the amount of calculation in the original sound wave signal decoding process are thereby reduced, which improves the speed of sound wave decoding in interpreted languages.
  • Fig. 1 is a schematic flow chart of a method for decoding an acoustic wave signal in the first embodiment of the present invention
  • Fig. 2 is a schematic flow chart of a method for decoding an acoustic wave signal in the second embodiment of the present invention
  • Fig. 3 is a schematic structural diagram of a sound wave signal decoding device in an embodiment of the present invention.
  • Fig. 4 is a schematic diagram of the structure of the decoding module in Fig. 3.
  • FIG. 1 is a schematic flowchart of an acoustic wave signal decoding method in an embodiment of the present invention.
  • the method includes:
  • Step S10: real-time decompression processing is performed on the audio quantized compressed data stream to be decoded to generate one or more continuous quantized data blocks Z_x.
  • the original audio signal is quantized and compressed to generate the audio quantized compressed data stream.
  • One or more sound wave signals are superimposed on the original audio signal in advance, and each sound wave signal is composed of m unit signals spliced over m consecutive time sequences.
  • Each unit signal is composed of n bit signals superimposed on the same time sequence, and m and n are preset natural numbers.
  • the original audio signal is quantized and compressed through different encoding algorithms to obtain the corresponding audio quantized compressed data stream. Therefore, the audio quantized compressed data stream needs to be decompressed through the corresponding decompression algorithm.
  • The coding algorithm can be the AAC (Advanced Audio Coding) compression algorithm, the MP3 compression algorithm, the Huffman compression algorithm, etc. For example, when the audio quantized compressed data stream was obtained with the AAC compression algorithm, the AAC decompression algorithm is used to decompress the stream and obtain the quantized data blocks.
  • The original audio signal is compressed by the Huffman compression algorithm to obtain the audio quantized compressed data stream, so the corresponding Huffman decompression algorithm is used to decompress the stream.
  • Decompressing the audio quantized compressed data stream to be decoded consists of selecting one or more continuous compressed data frames from the stream and decompressing each compressed data frame to generate one or more continuous quantized data blocks Z_x.
  • the audio quantized compressed data stream to be decoded is composed of a plurality of consecutive compressed data frames, and each compressed data frame has a predetermined format.
  • one compressed data block can be obtained from multiple compressed data frames according to the encoding algorithm.
  • the parameters of the audio quantized compressed data stream to be decoded are as follows:
  • MP3 MPEG-1 Layer III
  • Each quantized data block contains 576 quantized values, and the frequency intervals represented by the 576 quantized values are successively low to high frequency intervals evenly distributed in the frequency range of 0-22050 Hz.
  • Each quantized data block is composed of multiple quantized values, and x is determined by the audio compressed data stream.
  • Each audio quantized compressed data stream is composed of multiple quantized data blocks, so the value of x is not a fixed number, the longer the audio quantized compressed data stream is, the larger x will be.
  • the sound wave signal is composed of 12 unit signals that are respectively spliced in 12 consecutive time sequences, and each unit signal is formed by superimposing 8 bit signals on the same time sequence; among them, these 8 bits
  • n quantized values with the same frequency range as the i-th acoustic signal are selected from each quantized data block.
  • Alternatively, quantized values whose frequency is adjacent to the frequency of the i-th sound wave signal can be selected from each quantized data block, where the adjacent frequency is predefined as the frequency value that differs from the frequency of the sound wave signal by the smallest amount.
  • From the first quantized data block Z_1, the 8 quantized values whose frequencies match the frequencies of the 8 bit signals of the sound wave signal are selected, i.e. the quantized values numbered 470, 474, 483, 488, 497, 501, 510, and 515. Their corresponding frequencies are, in order:
  • 470: 18001.76 Hz
  • 474: 18174.02 Hz
  • 483: 18518.55 Hz
  • 488: 18690.82 Hz
  • 497: 19035.35 Hz
  • 501: 19207.62 Hz
  • 510: 19552.15 Hz
  • 515: 19724.41 Hz
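The selection of quantized values by frequency can be sketched as follows. The sketch assumes that the 576 values split the 0-22050 Hz range into equal-width intervals, that numbering is zero-based, and that a frequency belongs to the interval containing it. These assumptions are chosen because they reproduce the numbers quoted in the text; `value_index` is a hypothetical helper name.

```python
import math

TOP_HZ = 22050.0          # upper edge of the frequency range given in the text
VALUES_PER_BLOCK = 576    # quantized values in one quantized data block


def value_index(freq_hz, count=VALUES_PER_BLOCK, top_hz=TOP_HZ):
    """Zero-based index of the equal-width interval that contains freq_hz."""
    width = top_hz / count                      # 38.28125 Hz per interval
    return min(int(math.floor(freq_hz / width)), count - 1)


# The 8 bit-signal frequencies quoted in the text.
BIT_FREQS_HZ = [18001.76, 18174.02, 18518.55, 18690.82,
                19035.35, 19207.62, 19552.15, 19724.41]

indices = [value_index(f) for f in BIT_FREQS_HZ]
print(indices)  # [470, 474, 483, 488, 497, 501, 510, 515]
```

Because the indices depend only on the preset bit-signal frequencies, they can be computed once and reused for every quantized data block.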
  • Quantization restoration is performed on the above 8 quantized values to obtain 8 corresponding energy data.
  • That is, the quantized data are restored through step S11 to obtain the energy data.
  • The amount of calculation for quantization restoration is reduced from the 576 global restoration operations required by prior-art sound wave decoding to 8 local restoration operations.
  • Quantized data are the data obtained by quantizing the energy of the original signal during the original encoding of the audio quantized compressed data stream. Quantized data cannot be used directly as the energy of the signal; they must first undergo quantization restoration to recover the energy of the signal.
  • In step S11, not all of the quantized data in a quantized data block are restored. Only the quantized data selected from the block because they have the same frequency as, or the closest frequency to, the bit signals of the sound wave signal undergo quantization restoration.
  • This local quantization restoration improves the processing speed of the quantized data.
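Local quantization restoration can be sketched as below. The restoration formula is not given in this text, so the sketch assumes an MPEG-1 Layer III style power law, sign(q) * |q|^(4/3) * 2^((global_gain - 210) / 4), with the scale-factor terms reduced to one optional exponent per value. The names `restore_selected`, `global_gain`, and `scale_shift` are illustrative, not the patent's.

```python
import math


def restore_selected(block, indices, global_gain, scale_shift=None):
    """Restore only the selected quantized values of one block.

    block       -- sequence of quantized integers (576 for a long block)
    indices     -- zero-based positions to restore
    global_gain -- gain field of the frame (assumed MP3-style, offset 210)
    scale_shift -- optional {index: exponent} standing in for scale factors
    """
    scale_shift = scale_shift or {}
    gain = 2.0 ** ((global_gain - 210) / 4.0)
    energy = {}
    for i in indices:
        q = block[i]
        magnitude = abs(q) ** (4.0 / 3.0)       # assumed power-law expansion
        shift = 2.0 ** (-scale_shift.get(i, 0.0))
        energy[i] = math.copysign(magnitude, q) * gain * shift
    return energy


selected = [470, 474, 483, 488, 497, 501, 510, 515]
block = [0] * 576
block[470], block[474] = 8, -1
restored = restore_selected(block, selected, global_gain=210)
print(len(restored), "values restored instead of", len(block))
```

The loop runs once per selected index, so the cost is 8 restorations per block rather than 576, which is the saving the text describes.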
  • the quantized data block may also be data in the form of a short block, but the method and principle of local quantization restoration are the same.
  • Quantized data in the form of a short block are also a sequence of 576 values, taken in order as groups of 3 values. The groups represent, from low to high, 192 frequency intervals evenly distributed over the frequency range 0-22050 Hz.
  • The groups are filtered according to the frequencies of the 8 bit signals of the sound wave signal: the 156th, 158th, 161st, 162nd, 165th, 167th, 170th, and 171st of the 192 groups of quantized data (zero-based numbering) are selected, and quantization restoration is performed on them to obtain 8 corresponding energy data.
  • the selection of quantized data in the form of short blocks can also be based on the selection method of the approximate frequency interval.
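For short blocks the same idea applies to groups of 3 values. The sketch below assumes 192 equal-width intervals with zero-based numbering, and assumes the 3 values of a group sit next to each other in the 576-value sequence; `short_block_group` and `group_values` are hypothetical names.

```python
def short_block_group(freq_hz, groups=192, top_hz=22050.0):
    """Zero-based index of the short-block group whose interval contains freq_hz."""
    width = top_hz / groups                     # 114.84375 Hz per group
    return min(int(freq_hz // width), groups - 1)


def group_values(block, group):
    """Return the 3 quantized values of one group (assumed stored consecutively)."""
    start = 3 * group
    return block[start:start + 3]


bit_freqs_hz = [18001.76, 18174.02, 18518.55, 18690.82,
                19035.35, 19207.62, 19552.15, 19724.41]
short_groups = [short_block_group(f) for f in bit_freqs_hz]
print(short_groups)  # [156, 158, 161, 162, 165, 167, 170, 171]
```

With these assumptions the computed groups agree with the group numbers listed in the text.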
  • Formula (1) is applied to each item of the energy data to perform inverse discrete cosine transform processing and obtain the amplitude data.
  • The 8 energy data whose frequencies are the same as (or closest to) those of the 8 bit signals of the sound wave signal are each processed separately.
  • Chart 1 (Note: The data in the table is for reference only)
  • A blank cell indicates that the specific value is omitted.
  • The amplitude data shown are illustrative.
  • Frequency 1, Frequency 2, ... Frequency 8 in Chart 1 respectively represent the frequency of the selected 8 energy data.
  • Through the processing of steps S11 and S12 above, the global quantization restoration process is simplified to a local one, the unnecessary reordering, anti-aliasing, windowed filtering, phase correction, polyphase synthesis filtering, and other calculation steps are avoided, and the Fourier transform usually used in sound wave decoding is also omitted. The calculation steps and the amount of calculation in the original sound wave signal decoding process are therefore reduced, and the speed of sound wave signal decoding in an interpreted language is greatly improved.
  • The values are in milliseconds. "Total operation time" is an actual measured value, and the rest are reference measured values.
  • Prior-art scheme A: the compressed audio data stream is restored to the original digital audio signal stream; that stream is Fourier transformed and then decoded to obtain the sound wave signal.
  • Prior-art scheme B: the compressed audio data stream is restored to the original digital audio signal stream of a local frequency band; that stream is Fourier transformed and then decoded to obtain the sound wave signal.
  • The local frequency band is the frequency band whose frequency is the same as (or closest to) the frequency of the sound wave signal.
  • The present invention saves a great deal of calculation time through local quantization restoration, and saves further time by eliminating reordering, anti-aliasing, windowed filtering, phase correction, polyphase synthesis filtering, and the Fourier transform. The amount of calculation in the whole decoding process is greatly reduced, the total calculation time falls correspondingly, and the efficiency of decoding is significantly improved.
  • Step S13: sound wave signal decoding is performed on the amplitude data to obtain the corresponding sound wave data.
  • In step S13, decoding the amplitude data to obtain the corresponding sound wave data is implemented through the following steps:
  • Step S131: the amplitude data of m consecutive time sequences are selected from the amplitude data, and unit decoding is performed to obtain the unit data d_1, d_2, d_3, ..., d_m corresponding to the m time sequences.
  • Here i ≥ 0, where i is the absolute index, within the amplitude data, of the i-th of the m consecutive time sequences.
  • the magnitude of the amplitude data of two adjacent frequencies in the same time sequence is compared, and the value of the corresponding bit data is determined according to the signal frequency of the larger amplitude data.
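The pairwise comparison can be sketched as follows. The text does not say which outcome maps to 1 or in which order the bits are combined, so the sketch assumes that a bit is 1 when the second (higher-indexed) frequency of a pair has the larger amplitude, and that the first pair gives the most significant bit. `decode_unit_pairwise` is a hypothetical name.

```python
def decode_unit_pairwise(amplitudes):
    """Decode one unit from the n amplitudes of a single time sequence.

    amplitudes -- amplitudes ordered by bit-signal number 1..n (n even).
    Pairs (1,2), (3,4), ... each yield one bit; the bits form a binary number.
    """
    if len(amplitudes) % 2:
        raise ValueError("expected an even number of bit-signal amplitudes")
    value = 0
    for k in range(0, len(amplitudes), 2):
        # Assumption: the larger amplitude on the second frequency means bit 1.
        bit = 1 if amplitudes[k + 1] > amplitudes[k] else 0
        value = (value << 1) | bit              # assumption: first pair is the MSB
    return value


# n = 8 bit signals give 4 bits per unit: here 0, 1, 0, 1.
print(decode_unit_pairwise([0.9, 0.1, 0.2, 0.8, 0.7, 0.3, 0.1, 0.6]))  # 5
```

Comparing two neighbouring frequencies against each other avoids the need for an absolute amplitude threshold, which would depend on playback volume and compression settings.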
  • Selecting the amplitude data of m consecutive time sequences from the amplitude data and performing unit decoding specifically includes: according to the amplitude data, selecting the k frequencies with the largest amplitude in each time sequence to form a frequency set, and finding the zero-based position of that frequency set in a preset sequence of l k-element frequency sets to determine the value of the corresponding unit data d_j, where 0 ≤ d_j ≤ l, and k and l are both preset natural numbers.
  • r_0 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.4 kHz},
  • r_1 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.5 kHz},
  • r_2 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.6 kHz},
  • r_3 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.7 kHz},
  • r_4 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.8 kHz},
  • r_5 = {18.1 kHz, 18.3 kHz, 18.5 kHz, 18.7 kHz},
  • r_7 = {18.2 kHz, 18.3 kHz, 18.5 kHz, 18.6 kHz};
  • the amplitude data in each time sequence is screened and compared as described above to obtain corresponding unit data d 2 , d 3 , d 4 , d 5 , ..., d 12 .
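The frequency-set lookup can be sketched as follows, with k = 4 and the example sets listed above. Frequencies are written in Hz as integers to avoid floating-point keys. The table contains only the sets that appear in this text, so position 6 is absent; `FREQUENCY_SETS` and `decode_unit_by_set` are hypothetical names.

```python
# Zero-based position -> frequency set, copied from the sets listed in the text.
FREQUENCY_SETS = {
    0: frozenset({18100, 18200, 18300, 18400}),
    1: frozenset({18100, 18200, 18300, 18500}),
    2: frozenset({18100, 18200, 18300, 18600}),
    3: frozenset({18100, 18200, 18300, 18700}),
    4: frozenset({18100, 18200, 18300, 18800}),
    5: frozenset({18100, 18300, 18500, 18700}),
    7: frozenset({18200, 18300, 18500, 18600}),
}
POSITION_OF = {freqs: pos for pos, freqs in FREQUENCY_SETS.items()}


def decode_unit_by_set(amplitude_by_freq, k=4):
    """Pick the k strongest frequencies and look up the position of that set.

    Returns None when the strongest frequencies match no preset set.
    """
    strongest = sorted(amplitude_by_freq, key=amplitude_by_freq.get, reverse=True)[:k]
    return POSITION_OF.get(frozenset(strongest))


amps = {18100: 0.9, 18200: 0.1, 18300: 0.8, 18400: 0.05,
        18500: 0.7, 18600: 0.2, 18700: 0.6, 18800: 0.0}
print(decode_unit_by_set(amps))  # 5
```

Using `frozenset` keys makes the lookup independent of the order in which the strongest frequencies are found.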
  • Step S132: the unit data d_1, d_2, d_3, ..., d_m are synthesized and decoded to obtain the sound wave data.
  • The multi-radix number formed by d_1, d_2, d_3, ..., d_m is the sound wave data.
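One way to read the synthesis step is that the unit data are the digits of a number in some base. The sketch below assumes that d_1 is the most significant digit and that the base is supplied by the caller and exceeds every unit value; neither detail is stated in this text, and `combine_units` is a hypothetical name.

```python
def combine_units(units, base):
    """Combine unit data d_1..d_m into one integer, treating them as base-`base` digits."""
    value = 0
    for d in units:
        if not 0 <= d < base:
            raise ValueError(f"unit value {d} does not fit in base {base}")
        value = value * base + d                # assumption: d_1 is most significant
    return value


# Three 4-bit units (base 16): 1, 0, 15 -> 0x10F
print(combine_units([1, 0, 15], base=16))  # 271
```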
  • FIG. 3 is a schematic structural diagram of a sound wave signal decoding apparatus in an embodiment of the present invention.
  • the device 20 includes a decompression processing module 21, a screening module 22, an inverse discrete cosine transform processing module 23, and a decoding module 24.
  • the decompression processing module 21 is configured to perform decompression processing on the audio quantized compressed data stream to be decoded to generate one or more continuous quantized data blocks Z x .
  • the original audio signal is quantized and compressed to generate the audio quantized compressed data stream.
  • One or more sound wave signals are superimposed on the original audio signal in advance, and each sound wave signal is composed of m unit signals spliced over m consecutive time sequences.
  • Each unit signal is composed of n bit signals superimposed on the same time sequence, and m and n are preset natural numbers.
  • The decompression processing module 21 decompresses the audio quantized compressed data stream to be decoded by selecting one or more continuous compressed data frames from the stream and decompressing each compressed data frame to generate one or more continuous quantized data blocks Z_x.
  • the audio quantized compressed data stream to be decoded is composed of a plurality of consecutive compressed data frames, and each compressed data frame has a predetermined format.
  • According to the encoding algorithm used by the audio quantized compressed data stream, the decompression processing module 21 obtains one or more continuous compressed data frames by comparing and verifying the data stream byte by byte in real time. Also according to that encoding algorithm, it obtains the corresponding compressed data block, decompression parameters, and quantization restoration parameters from each compressed data frame, and decompresses each compressed data block with the corresponding decompression algorithm to obtain the corresponding quantized data block Z_x.
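The byte-by-byte search for compressed data frames can be sketched for the MP3 case. The sketch assumes MPEG-1 Layer III, where a frame header starts with 11 sync bits followed by the version bits 11 and the layer bits 01, so the first two header bytes are 0xFF and then 0xFA or 0xFB. It only finds candidate offsets; a real decoder would also validate the bitrate and sampling-rate fields and the frame length. `find_frame_candidates` is a hypothetical name.

```python
def find_frame_candidates(data):
    """Return offsets where an MPEG-1 Layer III frame header may start."""
    offsets = []
    for pos in range(len(data) - 1):
        # 0xFF, then 1111 101x: sync bits, MPEG-1, Layer III, any protection bit.
        if data[pos] == 0xFF and (data[pos + 1] & 0xFE) == 0xFA:
            offsets.append(pos)
    return offsets


stream = b"\x00\xff\xfb\x90\x00abc\xff\xfa"
print(find_frame_candidates(stream))  # [1, 8]
```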
  • one compressed data block can be obtained from multiple compressed data frames according to the encoding algorithm.
  • the acoustic wave signal is composed of 12 unit signals that are respectively spliced at 12 consecutive time sequences, and each unit signal is formed by superimposing 8 bit signals at the same time sequence.
  • The screening module 22 selects n quantized values in the same frequency range as the i-th sound wave signal from each quantized data block.
  • The screening module 22 may also select from each quantized data block quantized values whose frequency is adjacent to the frequency of the i-th sound wave signal, where the adjacent frequency is predefined as the frequency value that differs from the frequency of the sound wave signal by the smallest amount.
  • The inverse discrete cosine transform processing module 23 applies formula (1) to each item of the energy data to perform inverse discrete cosine transform processing and obtain the amplitude data.
  • The decoding module 24 is used to perform sound wave signal decoding on the amplitude data to obtain the corresponding sound wave data.
  • the decoding module 24 specifically includes:
  • The unit decoding sub-module 241 is used to select the amplitude data of m consecutive time sequences from the amplitude data and perform unit decoding to obtain the unit data d_1, d_2, d_3, ..., d_m corresponding to the m time sequences.
  • Here i ≥ 0, where i is the absolute index, within the amplitude data, of the i-th of the m consecutive time sequences.
  • According to the amplitude data, the unit decoding sub-module 241 compares a pair of amplitudes in each time sequence to determine the value of the bit data b_k, and combines the determined bit data b_1, b_3, b_5, ..., b_k, ..., b_{n-1} into a binary number to obtain the value of the corresponding unit data d_j.
  • The unit decoding sub-module 241 compares the magnitudes of the amplitude data of two adjacent frequencies in the same time sequence, and determines the value of the corresponding bit data according to the signal frequency of the larger amplitude.
  • According to the amplitude data, the unit decoding sub-module 241 selects the k frequencies with the largest amplitude in each time sequence to form a frequency set, and finds the zero-based position of that frequency set in a preset sequence of l k-element frequency sets to determine the value of the corresponding unit data d_j, where 0 ≤ d_j ≤ l, and k and l are both preset natural numbers.
  • The synthesis and decoding sub-module 242 is used to synthesize and decode the unit data d_1, d_2, d_3, ..., d_m to obtain the sound wave data.
  • According to the sound wave coding algorithm, the synthesis and decoding sub-module 242 takes the multi-radix number formed by d_1, d_2, d_3, ..., d_m as the sound wave data.
  • The method and device for decoding acoustic wave signals provided by the embodiments of the present invention determine which data in the audio compressed data stream relate to the signal frequencies of the acoustic wave signal, and perform quantization restoration only on the selected data, which yields a locally restored block of acoustic energy data.
  • By performing inverse discrete cosine transform processing on this acoustic energy data block, the reordering, anti-aliasing, windowed synthesis filtering, phase correction, and polyphase synthesis filtering operations that normally follow quantization restoration are omitted, which reduces the amount of calculation. Further, the data obtained by the inverse discrete cosine transform processing can be used directly for sound wave decoding, which eliminates the Fourier transform usually used in sound wave decoding. The calculation steps and the amount of calculation in the original sound wave signal decoding process are thereby reduced, which improves the speed of sound wave decoding in interpreted languages.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the modules or units is only a division by logical function.
  • In actual implementation there may be other ways of dividing them; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the technical solution of the present invention can be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that enable a computer device (which can be a personal computer, a management server, a network device, etc.) or a processor to execute all or part of the steps of the method described in each embodiment of the present invention.
  • The aforementioned storage media include various media that can store program code, such as a U disk, a mobile hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

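A method and device for decoding sound wave information. The decoding method includes: performing real-time decompression processing on an audio quantized compressed data stream to be decoded, to generate one or more continuous quantized data blocks Zx; selecting from each quantized data block Zx the n quantized values whose frequencies are the same as, or closest to, the frequencies of the n bit signals, forming quantized data (Formula I) from the n*x quantized values obtained, and performing quantization restoration on each quantized datum (Formula I) to obtain corresponding energy data (Formula II); performing inverse discrete cosine transform processing on each energy datum (Formula II) to obtain amplitude data (Formula III); and performing sound wave signal decoding on the amplitude data (Formula III) to obtain corresponding sound wave data. The decoding method and device can significantly increase the speed at which an interpreted language performs sound wave decoding on an audio quantized compressed data stream.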

Description

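Method and device for decoding acoustic wave signals
[Technical Field]
The present invention relates to the technical field of communication coding, and in particular to a method and device for decoding acoustic wave signals.
[Background Art]
Audio quantization compression is an audio compression technology that uses audio quantization processing. Quantization is the process of approximating the continuous values of a signal (or a large number of possible discrete values) by a finite number of (or fewer) discrete values, that is, converting the sampled analog signal into a digital signal by rounding. Audio compression applies appropriate digital signal processing technology to the original digital audio signal stream (PCM encoding) to reduce (compress) its bit rate with no loss of useful information, or with negligible loss; it is also called compression coding. After passing through a codec system, the audio signal may pick up a large amount of noise and a certain amount of distortion.
A sound wave signal is a communication signal or identification signal superimposed on sound waves or audio. The existing sound wave decoding technologies are:
1. Decoding the original digital audio signal stream directly to obtain the sound wave signal;
2. Restoring the compressed audio data stream to the original digital audio signal stream, applying a Fourier transform to that stream, and then decoding the transformed audio signal to obtain the sound wave signal. Restoring a compressed audio data stream to the original digital audio signal stream requires a series of complex operations.
When an interpreted language (such as Python, JavaScript, Perl, or Shell) is used to perform sound wave decoding on an audio quantized compressed data stream, the program must first be translated into intermediate code at run time, and the interpreter then interprets and runs that intermediate code. Because the translation is repeated on every execution, the calculation speed is low and the processing takes a long time.
[Summary of the Invention]
The main technical problem solved by the present invention is to provide a method and device for decoding sound wave information that can significantly increase the speed at which an interpreted language performs sound wave decoding on an audio quantized compressed data stream.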
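To solve the above technical problem, one technical solution adopted by the present invention is to provide a sound wave signal decoding method. The method includes: performing real-time decompression processing on the audio quantized compressed data stream to be decoded, to generate one or more continuous quantized data blocks Z_x, where x is the index of the data block, x = 1, 2, 3, ...; the original audio signal undergoes quantization processing and compression processing to generate the audio quantized compressed data stream, one or more sound wave signals are superimposed on the original audio signal in advance, each sound wave signal is composed of m unit signals spliced over m consecutive time sequences, each unit signal is composed of n bit signals superimposed in the same time sequence, and m and n are both preset natural numbers; selecting from each quantized data block Z_x the n quantized values whose frequencies are the same as, or closest to, the frequencies of the n bit signals, forming the quantized data [Figure PCTCN2021096642-appb-000001] from the n*x quantized values obtained, and performing quantization restoration on each quantized datum [Figure PCTCN2021096642-appb-000002] to obtain the corresponding energy data [Figure PCTCN2021096642-appb-000003], where y is the identification number of the bit signal, y = 1, 2, 3, ..., n; performing inverse discrete cosine transform processing on each energy datum [Figure PCTCN2021096642-appb-000004] to obtain amplitude data [Figure PCTCN2021096642-appb-000005], where z is the index of the time sequence, z = 1, 2, 3, ...; and performing sound wave signal decoding on the amplitude data [Figure PCTCN2021096642-appb-000006] to obtain the corresponding sound wave data.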
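Performing inverse discrete cosine transform processing on each energy datum [Figure PCTCN2021096642-appb-000007] to obtain the amplitude data [Figure PCTCN2021096642-appb-000008] specifically includes: applying formula (1) to each energy datum [Figure PCTCN2021096642-appb-000009] to perform inverse discrete cosine transform processing and obtain the amplitude data [Figure PCTCN2021096642-appb-000010]:
[Figure PCTCN2021096642-appb-000011]
where T = 18, t = 0, 1, 2, ..., T-1, z = T(x-1)+t+1, and F_y is the encoding constant of the energy data [Figure PCTCN2021096642-appb-000012] predefined by the audio quantized compressed data stream.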
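The step of performing sound wave signal decoding on the amplitude data
Figure PCTCN2021096642-appb-000013
to obtain the corresponding sound wave data specifically comprises: selecting from the amplitude data
Figure PCTCN2021096642-appb-000014
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000015
and performing unit decoding on them to obtain the unit data d_1, d_2, d_3, …, d_m corresponding to the m time slots, where i ≥ 0, i is the absolute index in
Figure PCTCN2021096642-appb-000016
of the i-th of the m consecutive time slots, and j is the relative index within the m consecutive time slots, j = 1, 2, 3, …, m; and performing synthesis decoding on the unit data d_1, d_2, d_3, …, d_m to obtain the sound wave data.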
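The step of selecting from the amplitude data
Figure PCTCN2021096642-appb-000017
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000018
and performing unit decoding specifically comprises: based on the amplitude data
Figure PCTCN2021096642-appb-000019
comparing the magnitudes of the amplitudes
Figure PCTCN2021096642-appb-000020
Figure PCTCN2021096642-appb-000021
in each time slot to determine the value of the bit datum b_k, and assembling the determined bit data b_1, b_3, b_5, …, b_k, …, b_{n-1} into a binary number to obtain the value of the corresponding unit datum d_j, where k is the index of the n bit signals contained in one unit signal, k = 1, 3, 5, …, n-1.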
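Alternatively, the step of selecting from the amplitude data
Figure PCTCN2021096642-appb-000022
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000023
and performing unit decoding specifically comprises: based on the amplitude data
Figure PCTCN2021096642-appb-000024
selecting the k frequencies with the largest amplitudes in each time slot to form a frequency set, and determining the value of the corresponding unit datum d_j from the zero-based position of that frequency set in a sequence of l preset k-element frequency sets, where 0 ≤ d_j ≤ l, and k and l are preset natural numbers.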
为解决上述技术问题,本发明采用的另一个技术方案是:提供一种声波信号解码装置,所述装置包括:解压缩处理模块,用于对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;筛选模块,用于从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
Figure PCTCN2021096642-appb-000025
并对每个所述量化数据
Figure PCTCN2021096642-appb-000026
进行量化还原,以获得对应的能量数据
Figure PCTCN2021096642-appb-000027
其中,y为位元信号的标识号,且,y=1、2、3、……n;离散余弦逆变换处理模块,用于对每个所述能量数据
Figure PCTCN2021096642-appb-000028
进行离散余弦逆变换处理,以获得幅值数据
Figure PCTCN2021096642-appb-000029
其中,z为时序的序号,z=1、2、3、……;以及解码模块,用于对所述幅值数据
Figure PCTCN2021096642-appb-000030
进行声波信号解码,以获得对应的声波数据。
其中,所述离散余弦逆变换处理模块用于利用公式(1)对每个所述能量数据
Figure PCTCN2021096642-appb-000031
进行离散余弦逆变换处理,以获得幅值数据
Figure PCTCN2021096642-appb-000032
Figure PCTCN2021096642-appb-000033
其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
Figure PCTCN2021096642-appb-000034
的编码常量。
其中,所述解码模块体包括:单元解码子模块,用于从所述幅值数据
Figure PCTCN2021096642-appb-000035
中选取连续m个时序的幅值数据
Figure PCTCN2021096642-appb-000036
进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
Figure PCTCN2021096642-appb-000037
中的绝对序号;j为连续m个时序的相对序号,j=1、2、3、……m;合成解码子模块,用于对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
其中,所述单元解码子模块用于根据所述幅值数据
Figure PCTCN2021096642-appb-000038
通过对比各时序中幅值
Figure PCTCN2021096642-appb-000039
Figure PCTCN2021096642-appb-000040
的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……、n-1。
其中,所述单元解码子模块用于根据所述幅值数据
Figure PCTCN2021096642-appb-000041
选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
为解决上述技术问题,本发明采用的另一个技术方案是:提供一种声波信号解码方法,所述方法包括:对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
Figure PCTCN2021096642-appb-000042
并对每个所述量化数据
Figure PCTCN2021096642-appb-000043
进行量化还原,以获得对应的能量数据
Figure PCTCN2021096642-appb-000044
其中,y为位元信号的标识号,且,y=1、2、3、……n;利用公式(1)对每个所述能量数据
Figure PCTCN2021096642-appb-000045
进行离散余弦逆变换处理,以获得幅值数据
Figure PCTCN2021096642-appb-000046
其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
Figure PCTCN2021096642-appb-000047
的编码常量;z为时序的序号,z=1、2、3、……;
Figure PCTCN2021096642-appb-000048
从幅值数据
Figure PCTCN2021096642-appb-000049
中选取连续m个时序的幅值数据
Figure PCTCN2021096642-appb-000050
进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
Figure PCTCN2021096642-appb-000051
中的绝对序号;j为连续m个时序中的相对序号,j=1、2、3、……m;对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
其中,所述从幅值数据
Figure PCTCN2021096642-appb-000052
中选取连续m个时序的幅值数据
Figure PCTCN2021096642-appb-000053
进行单元解码,具体包括:根据所述幅值数据
Figure PCTCN2021096642-appb-000054
通过对比各时序中幅值
Figure PCTCN2021096642-appb-000055
Figure PCTCN2021096642-appb-000056
的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……、n-1。
其中,所述从幅值数据
Figure PCTCN2021096642-appb-000057
中选取连续m个时序的幅值数据
Figure PCTCN2021096642-appb-000058
进行单元解码,具体包括:根据所述幅值数据
Figure PCTCN2021096642-appb-000059
选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
为解决上述技术问题,本发明采用的另一个技术方案是:提供一种声波信号解码装置,所述装置包括:解压缩处理模块,用于对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;筛选模块,用于从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
Figure PCTCN2021096642-appb-000060
并对每个所述量化数据
Figure PCTCN2021096642-appb-000061
进行量化还原,以获得对应的能量数据
Figure PCTCN2021096642-appb-000062
其中,y为位元信号的标识号,且,y=1、2、3、……n;离散余弦逆变换处理模块,用于利用公式(1)对每个所述能量数据
Figure PCTCN2021096642-appb-000063
进行离散余弦逆变换处理,以获得幅值数据
Figure PCTCN2021096642-appb-000064
其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
Figure PCTCN2021096642-appb-000065
的编码常量;z为时序的序号,z=1、2、3、……;
Figure PCTCN2021096642-appb-000066
解码模块,包括:单元解码子模块,用于从所述幅值数据
Figure PCTCN2021096642-appb-000067
中选取连续m个时序的幅值数据
Figure PCTCN2021096642-appb-000068
进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
Figure PCTCN2021096642-appb-000069
中 的绝对序号;j为连续m个时序的相对序号,j=1、2、3、……m;合成解码子模块,用于对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
其中,所述单元解码子模块用于根据所述幅值数据
Figure PCTCN2021096642-appb-000070
通过对比各时序中幅值
Figure PCTCN2021096642-appb-000071
Figure PCTCN2021096642-appb-000072
的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……n-1。
其中,所述单元解码子模块用于根据所述幅值数据
Figure PCTCN2021096642-appb-000073
选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
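In the method and device for decoding a sound wave signal provided by the embodiments of the present invention, the signal frequencies of the sound wave signal are used to identify the related data in the compressed audio data stream, and only the selected data are dequantized, yielding a locally dequantized block of sound wave energy data. An inverse discrete cosine transform is then applied to that energy data block, so the reordering, alias reduction, windowed synthesis filtering, phase correction and polyphase synthesis filtering that normally follow dequantization are omitted, which reduces the amount of computation. Furthermore, the amplitude data produced by the inverse discrete cosine transform can be decoded directly, so the Fourier transform normally used in sound wave decoding is also avoided. Together these reduce the number of steps and the amount of computation in the original decoding process and increase the speed of sound wave decoding in interpreted languages.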
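[Brief Description of the Drawings]
FIG. 1 is a schematic flowchart of a sound wave signal decoding method in a first embodiment of the present invention;
FIG. 2 is a schematic flowchart of a sound wave signal decoding method in a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a sound wave signal decoding device in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of the decoding module in FIG. 3.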
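[Detailed Description]
To explain the technical content, structural features, objectives and effects of the present invention in detail, the invention is described below with reference to the drawings and embodiments.
Referring to FIG. 1, a schematic flowchart of a sound wave signal decoding method in an embodiment of the present invention, the method comprises:
Step S10: decompress, in real time, the quantized compressed audio data stream to be decoded, so as to generate one or more consecutive quantized data blocks Z_x, where x is the block index, x = 1, 2, 3, …. The quantized compressed audio data stream is generated from an original audio signal by quantization and compression. One or more sound wave signals are superimposed on the original audio signal in advance; each sound wave signal consists of m unit signals concatenated over m consecutive time slots, and each unit signal consists of n bit signals superimposed in the same time slot, where m and n are preset natural numbers.
Different coding algorithms may be used to quantize and compress the original audio signal into the quantized compressed audio data stream, so the stream must be decompressed with the matching decompression algorithm. The coding algorithm may be AAC (Advanced Audio Coding), MP3, Huffman coding, and so on. For example, if the stream was produced with the AAC compression algorithm, it is decompressed with the AAC decompression algorithm to obtain the quantized data blocks. In this embodiment, the original audio signal was compressed with Huffman coding, so the corresponding Huffman decoding algorithm is used to decompress the stream.
Specifically, decompressing the stream means selecting one or more consecutive compressed data frames from it and decompressing each frame to generate one or more consecutive quantized data blocks Z_x. The stream to be decoded consists of many consecutive compressed data frames, each with a fixed format. According to the coding algorithm used by the stream, the stream is compared and checked byte by byte in real time to obtain one or more consecutive compressed data frames; the corresponding compressed data block, decompression parameters and dequantization parameters are then extracted from each frame according to the coding algorithm, and each compressed data block is decompressed with the matching decompression algorithm to obtain the corresponding quantized data block Z_x. Depending on the coding algorithm, a single compressed data block may be assembled from several compressed data frames.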
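The byte-by-byte comparison that locates compressed data frames can be sketched as follows. This is only a minimal illustration, assuming the stream is MPEG-1 Layer III as in the example of this embodiment; the function name `find_mp3_frame_offsets` is hypothetical, and a real decoder would still validate each candidate (header fields, frame length, optional CRC) before accepting it.

```python
def find_mp3_frame_offsets(data: bytes) -> list[int]:
    """Return byte offsets where an MPEG-1 Layer III frame header may start.

    Assumption: a header starts with 11 sync bits set to 1, followed by the
    MPEG-1 version bits (11) and the Layer III bits (01). The lowest bit of
    the second byte is the protection flag and may be 0 or 1.
    """
    offsets = []
    for pos in range(len(data) - 1):
        if data[pos] == 0xFF and (data[pos + 1] & 0xFE) == 0xFA:
            offsets.append(pos)
    return offsets


# Illustrative bytes only: two candidate headers at offsets 1 and 5.
sample = bytes([0x00, 0xFF, 0xFB, 0x90, 0x00, 0xFF, 0xFA, 0x12, 0xFF, 0xE3])
print(find_mp3_frame_offsets(sample))
```

Scanning only two bytes per position keeps the per-byte work small, which matters in an interpreted language; stricter checks can be applied afterwards to the few candidates that survive.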
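For example, the parameters of the quantized compressed audio data stream to be decoded are as follows:
Duration            13.609 s
Sampling rate       44100 Hz
Sample data         32-bit (1 byte) records
Channels            mono
Format              MPEG-1 Layer III (MP3)
Constant bit rate   192 kbps
In step S10 the stream is decompressed to generate multiple quantized data blocks, i.e. x = 1, 2, 3, …. Each quantized data block contains 576 quantized values, which represent, in order from low to high, 576 evenly spaced frequency intervals covering 0-22050 Hz. Each quantized compressed audio data stream consists of multiple quantized data blocks, and the number of blocks is determined by the stream: x is not a fixed number, and the longer the stream, the larger x becomes.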
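As a rough illustration of how the number of blocks grows with the stream duration, the sketch below estimates the block count for the example stream. It assumes that each block of 576 quantized values stands for 576 samples of the mono signal, which holds for MP3 granules but is not stated in the text, and it ignores encoder delay and padding; `estimate_block_count` is a hypothetical helper.

```python
import math


def estimate_block_count(duration_s: float, sample_rate_hz: int,
                         values_per_block: int = 576) -> int:
    """Estimate how many quantized data blocks a mono stream contains.

    Assumption: one block of `values_per_block` quantized values covers the
    same number of time-domain samples.
    """
    return math.ceil(duration_s * sample_rate_hz / values_per_block)


# Example stream from the text: 13.609 s at 44100 Hz.
print(estimate_block_count(13.609, 44100))  # 1042
```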
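In this embodiment, the sound wave signal consists of 12 unit signals concatenated over 12 consecutive time slots, and each unit signal consists of 8 bit signals superimposed in the same time slot; that is, m = 12 and n = 8. The frequencies of the 8 bit signals are 18001.76 Hz, 18174.02 Hz, 18518.55 Hz, 18690.82 Hz, 19035.35 Hz, 19207.62 Hz, 19552.15 Hz and 19724.41 Hz.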
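Step S11: from each quantized data block Z_x, select the n quantized values whose frequencies are identical or closest to the frequencies of the n bit signals, so that the resulting n*x quantized values form the quantized data
Figure PCTCN2021096642-appb-000074
and dequantize each quantized datum
Figure PCTCN2021096642-appb-000075
to obtain the corresponding energy data
Figure PCTCN2021096642-appb-000076
where y is the identifier of the bit signal, y = 1, 2, 3, …, n.
In this embodiment, the n quantized values whose frequency ranges match those of the i-th sound wave signal are selected from each quantized data block. In other embodiments, the quantized values at frequencies adjacent to those of the i-th sound wave signal may be selected instead, where an adjacent frequency is predefined as the frequency that differs from the sound wave signal frequency by the smallest amount.
When the original audio data stream is quantized and compressed, processing it with a Fourier transform causes spectral leakage. Selecting quantized values by nearest frequency, as described above, effectively compensates for the frequency offset caused by that leakage.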
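For example, when x = 1 and n = 8, the 8 quantized values whose frequencies match the 8 bit-signal frequencies of the sound wave signal are selected from the first quantized data block Z_1, namely the quantized values at positions 470, 474, 483, 488, 497, 501, 510 and 515,
Figure PCTCN2021096642-appb-000077
Figure PCTCN2021096642-appb-000078
which form the quantized data
Figure PCTCN2021096642-appb-000079
Here,
Figure PCTCN2021096642-appb-000080
corresponds to 18001.76 Hz,
Figure PCTCN2021096642-appb-000081
corresponds to 18174.02 Hz,
Figure PCTCN2021096642-appb-000082
corresponds to 18518.55 Hz,
Figure PCTCN2021096642-appb-000083
corresponds to 18690.82 Hz,
Figure PCTCN2021096642-appb-000084
corresponds to 19035.35 Hz,
Figure PCTCN2021096642-appb-000085
corresponds to 19207.62 Hz,
Figure PCTCN2021096642-appb-000086
corresponds to 19552.15 Hz, and
Figure PCTCN2021096642-appb-000087
corresponds to 19724.41 Hz. These 8 quantized values are then dequantized to give the 8 corresponding energy data, namely
Figure PCTCN2021096642-appb-000088
As described above, step S11 dequantizes only the selected quantized data to obtain the energy data, which reduces the dequantization workload from the 576 operations of global dequantization required by prior-art sound wave decoding to the 8 operations of local dequantization.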
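The selection in step S11 can be reproduced with a few lines of arithmetic. The sketch below assumes that the 576 values of a block split 0-22050 Hz into equal intervals and that positions are counted from zero; under those assumptions it yields exactly the positions 470, 474, 483, 488, 497, 501, 510 and 515 listed in the example. The names `bin_index` and `select_values` are hypothetical.

```python
import math

NYQUIST_HZ = 22050.0
BIT_FREQS_HZ = [18001.76, 18174.02, 18518.55, 18690.82,
                19035.35, 19207.62, 19552.15, 19724.41]


def bin_index(freq_hz: float, n_bins: int = 576,
              nyquist_hz: float = NYQUIST_HZ) -> int:
    """Zero-based index of the frequency interval that contains freq_hz."""
    return math.floor(freq_hz / (nyquist_hz / n_bins))


def select_values(block, freqs_hz=BIT_FREQS_HZ):
    """Pick from one quantized block only the values at the bit-signal bins."""
    return [block[bin_index(f, len(block))] for f in freqs_hz]


print([bin_index(f) for f in BIT_FREQS_HZ])
# [470, 474, 483, 488, 497, 501, 510, 515]
```

Because the bit-signal frequencies are fixed, the eight indices can be computed once and reused for every block, so the per-block cost is eight list lookups.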
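Quantized data are the data obtained when the energy of the original signal was quantized during the original encoding of the stream. They cannot be used directly as signal energy; the signal energy is obtained only after dequantization. In step S11, not all quantized data in a block are dequantized; only those selected as having the same or closest frequencies to the bit signals of the sound wave signal are. This local dequantization speeds up the processing of the quantized data.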
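Local dequantization can be sketched as below. The text does not give the dequantization formula, so this is an assumption-laden illustration: it uses the 4/3 power law of the MP3 dequantizer and collapses all gain and scale-factor terms from the frame's side information into a single hypothetical `gain` parameter. The function names are also hypothetical.

```python
def dequantize(q: int, gain: float = 1.0) -> float:
    """Restore one quantized value (assumed MP3-style 4/3 power law)."""
    magnitude = abs(q) ** (4.0 / 3.0) * gain
    return -magnitude if q < 0 else magnitude


def dequantize_selected(block, indices, gain: float = 1.0):
    """Dequantize only the values at `indices` instead of the whole block."""
    return [dequantize(block[i], gain) for i in indices]


# Illustrative block: only 8 of the 576 values are ever touched.
block = [0] * 576
block[470] = 8
indices = [470, 474, 483, 488, 497, 501, 510, 515]
energies = dequantize_selected(block, indices)
print(len(energies))  # 8
```

The saving comes from the loop bound: eight power operations per block instead of 576.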
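Furthermore, depending on the coding format of the original quantized compressed audio data stream, a quantized data block may also exist in short-block form; the method and principle of local dequantization are the same. For example, quantized data in short-block form are also a sequence of 576 values, taken in order in groups of 3; the groups represent, from low to high, 192 evenly spaced frequency intervals covering 0-22050 Hz. Filtering by the 8 bit-signal frequencies of the sound wave signal, the 8 groups at zero-based positions 156, 158, 161, 162, 165, 167, 170 and 171 among the 192 groups are dequantized, giving the 8 corresponding energy data.
Likewise, quantized data in short-block form may also be selected by nearest frequency interval.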
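The short-block case follows the same arithmetic with 192 intervals instead of 576. The sketch below assumes equal-width intervals and zero-based group positions, and reproduces the group positions 156, 158, 161, 162, 165, 167, 170 and 171 given in the text; `short_block_group` and `group_value_positions` are hypothetical names.

```python
import math

BIT_FREQS_HZ = [18001.76, 18174.02, 18518.55, 18690.82,
                19035.35, 19207.62, 19552.15, 19724.41]


def short_block_group(freq_hz: float, n_groups: int = 192,
                      nyquist_hz: float = 22050.0) -> int:
    """Zero-based index of the 3-value group whose interval contains freq_hz."""
    return math.floor(freq_hz / (nyquist_hz / n_groups))


def group_value_positions(group: int) -> list[int]:
    """Positions, within the 576-value sequence, of the 3 values in a group."""
    return [3 * group, 3 * group + 1, 3 * group + 2]


print([short_block_group(f) for f in BIT_FREQS_HZ])
# [156, 158, 161, 162, 165, 167, 170, 171]
```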
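Step S12: apply an inverse discrete cosine transform to each energy datum
Figure PCTCN2021096642-appb-000089
to obtain amplitude data
Figure PCTCN2021096642-appb-000090
where z is the time slot index, z = 1, 2, 3, ….
In this embodiment, formula (1) is used to apply the inverse discrete cosine transform to each energy datum
Figure PCTCN2021096642-appb-000091
so as to obtain the amplitude data
Figure PCTCN2021096642-appb-000092
Figure PCTCN2021096642-appb-000093
where T = 18, t = 0, 1, 2, …, T-1, z = T(x-1)+t+1, and F_y is the coding constant of the energy data
Figure PCTCN2021096642-appb-000094
as predefined by the quantized compressed audio data stream.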
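Formula (1) itself is available here only as an image, so it is not reproduced. The index relation z = T(x-1)+t+1 that accompanies it can, however, be checked directly; the sketch below maps a block index x and an in-block offset t to the global time slot index z and back. The helper names are hypothetical.

```python
T = 18  # time slots produced per quantized data block, as stated in the text


def time_slot_index(x: int, t: int, slots_per_block: int = T) -> int:
    """Global time slot index z for block x (1-based) and offset t (0-based)."""
    if x < 1 or not 0 <= t < slots_per_block:
        raise ValueError("x must be >= 1 and t must satisfy 0 <= t < T")
    return slots_per_block * (x - 1) + t + 1


def block_and_offset(z: int, slots_per_block: int = T) -> tuple[int, int]:
    """Inverse mapping: recover (x, t) from a global time slot index z."""
    block, offset = divmod(z - 1, slots_per_block)
    return block + 1, offset


print(time_slot_index(29, 0), time_slot_index(29, 17))  # 505 522
```

For block x = 29 this gives z = 505 to 522, so each block contributes 18 consecutive time slots.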
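For example, the inverse discrete cosine transform is applied to each of the 8 energy data obtained,
Figure PCTCN2021096642-appb-000095
to give the corresponding amplitude data.
When the inverse discrete cosine transform is applied to the energy datum
Figure PCTCN2021096642-appb-000096
that is, for x = 29 and y = 1:
when t = 0, formula (1) gives z = 505,
Figure PCTCN2021096642-appb-000097
when t = 4, formula (1) gives z = 509,
Figure PCTCN2021096642-appb-000098
……
when t = 15, formula (1) gives z = 520,
Figure PCTCN2021096642-appb-000099
when t = 16, formula (1) gives z = 521,
Figure PCTCN2021096642-appb-000100
when t = 17, formula (1) gives z = 522,
Figure PCTCN2021096642-appb-000101
When the inverse discrete cosine transform is applied to the energy datum
Figure PCTCN2021096642-appb-000102
that is, for x = 29 and y = 2:
when t = 2, formula (1) gives z = 507,
Figure PCTCN2021096642-appb-000103
when t = 3, formula (1) gives z = 508,
Figure PCTCN2021096642-appb-000104
when t = 4, formula (1) gives z = 509,
Figure PCTCN2021096642-appb-000105
……
Proceeding in the same way, the inverse discrete cosine transform is applied to each of the 8 energy data
Figure PCTCN2021096642-appb-000106
whose frequencies are identical (or closest) to the 8 sound wave signal frequencies, giving the amplitude at each of those frequencies in each of the 18 time intervals along the time axis, as shown in Chart 1 below.
Figure PCTCN2021096642-appb-000107
Chart 1 (Note: the data in the table are illustrative only)
Blank cells indicate values omitted for the sake of illustration, and the amplitude data shown are illustrative. Frequency 1, Frequency 2, … Frequency 8 in Chart 1 denote the frequencies of the 8 selected energy data.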
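In the prior art, decoding a sound wave signal from a quantized compressed audio data stream requires, after global dequantization, reordering, alias reduction, windowed filtering, phase correction, polyphase synthesis filtering and a Fourier transform, of which polyphase synthesis filtering is especially time-consuming.
With steps S11 and S12 above, global dequantization is reduced to local dequantization, the unnecessary reordering, alias reduction, windowed filtering, phase correction and polyphase synthesis filtering steps are avoided, and the Fourier transform normally used in sound wave decoding is also eliminated. This cuts the number of steps and the amount of computation in the original decoding process and therefore markedly increases the sound wave decoding speed of interpreted languages.
For example, using the interpreted programming language JavaScript, and with reference to prior-art sound wave decoding, computer simulation programs were built for three sound wave decoding schemes. Each program performed 8 consecutive repetitions of sound wave decoding on the test sample (the quantized compressed audio data stream described in the example above), and the run time of each step was recorded, as shown in the chart below: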
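Step                              Prior art A   Prior art B   Present invention
1. Huffman decoding                    341.79        342.74              325.24
2. Dequantization                      397.75         48.88               27.85
3. Reordering                            1.07          0.39                   -
4. Alias reduction                      18.82          6.94                   -
5. Windowed hybrid filtering           539.55         87.35                   -
6. Phase correction                      7.28          5.17                   -
7. Polyphase synthesis filtering      1414.28        857.35                   -
8. Fourier transform                   389.93        387.70                   -
9. Sound code decoding                  15.10         15.63               36.02
Total run time                        2895.18       1522.68              385.40
Chart 2
Note: all values are in milliseconds. "Total run time" is a measured value; the other rows are reference measurements.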
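Prior-art scheme A: restore the compressed audio data stream to the original digital audio stream; apply a Fourier transform to that stream and then decode it to obtain the sound wave signal.
Prior-art scheme B: restore the compressed audio data stream to the original digital audio stream for a partial frequency band only; apply a Fourier transform to that stream and then decode it to obtain the sound wave signal. The partial band is the band at the same (or closest) frequencies as the sound wave signal.
The simulated decoding runs of the three programs show that the present invention saves a large amount of run time through local dequantization, and that omitting reordering, alias reduction, windowed filtering, phase correction, polyphase synthesis filtering and the Fourier transform greatly reduces the computation in the whole decoding process. The total run time falls correspondingly, and decoding efficiency improves markedly.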
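To put the totals in Chart 2 into proportion, the short sketch below computes the speed-up of the present invention over each prior-art scheme from the measured total run times. Only the three totals come from the chart; the dictionary and function names are hypothetical.

```python
# Measured total run times from Chart 2, in milliseconds.
TOTAL_MS = {
    "prior art A": 2895.18,
    "prior art B": 1522.68,
    "present invention": 385.40,
}


def speedup(baseline_ms: float, improved_ms: float) -> float:
    """How many times faster the improved run is than the baseline."""
    return baseline_ms / improved_ms


for name in ("prior art A", "prior art B"):
    ratio = speedup(TOTAL_MS[name], TOTAL_MS["present invention"])
    print(f"{name}: {ratio:.2f}x")
# prior art A: 7.51x
# prior art B: 3.95x
```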
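Step S13: perform sound wave signal decoding on the amplitude data
Figure PCTCN2021096642-appb-000108
to obtain the corresponding sound wave data.
Referring also to FIG. 2, step S13, in which sound wave signal decoding is performed on the amplitude data
Figure PCTCN2021096642-appb-000109
to obtain the corresponding sound wave data, is implemented by the following steps: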
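Step S131: select from the amplitude data
Figure PCTCN2021096642-appb-000110
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000111
and perform unit decoding on them to obtain the unit data d_1, d_2, d_3, …, d_m corresponding to the m time slots, where i ≥ 0, i is the absolute index in
Figure PCTCN2021096642-appb-000112
of the i-th of the m consecutive time slots, and j is the relative index within the m consecutive time slots, j = 1, 2, 3, …, m.
In this embodiment, selecting from the amplitude data
Figure PCTCN2021096642-appb-000113
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000114
and performing unit decoding specifically comprises: based on the amplitude data
Figure PCTCN2021096642-appb-000115
comparing the magnitudes of the amplitudes
Figure PCTCN2021096642-appb-000116
Figure PCTCN2021096642-appb-000117
in each time slot to determine the value of the bit datum b_k, and assembling the determined bit data b_1, b_3, b_5, …, b_k, …, b_{n-1} into a binary number to obtain the value of the corresponding unit datum d_j, where k is the index of the n bit signals contained in one unit signal, k = 1, 3, 5, …, n-1. In this embodiment, the amplitude data of two adjacent frequencies in the same time slot are compared, and the value of the corresponding bit datum is determined by the signal frequency of the larger amplitude.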
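Using the amplitude data obtained above, the magnitudes of each pair of adjacent amplitude data in a time slot are compared. For example, for time slot z = 1, the amplitude data in that slot comprise
Figure PCTCN2021096642-appb-000118
The two adjacent amplitude data
Figure PCTCN2021096642-appb-000119
are compared to determine the value of bit datum b_1; the amplitude data
Figure PCTCN2021096642-appb-000120
are compared to determine the value of bit datum b_3; the amplitude data
Figure PCTCN2021096642-appb-000121
Figure PCTCN2021096642-appb-000122
are compared to determine the value of bit datum b_5; and the amplitude data
Figure PCTCN2021096642-appb-000123
are compared to determine the value of bit datum b_7. The determined bit data b_1, b_3, b_5 and b_7 are then assembled into a binary number to obtain the corresponding unit datum d_1. The unit data d_2, d_3, d_4, d_5, …, d_12 for the subsequent time slots are obtained in the same way.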
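The pairwise comparison can be sketched as follows. The text does not say which outcome maps to 0 or 1, nor the bit order, so both are assumptions here: a bit is taken as 1 when the higher-frequency amplitude of the pair is larger, and b_1 is treated as the most significant bit. The function name `decode_unit` and the sample amplitudes are hypothetical.

```python
def decode_unit(amplitudes) -> int:
    """Decode one time slot of n amplitudes (ordered by frequency) into d_j.

    Assumptions: amplitudes are paired as (1, 2), (3, 4), ...; a pair yields
    bit 1 if the higher-frequency amplitude is larger, else 0; the first
    pair gives the most significant bit.
    """
    if len(amplitudes) % 2 != 0:
        raise ValueError("expected an even number of amplitudes")
    value = 0
    for k in range(0, len(amplitudes), 2):
        low_freq_amp, high_freq_amp = amplitudes[k], amplitudes[k + 1]
        bit = 1 if high_freq_amp > low_freq_amp else 0
        value = (value << 1) | bit
    return value


# Illustrative amplitudes for n = 8: the four pairs give bits 1, 0, 1, 0.
print(decode_unit([0.1, 0.9, 0.8, 0.2, 0.3, 0.7, 0.6, 0.4]))  # 10
```

With n = 8 each time slot carries 4 bits, so d_j ranges from 0 to 15 under these assumptions.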
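In another embodiment, selecting from the amplitude data
Figure PCTCN2021096642-appb-000124
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000125
and performing unit decoding specifically comprises: based on the amplitude data
Figure PCTCN2021096642-appb-000126
selecting the k frequencies with the largest amplitudes in each time slot to form a frequency set, and determining the value of the corresponding unit datum d_j from the zero-based position of that frequency set in a sequence of l preset k-element frequency sets, where 0 ≤ d_j ≤ l, and k and l are preset natural numbers.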
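For example, when z = 1, the amplitude data in that time slot comprise
Figure PCTCN2021096642-appb-000127
The 4 amplitude data with the largest amplitudes in that time slot are selected, and the frequencies corresponding to them form a frequency set, for example a_0 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.7 kHz}, i.e. k = 4.
Eight 4-element frequency sets are preset as follows, i.e. k = 4 and l = 8:
r_0 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.4 kHz},
r_1 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.5 kHz},
r_2 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.6 kHz},
r_3 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.7 kHz},
r_4 = {18.1 kHz, 18.2 kHz, 18.3 kHz, 18.8 kHz},
r_5 = {18.1 kHz, 18.3 kHz, 18.5 kHz, 18.7 kHz},
r_6 = {18.2 kHz, 18.4 kHz, 18.6 kHz, 18.8 kHz},
r_7 = {18.2 kHz, 18.3 kHz, 18.5 kHz, 18.6 kHz};
Comparing the selected frequency set a_0 with the 8 preset frequency sets above shows that a_0 is at position 3 in the sequence formed by those 8 sets, i.e. a_0 = r_3, so the value of the corresponding unit datum d_1 is 3.
In the same way, the amplitude data in each time slot are filtered and compared as described above to obtain the corresponding unit data d_2, d_3, d_4, d_5, …, d_12.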
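The frequency-set lookup can be sketched with sets and a list index. The preset sets r_0 to r_7 are taken from the example above; the amplitude values in the usage example are made up purely for illustration, and the function names are hypothetical. Because positions are counted from zero, the lookup returns values from 0 to l-1.

```python
# Preset 4-element frequency sets r_0 .. r_7 from the example, in kHz.
PRESET_SETS = [
    frozenset({18.1, 18.2, 18.3, 18.4}),
    frozenset({18.1, 18.2, 18.3, 18.5}),
    frozenset({18.1, 18.2, 18.3, 18.6}),
    frozenset({18.1, 18.2, 18.3, 18.7}),
    frozenset({18.1, 18.2, 18.3, 18.8}),
    frozenset({18.1, 18.3, 18.5, 18.7}),
    frozenset({18.2, 18.4, 18.6, 18.8}),
    frozenset({18.2, 18.3, 18.5, 18.6}),
]


def top_k_frequencies(amplitude_by_freq: dict, k: int = 4) -> frozenset:
    """Set of the k frequencies with the largest amplitudes in one time slot."""
    ranked = sorted(amplitude_by_freq, key=amplitude_by_freq.get, reverse=True)
    return frozenset(ranked[:k])


def decode_unit_by_set(amplitude_by_freq: dict, presets=PRESET_SETS,
                       k: int = 4) -> int:
    """Zero-based position of the top-k frequency set among the presets.

    Raises ValueError if the selected set matches none of the presets.
    """
    return presets.index(top_k_frequencies(amplitude_by_freq, k))


# Made-up amplitudes whose 4 strongest frequencies form a_0 = r_3.
slot = {18.1: 0.9, 18.2: 0.8, 18.3: 0.7, 18.4: 0.1,
        18.5: 0.2, 18.6: 0.1, 18.7: 0.6, 18.8: 0.05}
print(decode_unit_by_set(slot))  # 3
```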
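Step S132: perform synthesis decoding on the unit data d_1, d_2, d_3, …, d_m to obtain the sound wave data.
Specifically, according to the sound wave coding algorithm, the multi-base number formed by d_1, d_2, d_3, …, d_m is the sound wave data.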
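Synthesis decoding can be sketched as positional-notation arithmetic. The base and the digit order are not specified in the text, so this sketch assumes base 16 (4 bits per unit datum, as in the pairwise-comparison embodiment with n = 8) and treats d_1 as the most significant digit; `combine_units` is a hypothetical name.

```python
def combine_units(units, base: int = 16) -> int:
    """Combine unit data d_1 .. d_m into one number, d_1 most significant.

    Assumption: every unit datum is a digit in the given base.
    """
    value = 0
    for digit in units:
        if not 0 <= digit < base:
            raise ValueError("unit datum out of range for the chosen base")
        value = value * base + digit
    return value


# Illustrative unit data for m = 12: a 48-bit value under the base-16 assumption.
units = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3]
print(hex(combine_units(units)))  # 0x123
```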
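Referring to FIG. 3, a schematic structural diagram of a sound wave signal decoding device in an embodiment of the present invention, the device 20 comprises a decompression module 21, a selection module 22, an inverse discrete cosine transform module 23 and a decoding module 24.
The decompression module 21 decompresses the quantized compressed audio data stream to be decoded, so as to generate one or more consecutive quantized data blocks Z_x, where x is the block index, x = 1, 2, 3, …. The quantized compressed audio data stream is generated from an original audio signal by quantization and compression. One or more sound wave signals are superimposed on the original audio signal in advance; each sound wave signal consists of m unit signals concatenated over m consecutive time slots, and each unit signal consists of n bit signals superimposed in the same time slot, where m and n are preset natural numbers.
Specifically, the decompression module 21 selects one or more consecutive compressed data frames from the quantized compressed audio data stream to be decoded and decompresses each frame to generate one or more consecutive quantized data blocks Z_x. The stream to be decoded consists of many consecutive compressed data frames, each with a fixed format.
According to the coding algorithm used by the stream, the decompression module 21 compares and checks the stream byte by byte in real time to obtain one or more consecutive compressed data frames, extracts the corresponding compressed data block, decompression parameters and dequantization parameters from each frame according to the coding algorithm, and decompresses each compressed data block with the matching decompression algorithm to obtain the corresponding quantized data block Z_x. Depending on the coding algorithm, a single compressed data block may be assembled from several compressed data frames.
In this embodiment, the sound wave signal consists of 12 unit signals concatenated over 12 consecutive time slots, and each unit signal consists of 8 bit signals superimposed in the same time slot.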
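The selection module 22 selects from each quantized data block Z_x the n quantized values whose frequencies are identical or closest to the frequencies of the n bit signals, so that the resulting n*x quantized values form the quantized data
Figure PCTCN2021096642-appb-000128
and dequantizes each quantized datum
Figure PCTCN2021096642-appb-000129
to obtain the corresponding energy data
Figure PCTCN2021096642-appb-000130
where y is the identifier of the bit signal, y = 1, 2, 3, …, n.
In this embodiment, the selection module 22 selects from each quantized data block the n quantized values whose frequency ranges match those of the i-th sound wave signal. In other embodiments, the selection module 22 may instead select the quantized values at frequencies adjacent to those of the i-th sound wave signal, where an adjacent frequency is predefined as the frequency that differs from the sound wave signal frequency by the smallest amount.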
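The inverse discrete cosine transform module 23 applies an inverse discrete cosine transform to each energy datum
Figure PCTCN2021096642-appb-000131
to obtain amplitude data
Figure PCTCN2021096642-appb-000132
where z is the time slot index, z = 1, 2, 3, ….
Specifically, the inverse discrete cosine transform module 23 uses formula (1) to apply the inverse discrete cosine transform to each energy datum
Figure PCTCN2021096642-appb-000133
so as to obtain the amplitude data
Figure PCTCN2021096642-appb-000134
Figure PCTCN2021096642-appb-000135
where T = 18, t = 0, 1, 2, …, T-1, z = T(x-1)+t+1, and F_y is the coding constant of the energy data
Figure PCTCN2021096642-appb-000136
as predefined by the quantized compressed audio data stream.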
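The decoding module 24 performs sound wave signal decoding on the amplitude data
Figure PCTCN2021096642-appb-000137
to obtain the corresponding sound wave data.
Referring also to FIG. 4, the decoding module 24 specifically comprises:
a unit decoding submodule 241, which selects from the amplitude data
Figure PCTCN2021096642-appb-000138
the amplitude data of m consecutive time slots
Figure PCTCN2021096642-appb-000139
and performs unit decoding on them to obtain the unit data d_1, d_2, d_3, …, d_m corresponding to the m time slots, where i ≥ 0, i is the absolute index in
Figure PCTCN2021096642-appb-000140
of the i-th of the m consecutive time slots, and j is the relative index within the m consecutive time slots, j = 1, 2, 3, …, m.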
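In this embodiment, based on the amplitude data
Figure PCTCN2021096642-appb-000141
the unit decoding submodule 241 compares the magnitudes of the amplitudes
Figure PCTCN2021096642-appb-000142
Figure PCTCN2021096642-appb-000143
in each time slot to determine the value of the bit datum b_k, and assembles the determined bit data b_1, b_3, b_5, …, b_k, …, b_{n-1} into a binary number to obtain the value of the corresponding unit datum d_j, where k is the index of the n bit signals contained in one unit signal, k = 1, 3, 5, …, n-1.
In this embodiment, the unit decoding submodule 241 compares the amplitude data of two adjacent frequencies in the same time slot and determines the value of the corresponding bit datum from the signal frequency of the larger amplitude.
In another embodiment, based on the amplitude data
Figure PCTCN2021096642-appb-000144
the unit decoding submodule 241 selects the k frequencies with the largest amplitudes in each time slot to form a frequency set, and determines the value of the corresponding unit datum d_j from the zero-based position of that frequency set in a sequence of l preset k-element frequency sets, where 0 ≤ d_j ≤ l, and k and l are preset natural numbers.
A synthesis decoding submodule 242 performs synthesis decoding on the unit data d_1, d_2, d_3, …, d_m to obtain the sound wave data.
Specifically, according to the sound wave coding algorithm, the synthesis decoding submodule 242 takes the multi-base number formed by d_1, d_2, d_3, …, d_m as the sound wave data.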
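In the method and device for decoding a sound wave signal provided by the embodiments of the present invention, the signal frequencies of the sound wave signal are used to identify the related data in the compressed audio data stream, and only the selected data are dequantized, yielding a locally dequantized block of sound wave energy data. An inverse discrete cosine transform is then applied to that energy data block, so the reordering, alias reduction, windowed synthesis filtering, phase correction and polyphase synthesis filtering that normally follow dequantization are omitted, which reduces the amount of computation. Furthermore, the amplitude data produced by the inverse discrete cosine transform can be decoded directly, so the Fourier transform normally used in sound wave decoding is also avoided. Together these reduce the number of steps and the amount of computation in the original decoding process and increase the speed of sound wave decoding in interpreted languages.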
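In the several embodiments provided by the present invention, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection through interfaces, devices or units, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected as needed to achieve the purpose of the embodiment. In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the technical solution of the present invention may be embodied as a software product that is stored in a storage medium and includes instructions causing a computer device (a personal computer, a management server, a network device, or the like) or a processor to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
The above are merely embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of this specification and drawings, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.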

Claims (16)

  1. 一种声波信号解码方法,其特征在于,所述方法包括:
    对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;
    从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
    Figure PCTCN2021096642-appb-100001
    并对每个所述量化数据
    Figure PCTCN2021096642-appb-100002
    进行量化还原,以获得对应的能量数据
    Figure PCTCN2021096642-appb-100003
    其中,y为位元信号的标识号,且,y=1、2、3、……n;
    对每个所述能量数据
    Figure PCTCN2021096642-appb-100004
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100005
    其中,z为时序的序号,z=1、2、3、……;
    对所述幅值数据
    Figure PCTCN2021096642-appb-100006
    进行声波信号解码,以获得对应的声波数据。
  2. 根据权利要求1所述的声波信号解码方法,其特征在于,所述对每个所述能量数据
    Figure PCTCN2021096642-appb-100007
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100008
    具体包括:
    利用公式(1)对每个所述能量数据
    Figure PCTCN2021096642-appb-100009
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100010
    Figure PCTCN2021096642-appb-100011
    其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
    Figure PCTCN2021096642-appb-100012
    的编码常量。
  3. 根据权利要求1所述的声波信号解码方法,其特征在于,所述对所述幅值数据
    Figure PCTCN2021096642-appb-100013
    进行声波信号解码,以获得对应的声波数据,具体包括:
    从幅值数据
    Figure PCTCN2021096642-appb-100014
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100015
    进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
    Figure PCTCN2021096642-appb-100016
    中的绝对序号;j为连续m个时序中的相对序号,j=1、2、3、……m;
    对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
  4. 根据权利要求3所述的声波信号解码方法,其特征在于,所述从幅值数据
    Figure PCTCN2021096642-appb-100017
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100018
    进行单元解码,具体包括:
    根据所述幅值数据
    Figure PCTCN2021096642-appb-100019
    通过对比各时序中幅值
    Figure PCTCN2021096642-appb-100020
    Figure PCTCN2021096642-appb-100021
    的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……、n-1。
  5. 根据权利要求3所述的声波信号解码方法,其特征在于,所述从幅值数据
    Figure PCTCN2021096642-appb-100022
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100023
    进行单元解码,具体包括:
    根据所述幅值数据
    Figure PCTCN2021096642-appb-100024
    选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
  6. 一种声波信号解码装置,其特征在于,所述装置包括:
    解压缩处理模块,用于对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;
    筛选模块,用于从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
    Figure PCTCN2021096642-appb-100025
    并对每个所述量化数据
    Figure PCTCN2021096642-appb-100026
    进行量化还原,以获得对应的能量数据
    Figure PCTCN2021096642-appb-100027
    其中,y为位元信号的标识号,且,y=1、2、3、……n;
    离散余弦逆变换处理模块,用于对每个所述能量数据
    Figure PCTCN2021096642-appb-100028
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100029
    其中,z为时序的序号,z=1、2、3、……;以及
    解码模块,用于对所述幅值数据
    Figure PCTCN2021096642-appb-100030
    进行声波信号解码,以获得对应的声波数据。
  7. 根据权利要求6所述的声波信号解码装置,其特征在于,所述离散余弦逆变换处理模块用于利用公式(1)对每个所述能量数据
    Figure PCTCN2021096642-appb-100031
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100032
    Figure PCTCN2021096642-appb-100033
    其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
    Figure PCTCN2021096642-appb-100034
    的编码常量。
  8. 根据权利要求7所述的声波信号解码装置,其特征在于,所述解码模块体包括:
    单元解码子模块,用于从所述幅值数据
    Figure PCTCN2021096642-appb-100035
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100036
    进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
    Figure PCTCN2021096642-appb-100037
    中的绝对序号;j为连续m个时序的相对序号,j=1、2、3、……m;
    合成解码子模块,用于对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
  9. 根据权利要求8所述的声波信号解码装置,其特征在于,所述单元解码子模块用于根据所述幅值数据
    Figure PCTCN2021096642-appb-100038
    通过对比各时序中幅值
    Figure PCTCN2021096642-appb-100039
    Figure PCTCN2021096642-appb-100040
    的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……n-1。
  10. 根据权利要求8所述的声波信号解码装置,其特征在于,所述单元解码子模块用于根据所述幅值数据
    Figure PCTCN2021096642-appb-100041
    选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
  11. 一种声波信号解码方法,其特征在于,所述方法包括:
    对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元信号在同一个时序上叠加组成,m、n均为预设的自然数;
    从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频 率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
    Figure PCTCN2021096642-appb-100042
    并对每个所述量化数据
    Figure PCTCN2021096642-appb-100043
    进行量化还原,以获得对应的能量数据
    Figure PCTCN2021096642-appb-100044
    其中,y为位元信号的标识号,且,y=1、2、3、……n;
    利用公式(1)对每个所述能量数据
    Figure PCTCN2021096642-appb-100045
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100046
    其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
    Figure PCTCN2021096642-appb-100047
    的编码常量;z为时序的序号,z=1、2、3、……;
    Figure PCTCN2021096642-appb-100048
    从幅值数据
    Figure PCTCN2021096642-appb-100049
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100050
    进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
    Figure PCTCN2021096642-appb-100051
    中的绝对序号;j为连续m个时序中的相对序号,j=1、2、3、……m;
    对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
  12. 根据权利要求11所述的声波信号解码方法,其特征在于,所述从幅值数据
    Figure PCTCN2021096642-appb-100052
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100053
    进行单元解码,具体包括:
    根据所述幅值数据
    Figure PCTCN2021096642-appb-100054
    通过对比各时序中幅值
    Figure PCTCN2021096642-appb-100055
    Figure PCTCN2021096642-appb-100056
    的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……、n-1。
  13. 根据权利要求11所述的声波信号解码方法,其特征在于,所述从幅值数据
    Figure PCTCN2021096642-appb-100057
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100058
    进行单元解码,具体包括:
    根据所述幅值数据
    Figure PCTCN2021096642-appb-100059
    选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
  14. 一种声波信号解码装置,其特征在于,所述装置包括:
    解压缩处理模块,用于对待解码的音频量化压缩数据流进行实时解压缩处理,以生成一个或多个连续的量化数据块Z x;其中,x为数据块的序号,x=1、2、3、……;原始音频信号通过量化处理和压缩处理以生成所述音频量化压缩数据流,所述原始音频信号预先叠加一个或多个声波信号,每个所述声波信号由m个单元信号分别在连续的m个时序上拼接组成,每个所述单元信号由n个位元 信号在同一个时序上叠加组成,m、n均为预设的自然数;
    筛选模块,用于从每个所述量化数据块Z x中分别选取与所述n个位元信号频率对应相同频率或最近似频率的n个量化数值,将获得的n*x个所述量化数值组成量化数据
    Figure PCTCN2021096642-appb-100060
    并对每个所述量化数据
    Figure PCTCN2021096642-appb-100061
    进行量化还原,以获得对应的能量数据
    Figure PCTCN2021096642-appb-100062
    其中,y为位元信号的标识号,且,y=1、2、3、……n;
    离散余弦逆变换处理模块,用于利用公式(1)对每个所述能量数据
    Figure PCTCN2021096642-appb-100063
    进行离散余弦逆变换处理,以获得幅值数据
    Figure PCTCN2021096642-appb-100064
    其中,T=18,t=0、1、2、……T-1,z=T(x-1)+t+1,F y为所述音频量化压缩数据流所预定义的能量数据
    Figure PCTCN2021096642-appb-100065
    的编码常量;z为时序的序号,z=1、2、3、……;
    Figure PCTCN2021096642-appb-100066
    解码模块,包括:
    单元解码子模块,用于从所述幅值数据
    Figure PCTCN2021096642-appb-100067
    中选取连续m个时序的幅值数据
    Figure PCTCN2021096642-appb-100068
    进行单元解码,以获得与所述m个时序对应的单元数据d 1、d 2、d 3、……d m;其中,i≥0,i为连续m个时序中的第i个时序在
    Figure PCTCN2021096642-appb-100069
    中的绝对序号;j为连续m个时序的相对序号,j=1、2、3、……m;
    合成解码子模块,用于对所述单元数据d 1、d 2、d 3、……d m进行合成解码,以获得声波数据。
  15. 根据权利要求14所述的声波信号解码装置,其特征在于,所述单元解码子模块用于根据所述幅值数据
    Figure PCTCN2021096642-appb-100070
    通过对比各时序中幅值
    Figure PCTCN2021096642-appb-100071
    Figure PCTCN2021096642-appb-100072
    的大小,以确定位元数据b k的数值,并将确定的位元数据b 1、b 3、b 5、……、b k、……、b n-1组成二进制数,以获得对应的单元数据d j的数值;其中,k为一个单元信号所包含的n个位元信号的序号,k=1、3、5、……n-1。
  16. 根据权利要求14所述的声波信号解码装置,其特征在于,所述单元解码子模块用于根据所述幅值数据
    Figure PCTCN2021096642-appb-100073
    选取各时序中幅值最大的k个频率组成频率集合,通过比对所述频率集合在预设的l个k元频率集合组成的序列中零起始的位序号,以确定对应的单元数据d j的数值;其中,0≤d j≤l,k、l均为预设自然数。
PCT/CN2021/096642 2020-05-30 2021-05-28 一种声波信号解码的方法及装置 WO2021249205A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010480109.5 2020-05-30
CN202010480109.5A CN111816196A (zh) 2020-05-30 2020-05-30 一种声波信息的解码方法及装置

Publications (1)

Publication Number Publication Date
WO2021249205A1 true WO2021249205A1 (zh) 2021-12-16

Family

ID=72848436

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096642 WO2021249205A1 (zh) 2020-05-30 2021-05-28 一种声波信号解码的方法及装置

Country Status (2)

Country Link
CN (1) CN111816196A (zh)
WO (1) WO2021249205A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111816196A (zh) * 2020-05-30 2020-10-23 北京声连网信息科技有限公司 一种声波信息的解码方法及装置

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5161210A (en) * 1988-11-10 1992-11-03 U.S. Philips Corporation Coder for incorporating an auxiliary information signal in a digital audio signal, decoder for recovering such signals from the combined signal, and record carrier having such combined signal recorded thereon
US5319735A (en) * 1991-12-17 1994-06-07 Bolt Beranek And Newman Inc. Embedded signalling
US5687191A (en) * 1995-12-06 1997-11-11 Solana Technology Development Corporation Post-compression hidden data transport
US20070297455A1 (en) * 1998-07-29 2007-12-27 British Broadcasting Corporation Inserting auxiliary data in a main data stream
CN104700840A (zh) * 2013-12-04 2015-06-10 Vixs系统公司 用于音频编码/解码/转码的频率域中水印插入
CN108964786A (zh) * 2018-06-13 2018-12-07 厦门声连网信息科技有限公司 一种声波信号编码、解码的方法及装置
CN110011760A (zh) * 2019-04-10 2019-07-12 中山大学 一种基于声波的全双工多载波近场通信方法
CN111816196A (zh) * 2020-05-30 2020-10-23 北京声连网信息科技有限公司 一种声波信息的解码方法及装置

Also Published As

Publication number Publication date
CN111816196A (zh) 2020-10-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21821050

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21821050

Country of ref document: EP

Kind code of ref document: A1