CN1536557A - DAC voice data compression and uncompression technique - Google Patents

Publication number
CN1536557A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA031093183A
Other languages
Chinese (zh)
Inventor
梁肇新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-04-07
Filing date
2003-04-07
Publication date
2004-10-13
Application filed by Individual
Priority to CNA031093183A
Publication of CN1536557A

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a DAC audio data compression and decompression technique. It samples data using a multichannel independent coding mode, can sample at rates from 8 kHz to 1 MHz, and supports multichannel processing with up to 32 channels, which overcomes the limits that existing techniques place on the audio sampling range and on the channel processing modes and channel count. It transforms the data from the time domain to the frequency domain, analyzes the data for human-ear sensitivity based on a natural acoustic model, and applies exponent-and-mantissa coding to fuzzify the data (that is, to represent it approximately), improving sound quality so that it reaches more than 95% of the original sound. For the transformed data, it uses a data bit allocation table and a non-integer-bit compression method to raise the compression ratio. The compressed data blocks are merged to obtain a DAC file or DAC audio data stream, and data frames are then read from the DAC file for decompression and playback. The technique can be applied to multimedia devices such as computers and VCD players.

Description

DAC voice data compression and decompression technique
1. Technical field
This technique belongs to the field of high-quality audio processing. Specifically, it provides high-quality sampling and compression, together with compression and decompression of an arbitrary number of channels.
2. Technical background
The main technical parameters that determine the quality of audio compression and decompression are the sampling frequency, the sampling resolution (sampling precision), and the number of channels. Existing audio compression and decompression techniques rely on a psychoacoustic model of human hearing. They sample audio data at frequencies between 20 Hz and 44.1 kHz, with a sampling resolution of 8 or 16 bits and between 1 and 8 channels. They can handle only single-channel, two-channel, four-channel, and six-channel audio, and each of these channel configurations is processed differently, so they can neither solve the problem of multilingual playback nor process multiple channels in a single pass. To compress the data, the prior art exploits the ear's insensitivity to high-frequency sound and discards the data to which the ear is insensitive. This loses the high-frequency data and makes it impossible to sample frequencies above 48 kHz, which directly limits sound quality.
3. Summary of the invention
Existing audio compression and decompression techniques are limited in sampling frequency and in the channels they can process, and their playback quality is relatively poor. To overcome these problems and improve playback quality, the present invention provides the DAC compression and decompression technique (DAC is short for Digital Audio Compression). It supports sampling and compression at any frequency between 8 kHz and 1 MHz and multichannel processing with up to 32 channels. Compared at the same bit rate of 200K-300K (stereo), the sound quality of the present invention reaches more than 95% of the original CD sound, whereas the prior art stays below 91%.
To solve the above technical problems, the present invention provides the following technical solution:
During audio data acquisition and compression, coding is based on a natural model of sound. Sampling can take place at any frequency between 8 kHz and 1 MHz, and no frequency data is lost. When the acquired data is transformed from the time domain to the frequency domain, an exponent-and-mantissa coding analysis method is used. The exponent distinguishes clear sound from fuzzy sound, and the larger the mantissa, the more accurate the sound. This coding ensures that high-frequency sound data is not lost, while fuzzifying the mantissas limits the amount of data after coding. The invention widens the range and flexibility of the compression frequency: compression is possible anywhere in the 8 kHz to 1 MHz range, without being restricted to the commonly used frequencies such as 8K, 11.25K, 22.5K, 32K, 44.1K, and 96K (DVD). Because each channel is coded independently, multiple channels can be defined, and up to 32 channels are supported.
After the audio data has been compressed, decompression and playback proceed as follows:
First, a DAC data frame is read from the file and divided into several data groups according to the data frame markers; each data group is then divided into data blocks. Next, the data blocks are decompressed by looking up the data bit allocation table. After decompression, the data is restored step by step to obtain the final frequency-domain data. Finally, the frequency-domain data is transformed to the time domain to obtain the audio data for playback.
The beneficial effects of the present invention are: (1) It can sample and compress at any frequency between 8 kHz and 1 MHz, which greatly improves sound quality, reaching more than 95% of the original CD sound. (2) It supports audio with up to 32 channels, so multiple languages can be compressed together without needing several audio streams, which reduces system overhead. (3) It uses a non-integer-bit compression method, which greatly improves the data compression ratio without affecting sound quality.
4. Description of the drawings
Fig. 1 illustrates the overall compression process of the present invention.
Fig. 2 illustrates the channel data acquisition process.
Fig. 3 illustrates how time-domain data is converted into frequency-domain data.
Fig. 4 illustrates the transformation of the frequency-domain data.
Fig. 5 illustrates the detailed data compression process.
Fig. 6 illustrates how data blocks are merged into data frames and saved.
Fig. 7 illustrates the overall decompression process.
Fig. 8 illustrates the layout of the exponent and mantissa data.
Fig. 9 illustrates the layout of the waveform data segments.
5. Detailed implementation
The audio data acquisition, compression, and decompression processes of the present invention are as follows.
Fig. 1 gives an overview of the path from acquiring data from a sound capture device, through data transformation and compression, to merging into a DAC file or stream. First, multichannel data acquisition is carried out using the multichannel independent coding mode. Then the data is transformed from the time domain to the frequency domain, yielding frequency-domain data. Next, the data is analyzed according to the natural acoustic model and fuzzified using the exponent-and-mantissa coding analysis method. After that, the data is compressed using the non-integer-bit compression method. Finally, the processed data blocks are merged into data groups, several data groups are combined into a data frame, a synchronization marker is written, and the frame is written to a DAC file or DAC data stream. Each step is described in detail below.
Fig. 2 illustrates data acquisition. The present invention uses the multichannel independent coding mode to acquire data from multiple channels, up to a maximum of 32. Multichannel independent coding means that, for the number of channels selected by the user, the channels are acquired independently of one another and can be acquired and compressed simultaneously. If the audio carries several languages, such as Mandarin and Cantonese, a different channel can be selected at playback time so that each language plays back clearly. (The data acquisition of the prior art can capture only 1 to 8 channels and does not separate them independently, so it cannot provide multilingual playback.) Fig. 2A shows the individual channels, and Fig. 2B shows the audio data acquired from each channel.
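As a minimal sketch of independent per-channel acquisition, the Python below splits one sample buffer into separate channel buffers that can then be processed on their own. The interleaved input layout and the names `MAX_CHANNELS` and `split_channels` are assumptions made for illustration; the text only states that up to 32 channels are acquired independently.

```python
MAX_CHANNELS = 32  # upper limit stated in the text


def split_channels(interleaved, channel_count):
    """Split an interleaved sample list into one list per channel.

    Assumes (hypothetically) that samples arrive interleaved as
    ch0, ch1, ..., chN-1, ch0, ch1, ...
    """
    if not 1 <= channel_count <= MAX_CHANNELS:
        raise ValueError("channel_count must be between 1 and 32")
    if len(interleaved) % channel_count != 0:
        raise ValueError("buffer length is not a multiple of channel_count")
    # Channel c owns every channel_count-th sample, starting at offset c.
    return [interleaved[c::channel_count] for c in range(channel_count)]
```

Each returned list can be passed through the rest of the pipeline without reference to the others, which is what allows a player to pick one language channel at playback time.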
Fig. 3 illustrates the conversion from the time domain to the frequency domain. The data is divided into segments of a specified length (for example, 512 samples per segment). Discontinuities can arise between the waveform data of adjacent segments, as shown in Fig. 9, and these produce noise during playback. To solve this problem, a windowing method is applied in Fig. 3A to eliminate the differences between segments, and the time-domain data is then converted to frequency-domain data by the Fourier transform or MDCT in Fig. 3B. (The windowing method generates the values of a cosine function and combines them with the values of each data segment, thereby eliminating the differences.)
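The segment, window, and transform steps can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the window is taken to be a Hann (raised-cosine) window applied by pointwise multiplication, the transform is a naive discrete Fourier transform rather than an MDCT, and the function names are hypothetical.

```python
import cmath
import math


def cosine_window(length):
    """Hann window: a raised cosine that tapers both segment ends to zero."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / length) for i in range(length)]


def dft(segment):
    """Naive O(n^2) discrete Fourier transform, kept short for clarity."""
    n = len(segment)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(segment))
        for k in range(n)
    ]


def to_frequency_domain(samples, segment_length=512):
    """Split samples into full segments, window each one, and transform it.

    A trailing partial segment is dropped in this sketch.
    """
    window = cosine_window(segment_length)
    spectra = []
    for start in range(0, len(samples) - segment_length + 1, segment_length):
        segment = samples[start:start + segment_length]
        # Tapering the ends suppresses the jump between adjacent segments.
        windowed = [x * w for x, w in zip(segment, window)]
        spectra.append(dft(windowed))
    return spectra
```

A production encoder would use an FFT or MDCT routine with overlapping segments; the quadratic DFT here only keeps the example dependency-free.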
Fig. 4 illustrates the data transformation process. Once the frequency-domain data has been obtained, it is analyzed using the invention's natural acoustic model.
Natural acoustic model: based on the ear's sensitivity to sound at different frequencies, the data is analyzed to separate the frequencies to which the ear is highly sensitive from those to which it is less sensitive. The benefit is that data of different sensitivity can be handled differently: frequency data of high sensitivity is represented in finer detail, while frequency data of low sensitivity is discarded or compressed. This preserves high sound quality while keeping the data volume under control.
The data is first divided into segments in Fig. 4A, the segmented data is analyzed in Fig. 4B, and finally the sensitivity level of each data segment and the number of mantissa digits for its data are determined in Fig. 4C.
This analysis divides the data into regions by sound sensitivity, and the mantissa for each region is set according to that sensitivity. The more sensitive the region, the more mantissa digits are kept, so that the data is represented accurately. Less sensitive regions keep fewer mantissa digits, which saves storage space without affecting sound quality.
After the analysis, each data region is fuzzified using the exponent-and-mantissa coding method.
The procedure is as follows:
First, the exponent data segment is generated in Fig. 4D. For each value in the original data segment, the position of its most significant bit is taken as the exponent. For example, the value 13 is 1101 in binary; its most significant bit is at position 3 (counting from 0), so the exponent is 3. Traversing the whole data segment yields a new exponent data segment of the same length as the original segment.
Then, in Fig. 4E, the mantissa data is generated according to the number of mantissa digits. The position of the next significant bit below the most significant bit of each original value is taken and subtracted from the exponent to give the first mantissa value; if there is no further significant bit, the mantissa value is zero. This produces the first mantissa data segment. The next pass of the loop takes the next significant bit in the same way and produces the next mantissa data segment, and so on. Finally, in Fig. 4F, the exponent data and mantissa data are merged. The resulting data is: exponent data segment + mantissa data segment 1 + mantissa data segment 2 + ... + mantissa data segment n, where n < 32. An example illustrates the process.
For example, take the data region [13, 6, 8, 7, 9], which in binary is [1101, 0110, 1000, 0111, 1001]. First the exponents are computed, giving the exponent data segment [0011, 0010, 0011, 0010, 0011]. Two mantissas are taken here. Starting from the second-highest significant bit of each source value, the first pass yields the first mantissa data segment [0001, 0001, 0000, 0001, 0011], and the second pass yields the second mantissa data segment [0011, 0000, 0000, 0010, 0000]. Merging the segments gives the target data segment; see Fig. 8.
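The worked example can be reproduced with a short Python sketch. The function name `encode_exponent_mantissa` and the restriction to positive integers are assumptions; the text does not say how zero or negative coefficients are represented.

```python
def encode_exponent_mantissa(values, mantissa_count):
    """Return [exponent_segment, mantissa_segment_1, ..., mantissa_segment_n].

    Assumes every value is a positive integer.
    """
    exponents = []
    mantissas = [[] for _ in range(mantissa_count)]
    for value in values:
        if value < 1:
            raise ValueError("this sketch only handles positive integers")
        # Positions of the set bits, highest first; 13 = 0b1101 -> [3, 2, 0].
        positions = [p for p in range(value.bit_length() - 1, -1, -1)
                     if (value >> p) & 1]
        exponent = positions[0]
        exponents.append(exponent)
        for i in range(mantissa_count):
            if i + 1 < len(positions):
                # Mantissa = exponent minus the position of the next set bit.
                mantissas[i].append(exponent - positions[i + 1])
            else:
                # No further significant bit: the mantissa value is zero.
                mantissas[i].append(0)
    return [exponents] + mantissas


segments = encode_exponent_mantissa([13, 6, 8, 7, 9], 2)
# segments == [[3, 2, 3, 2, 3], [1, 1, 0, 1, 3], [3, 0, 0, 2, 0]]
```

Keeping fewer mantissa segments drops the lower set bits of each value, which is where the controlled loss of precision comes from.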
Fig. 5 illustrates the detailed data compression process. After fuzzification, the data must be compressed, and the invention's non-integer-bit compression method is used for this. Its advantage is that it stores as much data as possible in as few binary digits as possible. For example, five numbers that are each less than or equal to 2 can be stored in 8 binary digits, whereas the usual method needs 10. This greatly improves the data compression ratio.
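The 8-versus-10 figure follows from counting combinations: five values that each take one of three states have 3^5 = 243 combinations, which fit in 8 bits (256), or 1.6 bits per value. The sketch below checks this arithmetic. Treating a group as a single number in base (maximum value + 1) is an assumption about how the method works, and both function names are hypothetical.

```python
def packed_bits(max_value, count):
    """Bits needed to store `count` values in 0..max_value as one integer."""
    combinations = (max_value + 1) ** count
    return (combinations - 1).bit_length()


def separate_bits(max_value, count):
    """Bits needed when each value gets its own whole number of bits."""
    return max(1, max_value.bit_length()) * count


print(packed_bits(2, 5))    # 8
print(separate_bits(2, 5))  # 10
```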
Detailed process is: by in Fig. 5 A segments of source data being pressed certain-length and size of data division group recently mutually, the radix of this group is set in Fig. 5 B then, radix=group maximum number subtracts the group minimum number.Set radix, look into the storage bit number that the data bit allocation table comes specified data by Fig. 5 C.The data bit allocation table is a kind of data list structure that we set, and it stipulates data bits and the data number that each radix is shared, represents that such as: the list item of radix 25 numbers between the 0-2 store with 8 binary digits.Have 32 list items.
For example: top data segment is [3,2,3,3,1,1,0,1,3,3,0,0,2,0] through exponent mantissa coding back, and we can be divided into two groups to it, are respectively [3,3,3,3,3] and [2,2,2,1,1,1,0,0,0,0].First group radix is that 3, the second groups radix is 2.
After grouping, the data is merged in Fig. 5D according to its number of storage bits, as follows:
A buffer is set up, and the data of each group is read in one value at a time. The radix of the group is checked first, and the table lookup on the radix gives the number of storage bits and the number of values stored. The buffer is then checked to see whether it holds that many values. If it does, those values are merged into a single binary number, and the table entry's index and the group marker are written with it; this forms a compressed data group, and the buffer is then cleared. Otherwise, the value is added to the buffer and the next value is read. These steps are repeated until all the data has been processed.
For the first group, whose radix is 3, the table lookup shows that the entry for radix 3 stores 5 numbers in the range 0-3 in 9 binary digits, so this group is merged into one 9-bit binary number. For the second group, whose radix is 2, the lookup shows that the entry for radix 2 stores 5 numbers in the range 0-2 in 8 binary digits. Because this group contains ten numbers, it is merged in two passes of 5 numbers each, forming two 8-bit binary numbers. All the compressed data groups are then merged to form a target data block.
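One way to merge several small values into a single binary number is to treat them as the digits of a number in base (radix + 1). The sketch below does this for the radix-2 group; it is an assumed reading of the merge step, and `pack_group` and its digit order (first value most significant) are hypothetical.

```python
def pack_group(values, radix):
    """Merge values in 0..radix into one integer, first value most significant."""
    base = radix + 1
    packed = 0
    for value in values:
        if not 0 <= value <= radix:
            raise ValueError("value outside 0..radix")
        # Shift the accumulated digits up one place and append this value.
        packed = packed * base + value
    return packed


# The ten-value radix-2 group is merged in two passes of five values each.
first = pack_group([2, 2, 2, 1, 1], 2)   # 238
second = pack_group([1, 0, 0, 0, 0], 2)  # 81
```

Both results are below 256 and so fit in 8 binary digits. Under this plain base-(radix + 1) assumption, five values in the range 0-3 would need 10 bits rather than the 9 quoted for the radix-3 table entry, so the actual allocation table presumably relies on details that the text does not spell out.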
Fig. 6 illustrates the data merging process. The operations above produce target data blocks, and the number of target data blocks depends on the number of channels selected. In Fig. 6A, several target data blocks are merged into a data group. In Fig. 6B, several data groups are combined and a synchronization marker is written, forming a data frame. In Fig. 6C, the data frame is saved to a DAC file or DAC stream. This completes DAC compression.
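The block, group, and frame nesting can be illustrated with a small serialization sketch. The byte layout here is entirely hypothetical: the text does not give the value of the synchronization marker, the form of the group and block markers, or any field widths, so `SYNC`, the length prefixes, and both function names are assumptions chosen only to show the structure and its inverse.

```python
import struct

SYNC = b"DACF"  # hypothetical sync marker; the real value is not given


def build_frame(groups):
    """Serialize data groups (each a list of data blocks as bytes) into a frame."""
    body = bytearray(struct.pack("<H", len(groups)))
    for blocks in groups:
        body += struct.pack("<H", len(blocks))
        for block in blocks:
            # Length prefix stands in for the block marker.
            body += struct.pack("<I", len(block)) + block
    return SYNC + bytes(body)


def split_frame(frame):
    """Inverse of build_frame: recover the groups and their blocks."""
    if frame[:len(SYNC)] != SYNC:
        raise ValueError("missing sync marker")
    offset = len(SYNC)
    (group_count,) = struct.unpack_from("<H", frame, offset)
    offset += 2
    groups = []
    for _ in range(group_count):
        (block_count,) = struct.unpack_from("<H", frame, offset)
        offset += 2
        blocks = []
        for _ in range(block_count):
            (size,) = struct.unpack_from("<I", frame, offset)
            offset += 4
            blocks.append(frame[offset:offset + size])
            offset += size
        groups.append(blocks)
    return groups
```

The sync marker lets a decoder find frame boundaries in a stream, and the nested counts let it walk from frame to group to block when disassembling.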
Fig. 7 illustrates the decompression process, which is the inverse of the compression technique of the present invention. First, a DAC frame is read from the file and divided into several data groups according to the data frame markers, and each data group is divided into data blocks. The data blocks are then decompressed by looking up the data bit allocation table. After decompression, the data is restored step by step, yielding the final frequency-domain data. Finally, the frequency-domain data is transformed to the time domain, giving the audio data for playback. Each operation is described in detail below.
Fig. 7A, data frame disassembly: a data frame is first read from the DAC file and divided into several data groups according to the data frame markers. Each data group is then split into target data blocks according to the block markers.
Fig. 7B, data decompression: the compressed data groups are read one by one from the target data block, and the index of the data bit allocation table entry recorded in each group is used for a table lookup, which gives the number of values and the radix of that compressed data group. The merged binary number is then decomposed into the individual values by repeated division; the number of divisions depends on the number of values in the compressed data group.
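The repeated-division step can be sketched as below. It assumes the merged number was formed as a base-(radix + 1) integer with the first value as the most significant digit; that ordering and the name `unpack_group` are assumptions, not details given in the text.

```python
def unpack_group(packed, radix, count):
    """Split one merged integer back into `count` values in 0..radix."""
    base = radix + 1
    values = []
    for _ in range(count):
        # Each division peels off the least significant digit.
        packed, digit = divmod(packed, base)
        values.append(digit)
    values.reverse()  # restore the original order
    return values


print(unpack_group(238, 2, 5))  # [2, 2, 2, 1, 1]
```

The loop runs exactly `count` times, matching the statement that the number of divisions depends on how many values the compressed data group holds.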
Fig. 7C, data restoration: data decompression yields a data segment containing exponent data and mantissa data, which must now be restored. The data shown in Fig. 8 is used to explain the restoration.
First, the values of the exponent data segment are read in turn, and a new value is generated from each one: new value = 2^X, where X is the value from the exponent data segment. This gives the target data segment [8,4,8,4,8]. The first mantissa segment is then read. Note that mantissa data is generated differently from exponent data: when the mantissa is not zero, new value = 2^(exponent - mantissa); when the mantissa is zero, the new value is zero. This gives the mantissa target data segment [4,2,0,2,1]. Adding it to the target data segment gives the new target data segment [12,6,8,6,9]. The next mantissa data segment is read and processed the same way, giving the mantissa target data segment [1,0,0,1,0]. Adding this to the target data segment gives [13,6,8,7,9]. This is the final result of the transformation, and it can be compared with the fuzzification process described above.
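The restoration arithmetic can be checked with a short Python sketch. The function name and the input layout (exponent segment first, then the mantissa segments) are assumptions for illustration.

```python
def decode_exponent_mantissa(segments):
    """Rebuild values from [exponent_segment, mantissa_segment_1, ...]."""
    exponents, mantissa_segments = segments[0], segments[1:]
    # Each exponent X contributes 2^X.
    result = [2 ** exponent for exponent in exponents]
    for segment in mantissa_segments:
        for i, (exponent, mantissa) in enumerate(zip(exponents, segment)):
            # A zero mantissa contributes nothing; otherwise add 2^(exponent - mantissa).
            if mantissa != 0:
                result[i] += 2 ** (exponent - mantissa)
    return result


exponents = [3, 2, 3, 2, 3]
print(decode_exponent_mantissa([exponents]))                                     # [8, 4, 8, 4, 8]
print(decode_exponent_mantissa([exponents, [1, 1, 0, 1, 3]]))                    # [12, 6, 8, 6, 9]
print(decode_exponent_mantissa([exponents, [1, 1, 0, 1, 3], [3, 0, 0, 2, 0]]))   # [13, 6, 8, 7, 9]
```

The middle result shows what a decoder would recover if only one mantissa segment had been kept: 13 becomes 12 and 7 becomes 6, while the other values are exact.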
Fig. 7D, conversion from the frequency domain to the time domain: data restoration yields the frequency-domain data segments. The standard method is used, namely the inverse Fourier transform or inverse MDCT, to convert the frequency-domain data into time-domain data.
Finally, in Fig. 7E, the time-domain data is sent to the channels and can be played through the sound device.

Claims (8)

1. A DAC voice data compression and decompression technique that supports sampling and compression at any frequency between 8 kHz and 1 MHz and multichannel processing with up to 32 channels, and whose sound quality, compared at the same bit rate of 200K-300K (stereo), reaches more than 95% of the original CD sound. The technique comprises the following steps:
acquiring data from multiple channels using a multichannel independent coding mode;
transforming the data from the time domain to the frequency domain to obtain frequency-domain data;
analyzing the data based on a natural acoustic model and fuzzifying it using the exponent-and-mantissa coding method;
compressing the data using a non-integer-bit compression method;
merging the compressed data blocks to obtain a DAC file or DAC audio data stream;
reading data frames from the DAC file and decompressing them for playback.
2. The method of claim 1, wherein the step of "acquiring data from multiple channels using a multichannel independent coding mode" comprises the step of: a multichannel audio capture device acquiring audio data on the number of channels selected by the user, with the channels acquired simultaneously and independently of one another.
3. The method of claim 1, wherein the step of "analyzing the data based on a natural acoustic model and fuzzifying it using the exponent-and-mantissa coding method" comprises the steps of:
analyzing the data based on the natural acoustic model;
fuzzifying the data using the exponent-and-mantissa coding method.
4. The method of claim 3, wherein the step of "analyzing the data based on the natural acoustic model" further comprises the steps of:
dividing the data into segments;
analyzing the data segments;
determining the sensitivity level of each data segment according to the ear's sensitivity to sound.
5. The method of claim 3, wherein the step of "fuzzifying the data using the exponent-and-mantissa coding method" further comprises the steps of:
generating exponent data from the original data segment;
generating mantissa data according to the sensitivity level of the data segment;
merging the exponent data segment and the mantissa data segments;
obtaining the transformed data block.
6. The method of claim 1, wherein the step of "compressing the data using a non-integer-bit compression method" comprises the steps of:
grouping the data so that values of similar size fall in the same group;
generating the radix of each group of data;
looking up the data bit allocation table according to the radix of each group;
merging the data according to the number of storage bits and the number of stored values given by the data bit allocation table.
7. The method of claim 1, wherein the step of "merging the compressed data blocks to obtain a DAC file or DAC audio data stream" comprises the steps of:
merging data blocks into a data group, the number of data blocks depending on the number of sampled channels;
generating the target data group;
combining several data groups into a data frame;
generating the target data frame;
saving the target data frame to the DAC file.
8. The method of claim 1, wherein the step of "reading data frames from the DAC file and decompressing them for playback" comprises the steps of:
reading a data frame from the DAC file and disassembling it;
decompressing the data by inverting the non-integer-bit compression method;
restoring the data segment that contains the exponent data and mantissa data;
transforming the data from the frequency domain to the time domain;
sending the time-domain data to the channels for playback through the sound device.
CNA031093183A 2003-04-07 2003-04-07 DAC voice data compression and uncompression technique Pending CN1536557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA031093183A CN1536557A (en) 2003-04-07 2003-04-07 DAC voice data compression and uncompression technique

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA031093183A CN1536557A (en) 2003-04-07 2003-04-07 DAC voice data compression and uncompression technique

Publications (1)

Publication Number Publication Date
CN1536557A true CN1536557A (en) 2004-10-13

Family

ID=34319287

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA031093183A Pending CN1536557A (en) 2003-04-07 2003-04-07 DAC voice data compression and uncompression technique

Country Status (1)

Country Link
CN (1) CN1536557A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103021415A (en) * 2012-12-12 2013-04-03 青岛天信通软件技术有限公司 Digital-to-analog converter (DAC) voice data compression and uncompression technique
CN105978611A (en) * 2016-05-12 2016-09-28 京信通信系统(广州)有限公司 Frequency domain signal compression method and device
CN105978611B (en) * 2016-05-12 2019-09-17 京信通信系统(中国)有限公司 A kind of frequency-region signal compression method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication