CN109243471B - Method for quickly coding digital audio for broadcasting - Google Patents
- Publication number
- CN109243471B (application CN201811124426.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- audio
- pcm
- frame
- pcm data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
Abstract
The invention discloses a method for rapidly encoding digital audio for broadcasting, comprising the following steps: S1, converting the source audio file into PCM data; S2, dividing the PCM data into a plurality of blocks and labeling each block in sequence; S3, sending the segmented data to a CPU or a GPU for parallel encoding; and S4, merging the encoded audio data in label order to generate the final digital audio file. In this scheme the data are segmented and then sent to the processor for parallel processing. Because current processors are almost all multi-core, each core can process one block in its own thread concurrently, which effectively increases processing speed. The scheme is suited to encoding and decoding large audio data files, such as those used in broadcasting.
Description
Technical Field
The invention relates to the technical field of digital audio encoding and decoding, and in particular to a method for rapidly encoding digital audio for broadcasting that supports parallel processing.
Background
Audio data files for broadcasting must be encoded and decoded. When an audio file is large, for example 24 hours of audio, encoding and decoding take a long time and processor utilization is low.
Disclosure of Invention
The invention mainly addresses the technical problem that audio file encoding in the prior art is time-consuming and inefficient, and provides a method for rapidly encoding digital audio for broadcasting that makes full use of a multi-core CPU or a GPU and achieves high processor utilization.
The invention solves this technical problem mainly through the following technical scheme: a method for rapidly encoding digital audio for broadcasting, comprising the following steps:
S1, converting the source audio file into PCM data;
S2, dividing the PCM data into a plurality of blocks, and labeling each block in sequence;
S3, sending the segmented data to a CPU or a GPU for parallel encoding;
S4, merging the encoded audio data in label order to generate the final digital audio file.
In this scheme the data are segmented and then sent to the processor for parallel processing. Because current processors are almost all multi-core, each core can process one block in its own thread concurrently, which effectively increases processing speed.
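The four steps can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: `encode_block` is a hypothetical stand-in for a real encoder call, and the function names and the use of a thread pool are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_labeled(pcm: bytes, chunk_size: int) -> list[tuple[int, bytes]]:
    """S2: cut the PCM bytes into (label, block) pairs, labels in source order."""
    return [(label, pcm[offset:offset + chunk_size])
            for label, offset in enumerate(range(0, len(pcm), chunk_size))]


def encode_block(block: bytes) -> bytes:
    """Hypothetical placeholder for a real encoder; it only drops every other byte."""
    return block[::2]


def parallel_encode(pcm: bytes, chunk_size: int, workers: int) -> bytes:
    labeled = split_labeled(pcm, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # S3: submit every block; blocks may finish in any order.
        futures = {label: pool.submit(encode_block, block)
                   for label, block in labeled}
    # S4: merge strictly in label order, regardless of completion order.
    return b"".join(futures[label].result() for label in sorted(futures))
```

In CPython, threads run in parallel only while the encoder releases the global interpreter lock, as native libraries typically do; a process pool is an alternative when the encoder is pure Python.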
Preferably, step S1 is specifically:
S101, judging whether the source audio is PCM data; if so, jumping to step S103, otherwise proceeding to step S102;
S102, decoding the source audio data to generate PCM data, then proceeding to step S103;
S103, judging whether the sampling rate, bit depth, and channel count of the PCM data match those of the target MP2 audio; if any parameter differs, proceeding to step S104, and if all match, proceeding to step S2. The sampling rate, bit depth, and channel count of the target MP2 audio are parameters entered manually or set by program default before encoding;
S104, resampling and requantizing the source data, then proceeding to step S2. The resampled and requantized data are PCM data, so no further decoding is required.
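The branching in S101 to S104 can be sketched as a small planning function. The `PcmFormat` type, the `plan_s1` name, and the step labels are hypothetical names chosen for this illustration; only the decision logic comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmFormat:
    sample_rate: int  # Hz
    bit_depth: int    # bits per sample
    channels: int


def plan_s1(source_is_pcm: bool, pcm: PcmFormat, target: PcmFormat) -> list[str]:
    """Return the ordered operations S1 performs before handing over to S2.

    `pcm` describes the source once it is in PCM form (after decoding, if any).
    """
    steps = []
    if not source_is_pcm:                    # S101 -> S102
        steps.append("decode")
    if pcm != target:                        # S103: any mismatch -> S104
        steps.append("resample_requantize")
    steps.append("split")                    # continue with S2
    return steps
```

Comparing the three parameters as one value keeps the "any one differs" rule from S103 in a single expression.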
Preferably, in step S1, the source audio is any audio data that can be decoded into PCM data by the general-purpose decoding of an open-source library such as ffmpeg or libav.
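As a hedged illustration of this decoding step, the sketch below builds an ffmpeg command line that decodes an input file into raw 16-bit little-endian PCM. The helper name `ffmpeg_decode_args`, the file names, and the default parameters are assumptions for the example; the patent does not prescribe a particular invocation.

```python
def ffmpeg_decode_args(src: str, dst: str,
                       sample_rate: int = 48000, channels: int = 2) -> list[str]:
    """Build an ffmpeg argument list that writes headerless s16le PCM to `dst`."""
    return [
        "ffmpeg",
        "-i", src,                 # input file in any format ffmpeg can decode
        "-f", "s16le",             # raw output container, no header
        "-acodec", "pcm_s16le",    # 16-bit signed little-endian samples
        "-ar", str(sample_rate),   # resample to the target rate if needed
        "-ac", str(channels),      # remix to the target channel count if needed
        dst,
    ]


# Example (requires ffmpeg on PATH; file names are hypothetical):
# import subprocess
# subprocess.run(ffmpeg_decode_args("source.mp3", "source.pcm"), check=True)
```

Because `-ar` and `-ac` are passed explicitly, this one call also covers the resampling and channel conversion of S104 when the source parameters differ from the target.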
Preferably, the size S_chunk of each block of data is determined by the following set of equations:
where P is the minimum period value of frame padding, C_f is the number of samples contained in one frame, N_bitdepth is the bit depth, N_channel is the number of channels, ceil(float) is the round-up function, S_pcm is the total size of the PCM data, and S_frame is the data size of one frame.
C_f: fixed at 384 samples per frame for MP1 and 1152 samples per frame for MP2.
N_bitdepth: the bit depth is a parameter entered manually or set by program default before encoding; the usual default is 16 bits.
N_channel: the number of channels is a parameter entered manually or set by program default before encoding; it is typically stereo, i.e. 2 channels.
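The equations themselves are not shown in this text, so the sketch below is only one plausible reading of the symbol definitions, not the patent's own formula set. It assumes that S_frame is the PCM byte size of one frame (C_f samples × bytes per sample × channels) and that each block is rounded up to a whole multiple of P frames so that block boundaries fall on padding-period boundaries. The function names are hypothetical.

```python
def frame_bytes(c_f: int, n_bitdepth: int, n_channel: int) -> int:
    """Assumed S_frame: PCM bytes that feed one encoded frame."""
    return c_f * (n_bitdepth // 8) * n_channel


def chunk_bytes(s_pcm: int, n: int, p: int, s_frame: int) -> int:
    """Assumed S_chunk: S_pcm / N rounded up to a multiple of P * S_frame."""
    unit = p * s_frame
    blocks_of_unit = -(-s_pcm // (n * unit))   # integer ceil, avoids float error
    return blocks_of_unit * unit


# Values from the test case in this document: 48 kHz, 16 bit, stereo, one hour.
s_frame = frame_bytes(1152, 16, 2)              # 4608 bytes per MP2 frame
s_chunk = chunk_bytes(691_200_000, 4, 1, s_frame)
```

With these inputs and 4 threads, every block holds exactly 37,500 frames; when S_pcm does not divide evenly, only the last block is shorter, which matches the remark that the last block is not always equal to S_chunk.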
Preferably, the minimum period value P of frame padding is determined by the following set of equations:
where R_b is the bit rate, S_s is the number of bytes occupied by one slot, f_s is the sampling rate, and gcd(number_1, number_2) is the function returning the greatest common divisor of number_1 and number_2.
R_b: the bit rate can be taken from a parameter entered manually or by program before encoding, or calculated from parameters such as the number of samples per frame, the sampling rate, and the number of channels; it is generally treated as an input.
S_s: one slot occupies 4 bytes for MP1 and 1 byte for MP2.
f_s: the sampling rate is a parameter entered manually or by program before encoding.
Preferably, the minimum period value P of frame padding is determined by the following formula:
where R_b is the bit rate, S_s is the number of bytes occupied by one slot, f_s is the sampling rate, and lcm(number_1, number_2) is the function returning the least common multiple of number_1 and number_2.
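Since the equations are not shown here, the following is a sketch derived from the standard MPEG-1 audio frame-length relation rather than the patent's own formulas. It assumes one frame occupies C_f × R_b / (8 × S_s × f_s) slots and takes P as the smallest number of frames whose total length is a whole number of slots. The gcd and lcm forms below are algebraically equivalent; the function names are hypothetical, and `math.lcm` requires Python 3.9 or later.

```python
import math


def padding_period_gcd(c_f: int, r_b: int, s_s: int, f_s: int) -> int:
    """P via gcd: denominator of the reduced slots-per-frame fraction."""
    num, den = c_f * r_b, 8 * s_s * f_s
    return den // math.gcd(num, den)


def padding_period_lcm(c_f: int, r_b: int, s_s: int, f_s: int) -> int:
    """P via lcm: the same quantity written with the least common multiple."""
    num, den = c_f * r_b, 8 * s_s * f_s
    return math.lcm(num, den) // num


# MP2 (C_f = 1152, S_s = 1) at 256 kbit/s and 48 kHz: frames are exactly 768 bytes.
p_48k = padding_period_gcd(1152, 256_000, 1, 48_000)
# MP2 at 128 kbit/s and 44.1 kHz: frame length is fractional, so padding is periodic.
p_44k = padding_period_gcd(1152, 128_000, 1, 44_100)
```

Under these assumptions P is 1 at 48 kHz (no padding is ever needed) and 49 at 44.1 kHz, which is why block boundaries aligned to P frames leave every block starting at the same point of the padding cycle.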
The invention has the substantial effects of shortening the encoding and decoding process and improving the utilization rate of the processor.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The technical scheme of the invention is described in further detail below through an embodiment and with reference to the accompanying drawing.
Embodiment: the method for rapidly encoding digital audio for broadcasting of this embodiment comprises the following steps:
Step one, decoding into PCM data: decode the source audio file into PCM data, then proceed to step two.
Step two, dividing the PCM data into N blocks: calculate the minimum period value P of frame padding according to formula set 1 or formula set 2. Then divide the PCM data into N blocks (corresponding to N threads) according to formula set 3, each block having size S_chunk (the last block is not always equal to S_chunk). After labeling the blocks in sequence, proceed to step three.
The algorithm formulas are as follows:
Formula set 1:
Formula set 2:
Formula set 3:
The symbols in the formulas are:
P: the minimum period value of frame padding.
lcm(number_1, number_2): mathematical function returning the least common multiple of number_1 and number_2.
gcd(number_1, number_2): mathematical function returning the greatest common divisor of number_1 and number_2.
f_s: the sampling rate, in Hz; typical values are 32 kHz, 44.1 kHz, and 48 kHz.
C_f: the number of samples contained in one frame.
S_s: the number of bytes occupied by one slot (1 byte = 8 bits).
R_b: the bit rate, in bit/s.
S_pcm: the total size of the PCM data, in bytes.
N_bitdepth: the bit depth, for example 8, 16, 24, or 32 bits.
N_channel: the number of channels, for example 1 for mono and 2 for stereo.
ceil(float): mathematical function that rounds up.
S_frame: the data size of one frame, in bytes.
S_chunk: the data size of one PCM block, in bytes.
N: the number of threads opened for parallel processing. N lies in the range [1 × processor core count, 1.5 × processor core count], which allows CPU utilization to reach 100%.
*: the multiplication operator.
%: the modulus operator.
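The range given for N can be computed from the core count. This is a minimal sketch; the function name is hypothetical, and rounding the upper bound down to an integer is an assumption, since the text does not say how a fractional bound is handled.

```python
import os
from typing import Optional


def thread_count_range(cores: Optional[int] = None) -> tuple[int, int]:
    """Return (lowest, highest) N for the range [1 x cores, 1.5 x cores]."""
    if cores is None:
        cores = os.cpu_count() or 1   # os.cpu_count() may return None
    return cores, int(cores * 1.5)    # upper bound rounded down (assumption)
```

For the 4-core processor used in the test below, this gives N between 4 and 6.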
Step three, parallel encoding and merging: send the blocks to a CPU or a GPU for parallel encoding, then merge the encoded audio data in label order (the merging is also multithreaded) to generate the final digital audio file.
Fig. 1 is a flowchart of the present embodiment.
Example:
Test: transcoding original PCM audio data into MP2.
(1) System environment:
Operating system: Windows Server 2008 R2 x64 SP1
Processor: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz
Memory (RAM): 12 GB
(2) Source audio:
PCM, 48000 Hz, 16 bit, 2 channels, duration 1:00:00, 691,200,000 bytes.
(3) Target audio:
MP2, 48000 Hz, 256000 bit/s, 2 channels, duration 1:00:00, 115,200,000 bytes.
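The two file sizes quoted above follow directly from the stated parameters, as this short check shows. The function names are hypothetical; the arithmetic uses only the figures given in the test description.

```python
def pcm_bytes(sample_rate: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Uncompressed PCM size: samples/s x bytes/sample x channels x duration."""
    return sample_rate * (bit_depth // 8) * channels * seconds


def encoded_bytes(bit_rate: int, seconds: int) -> int:
    """Constant-bit-rate output size: bits/s x duration, converted to bytes."""
    return bit_rate * seconds // 8


source_size = pcm_bytes(48_000, 16, 2, 3600)   # one hour of source PCM
target_size = encoded_bytes(256_000, 3600)     # one hour of MP2 at 256 kbit/s
```

The source is exactly six times the size of the target, i.e. 150,000 MP2 frames of 4608 PCM bytes in and 768 encoded bytes out.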
Test results:
(1) During the test, the test PC was also running other resource-intensive programs such as development tools, so the results serve only to show that multithreaded encoding is markedly faster than single-threaded encoding.
(2) In addition, different audio codecs package their output differently. Taking ffmpeg as an example, when the PCM data of each block (except the first) is encoded, the first encoded frame must either be re-encoded or be handled by taking one extra frame of data and then discarding that frame.
(3) GPU encoding: taking the NVIDIA GeForce GTX 1080 Ti graphics card as an example, it has 3584 cores and a boost clock of 1582 MHz, so 3584 threads can be opened simultaneously for parallel processing. In theory, the speedup on a GPU would therefore be even more pronounced.
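Note (2) can be read in more than one way. The sketch below illustrates one possible reading of "take one more frame and discard it": every block except the first is given one extra PCM frame of lead-in taken from the end of the previous block, and the first encoded frame of that block is then dropped before merging. The function name, the tuple layout, and this interpretation are assumptions, not something the text confirms.

```python
def block_ranges(s_pcm: int, s_chunk: int, s_frame: int) -> list[tuple[int, int, int, int]]:
    """Return (label, read_start, read_end, frames_to_discard) per block.

    Assumes s_chunk is a whole multiple of s_frame and at least one frame long.
    """
    ranges = []
    for label, start in enumerate(range(0, s_pcm, s_chunk)):
        lead_in = s_frame if label > 0 else 0   # the first block needs no lead-in
        ranges.append((label,
                       start - lead_in,            # read one frame earlier
                       min(start + s_chunk, s_pcm),
                       1 if label > 0 else 0))     # drop the lead-in's encoded frame
    return ranges
```

The extra frame costs one additional frame of encoding work per block, which is negligible next to blocks of tens of thousands of frames.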
Related terms
PCM (pulse-code modulation): a method of digitizing analog signals.
Bitstream: in GB/T 17191, the bitstream is the encoded representation of an audio signal.
Encoding: the process by which information is converted from one form or format to another. In GB/T 17191, encoding is a process, not specified by the standard, that reads a stream of input audio samples and produces a valid bitstream as defined in GB/T 17191.
Decoding: the inverse of encoding. In GB/T 17191, it is the defined process that reads the encoded bitstream and produces decoded audio sample values.
Channel: the number of channels is the number of sound sources when sound is recorded, or the number of corresponding loudspeakers when it is played back.
Sampling rate: also called sampling speed or sampling frequency, it is the number of samples per second extracted from a continuous signal to form a discrete signal, expressed in hertz (Hz). The usual symbol is f_s.
Bit rate: also called code rate, it is the number of bits transmitted or processed per unit time, expressed in bits per second (bit/s or bps). It can be denoted by the symbol R_b.
Bit depth: in PCM digital audio, the bit depth is the number of bits of information in each sample, which corresponds directly to the resolution of each sample. Examples include 16 bits per sample for compact disc audio, and up to 24 bits per sample for DVD-Audio and Blu-ray Disc. The term is essentially synonymous with quantization precision, quantization bits, sampling precision, and sampling bits. Bit depth is meaningful only for PCM digital signals; non-PCM formats (for example, lossy compression formats) have no associated bit depth.
Layer: in GB/T 17191, a layer is one of the levels in the coding hierarchy of the audio system.
Audio access unit: in GB/T 17191, for Layers I and II, an audio access unit is defined as the smallest part of the encoded bitstream that can be decoded by itself, where "decoded" means "fully reconstructed sound".
Frame: in GB/T 17191, the part of the audio signal that corresponds to the audio PCM samples from one audio access unit. The number of samples contained in one frame can be denoted by the symbol C_f.
Slot: in GB/T 17191, a slot is an elementary part of the bitstream. In Layer I, one slot is 4 bytes; in Layer II, one slot is 1 byte. The number of bytes in one slot can be denoted by the symbol S_s.
Padding: in GB/T 17191, a method of adjusting the average time length of an audio frame to match the duration of the corresponding PCM samples by conditionally adding a slot to the audio frame.
Least common multiple (lcm): abbreviation of "lowest common multiple". lcm(number_1, number_2, ..., number_n) is the least common multiple of number_1, number_2, ..., number_n.
Greatest common divisor (gcd): abbreviation of "greatest common divisor", also known as the greatest common factor (gcf). gcd(number_1, number_2, ..., number_n) is the greatest common divisor of number_1, number_2, ..., number_n.
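The padding definition above can be made concrete with a short simulation. This is a sketch of one bookkeeping scheme that yields the correct average frame length; it is not claimed to be the exact procedure in GB/T 17191, and the function name is hypothetical. It assumes a frame occupies C_f × R_b / (8 × S_s × f_s) slots on average and adds one slot whenever the accumulated fractional remainder reaches a whole slot.

```python
def frame_sizes(c_f: int, r_b: int, s_s: int, f_s: int, n_frames: int) -> list[int]:
    """Byte length of each of the first n_frames frames, padding included."""
    num, den = c_f * r_b, 8 * s_s * f_s
    base_slots, remainder = divmod(num, den)   # whole slots and leftover per frame
    sizes, acc = [], 0
    for _ in range(n_frames):
        acc += remainder
        pad = 1 if acc >= den else 0           # a full slot of debt has built up
        acc -= pad * den
        sizes.append((base_slots + pad) * s_s)
    return sizes


# MP2 at 128 kbit/s and 44.1 kHz: frames alternate between 417 and 418 bytes.
sizes_44k = frame_sizes(1152, 128_000, 1, 44_100, 49)
# MP2 at 256 kbit/s and 48 kHz: every frame is 768 bytes, so no padding occurs.
sizes_48k = frame_sizes(1152, 256_000, 1, 48_000, 5)
```

At 44.1 kHz the pattern of padded and unpadded frames repeats every 49 frames, totalling 20,480 bytes per period, which is the kind of period the minimum period value P describes.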
The scheme imposes no strict format requirement on the source audio. Any commonly used audio format can be decoded into PCM data by the general-purpose decoding of an open-source library such as ffmpeg or libav.
Whether the decoded PCM data need resampling and requantization is judged as follows: the sampling rate, bit depth, and channel count of the PCM data are compared with those of the target MP2 audio. If all of them match, resampling and requantization are not needed; if any of them differs, they are needed.
Parallel encoding: N threads are started for parallel processing and the PCM data are divided into N blocks, each of size S_chunk (the last block is not always equal to S_chunk). MP2 encoding itself is general-purpose encoding and can be performed with an open-source library such as ffmpeg or libav.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Although terms such as PCM, block, and frame are used frequently herein, the use of other terms is not excluded. These terms are used only to describe and explain the essence of the invention more conveniently; construing them as imposing any additional limitation would be contrary to the spirit of the invention.
Claims (5)
1. A method for rapidly encoding digital audio for broadcasting, comprising the following steps:
S1, converting the source audio file into PCM data;
S2, dividing the PCM data into a plurality of blocks, and labeling each block in sequence;
S3, sending the segmented data to a CPU or a GPU for parallel encoding;
S4, merging the encoded audio data in label order to generate the final digital audio file;
wherein, in step S2, the size S_chunk of each block of data is determined by the following set of equations:
wherein P is the minimum period value of frame padding, C_f is the number of samples contained in one frame, N_bitdepth is the bit depth, N_channel is the number of channels, ceil(float) is the round-up function, S_pcm is the total size of the PCM data, and S_frame is the data size of one frame.
2. The method of claim 1, wherein step S1 is specifically:
S101, judging whether the source audio is PCM data; if so, jumping to step S103, otherwise proceeding to step S102;
S102, decoding the source audio data to generate PCM data, then proceeding to step S103;
S103, judging whether the sampling rate, bit depth, and channel count of the PCM data match those of the target MP2 audio; if any parameter differs, proceeding to step S104, and if all match, proceeding to step S2;
S104, resampling and requantizing the source data, then proceeding to step S2.
3. The method of claim 1 or 2, wherein in step S1 the source audio is audio data that can be decoded into PCM data by the general-purpose decoding of the ffmpeg or libav open-source library.
4. The method of claim 1, wherein the minimum period value P of frame padding is determined by the following formula:
wherein R_b is the bit rate, S_s is the number of bytes occupied by one slot, f_s is the sampling rate, and gcd(number_1, number_2) is the function returning the greatest common divisor of number_1 and number_2.
5. The method of claim 1, wherein the minimum period value P of frame padding is determined by the following formula:
wherein R_b is the bit rate, S_s is the number of bytes occupied by one slot, f_s is the sampling rate, and lcm(number_1, number_2) is the function returning the least common multiple of number_1 and number_2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811124426.2A CN109243471B (en) | 2018-09-26 | 2018-09-26 | Method for quickly coding digital audio for broadcasting |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109243471A CN109243471A (en) | 2019-01-18 |
CN109243471B (en) | 2022-09-23 |
Family
ID=65056230
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811124426.2A Active CN109243471B (en) | 2018-09-26 | 2018-09-26 | Method for quickly coding digital audio for broadcasting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109243471B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110689876B (en) * | 2019-10-14 | 2022-04-12 | 腾讯科技(深圳)有限公司 | Voice recognition method and device, electronic equipment and storage medium |
CN112380173B (en) * | 2020-11-20 | 2023-10-20 | 中国直升机设计研究所 | Intelligent correction rapid PCM decoding calculation method |
CN112767920A (en) * | 2020-12-31 | 2021-05-07 | 深圳市珍爱捷云信息技术有限公司 | Method, device, equipment and storage medium for recognizing call voice |
CN113747236A (en) * | 2021-10-19 | 2021-12-03 | 江下信息科技(惠州)有限公司 | Multithreading-based high-speed audio format conversion method and system |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4922537A (en) * | 1987-06-02 | 1990-05-01 | Frederiksen & Shu Laboratories, Inc. | Method and apparatus employing audio frequency offset extraction and floating-point conversion for digitally encoding and decoding high-fidelity audio signals |
EP2259634A2 (en) * | 1995-06-30 | 2010-12-08 | Interdigital Technology Corporation | Code acquisition in a CDMA communication system |
CN1848241A (en) * | 1995-12-01 | 2006-10-18 | 数字剧场系统股份有限公司 | Multi-channel audio frequency coder |
CN101027717A (en) * | 2004-03-25 | 2007-08-29 | Dts公司 | Lossless multi-channel audio codec |
CN104795073A (en) * | 2015-03-26 | 2015-07-22 | 无锡天脉聚源传媒科技有限公司 | Method and device for processing audio data |
CN107731238A (en) * | 2016-08-10 | 2018-02-23 | 华为技术有限公司 | The coding method of multi-channel signal and encoder |
CN108550369A (en) * | 2018-04-14 | 2018-09-18 | 全景声科技南京有限公司 | A kind of panorama acoustical signal decoding method of variable-length |
Non-Patent Citations (2)
Title |
---|
Tian Bo.A GOP Size Control Method for Distributed Video Coding.《 2015 8th International Conference on Intelligent Computation Technology and Automation (ICICTA)》.2016,全文. * |
段敏涛.基于定点DSP的G.723.1实时语音编码器的优化设计与实现研究.《中国优秀硕士学位论文全文数据库》.2006,(第5期),全文. * |
Also Published As
Publication number | Publication date |
---|---|
CN109243471A (en) | 2019-01-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||