CN112420059A - Audio coding quantization control method combining code rate layering and quality layering - Google Patents

Audio coding quantization control method combining code rate layering and quality layering

Info

Publication number
CN112420059A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011105481.4A
Other languages
Chinese (zh)
Other versions
CN112420059B (en)
Inventor
梅元刚
刘宇新
朱政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Microframe Information Technology Co ltd
Original Assignee
Hangzhou Microframe Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-10-15
Filing date
2020-10-15
Publication date
2021-02-26

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001 Codebooks
    • G10L2019/0004 Design or structure of the codebook

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses an audio coding quantization control method that combines code rate layering with quality layering, and belongs to the field of audio coding. The method first pre-codes the audio according to a predicted quality control factor (acrf) and an initial code rate layer. It then adjusts the code rate according to the coding result, so that the BDrate (which expresses the relationship between code rate consumption and quality improvement) is highest at that code rate and the relative quality is the best within the interval. This yields a reasonable linear mapping between the quality control factor and the code rate, so that quality is balanced while the code rate corresponding to that quality is adjusted.

Description

Audio coding quantization control method combining code rate layering and quality layering
Technical Field
The invention relates to the technical field of audio coding, in particular to an audio coding quantization control method combining code rate layering and quality layering.
Background
The main purpose of audio coding is to remove as much of the statistical and perceptual redundancy of the input signal as possible, compressing the data volume while preserving a given subjective listening quality, so as to meet the requirements of different transmission and storage conditions.
The following scenarios call for layered control of the audio code rate: different code rates are applied under different conditions, and the audio quality under each condition should be preserved as far as possible.
1) Audio content is slimmed down and stored in quality grades, saving storage space while preserving audio quality as far as possible.
2) Online voice interaction:
scenarios include real-time communication, live broadcasting, mobile on-demand playback, and online voice forwarding in large-scale audio service systems;
traffic must be saved, concurrent bandwidth controlled, and the service kept stable, all while maintaining audio quality.
Current methods have the following limitations:
When audio is coded at a constant bit rate (CBR), the code rate is effectively controlled, but audio quality and detail cannot be guaranteed as the code rate is reduced. Even when the code rate is capped for content whose quality should not be reduced, the control is not accurate across different code rate intervals and shows some jitter; the quality is unstable and fluctuates widely; and certain sound scenes (speech, music, and the like) are badly degraded, with poor subjective results.
When audio is coded at a variable bit rate (VBR), the quality stays relatively stable, but the overall code rate is large and cannot be controlled accurately, and there is no layered control over different code rates. Specifically, the variable code rate interval is narrow; for different contents, and for certain contents or certain code rate intervals, the code rate fluctuates considerably while the larger code rate contributes little to quality. The requirements of real products for code rate layering and stable quality control cannot be met.
In short, neither CBR nor VBR audio coding achieves linear control of the code rate together with stable quality across different tasks.
Disclosure of Invention
In view of the above disadvantages, the present invention provides a method for audio coding quantization control combining code rate layering and quality layering. The core idea is to combine the code rate with the quality quantization control factor (ACRF) to find a reasonable linear mapping, to set reasonable boundary points for the code rate control intervals and the quality control intervals, and thereby to balance quality while adjusting the code rate according to the quality requirement.
The invention provides a method for audio coding quantization control combining code rate layering and quality layering, which comprises the following steps:
(1) Establish an initial code rate grading table according to the sampling rate of the input audio; the table comprises the sampling rate, the single-channel code rate, and the two-channel code rate.
(2) Determine which code rate gear the input audio uses according to the frame length of the input audio, the sampling rate, and the code rate requirement of each channel.
(3) Determine the target quality layers and the corresponding code rate layers.
(4) According to the different quality control factors ACRF and the initial code rate grading table, find the boundary of the corresponding controlled code rate interval, and determine the boundary points of the quality layering and the code rate layering.
(5) Pre-code using the ACRF and the code rate determined in step (4) to obtain the pre-coded code rate range, i.e. the actual code rate range corresponding to the quality.
(6) From the optimal quality quantization factor acrf obtained for each grade, the minimum quality quantization factor min_f, and the maximum quality quantization factor max_f, calculate the final coding quality quantization factor by linear mapping:
acrf = min_f + (max_f - min_f) × (max_f - min_f) × acrf / (max_acrf - min_acrf)
The code rate and the quality are controlled using acrf; its interval is (min_acrf, max_acrf), and acrf is the currently selected quality gear.
(7) Code using the code rate corresponding to the acrf obtained in step (6). Overall code rate smoothing control is performed at the frame coding level. Code rate deviation corrections are generated and summarized in a trend table, which serves as a reference for the next control period and for frame-level fine control. The coded audio is scored with PESQ; once the quality deviation of the PESQ score is within 0.15, correction stops, which indicates that the BDrate (which expresses the relationship between code rate consumption and quality improvement) is highest at that code rate and the relative quality is the best within the interval.
(8) According to the global summary trend table, automatically classify by sampling rate, original audio code rate, and number of channels, and match the initial code rate gear and the quality gear as quickly as possible to form a quality curve and a code rate table. The initial code rate gear refers to the initial code rate grading table established from the sampling rate in step (1), and the quality gear refers to the quality layering in steps (3) and (4).
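The mapping in step (6) can be written out as a small function. The sketch below is illustrative only: the function names are hypothetical, and the first function transcribes the expression exactly as it is printed, including the repeated (max_f - min_f) factor. The second function is a conventional min-max linear interpolation, included purely for comparison as an assumption; the source does not state it.

```python
def map_acrf_as_printed(acrf, min_f, max_f, min_acrf, max_acrf):
    """Literal transcription of the step (6) expression as printed."""
    return min_f + (max_f - min_f) * (max_f - min_f) * acrf / (max_acrf - min_acrf)


def map_acrf_min_max(acrf, min_f, max_f, min_acrf, max_acrf):
    """Conventional min-max interpolation (assumption, not from the source)."""
    return min_f + (max_f - min_f) * (acrf - min_acrf) / (max_acrf - min_acrf)


if __name__ == "__main__":
    # Example values are hypothetical; only the interval (1, 51) is named
    # as preferred elsewhere in the text.
    print(map_acrf_as_printed(25, 0.0, 1.0, 1, 51))
    print(map_acrf_min_max(51, 0.0, 10.0, 1, 51))
```

When max_f - min_f equals 1 the repeated factor has no effect, so the two forms differ only in whether acrf is offset by min_acrf before scaling.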
Drawings
Fig. 1 is a flowchart of a method for audio coding quantization control combining rate layering and quality layering according to the present invention.
Fig. 2 is a schematic diagram of a quality curve.
Fig. 3 is a code rate table.
Detailed Description
In order to make the technical solutions in the present specification better understood, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in one or more embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present specification without any creative effort shall fall within the protection scope of the present specification.
The present invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, a method for controlling quantization in audio coding combining rate layering and quality layering according to the present invention includes the following steps:
(1) An initial code rate grading table is established according to the sampling rate of the input audio. Each entry comprises a sampling rate, a single-channel code rate, and a two-channel code rate, in that order. In the preferred table the sampling rate column runs from 0 through 12000 up to 576001, the single-channel code rate from 3700 through 5000 up to 17000, and the two-channel code rate from 5000 through 6400 up to 17000, specifically:
{0, 3700, 5000}
{12000, 5000, 6400}
{20000, 6900, 9640}
{28000, 9600, 13050}
{40000, 12060, 14260}
{56000, 13950, 15500}
{72000, 14200, 16120}
{96000, 17000, 17000}
{576001, 17000, 17000}
(2) Which code rate gear the input audio uses is determined according to the frame length and sampling rate of the input audio and the code rate requirement of each channel; for example, the code rate gear with a sampling rate of 40000 may be selected: {40000, 12060, 14260}.
(3) The target quality layers and the corresponding code rate layers are determined. The preferred quality layers are {acrf_q0, acrf_q1, acrf_q2, acrf_q3, acrf_q4, acrf_q5, acrf_q6}, and the corresponding code rate layers are {26 kbps, 32 kbps, 40 kbps, 50 kbps, 60 kbps, 80 kbps, 100 kbps}.
(4) According to the different quality control factors ACRF and the initial code rate grading table, the boundary of the corresponding controlled code rate interval is found, and the boundary points of the quality layering and the code rate layering are determined. For example, if the quality grade of the current audio is acrf_q0 and ACRF = 50, the corresponding controlled code rate interval is (26 kbps, 32 kbps).
(5) The ACRF and the code rate determined in step (4) are used for pre-coding, which gives the pre-coded code rate range, i.e. the actual code rate range corresponding to the quality. For example, acrf = 51 is a suitable value for acrf_q0: its corresponding code rate interval is (27 kbps, 30 kbps), which lies within the interval (26 kbps, 32 kbps), so acrf = 51 is taken as the optimal acrf value within acrf_q0.
(6) From the optimal quality quantization factor acrf obtained for each grade, the minimum quality quantization factor min_f, and the maximum quality quantization factor max_f, the final coding quality quantization factor is calculated by linear mapping:
acrf = min_f + (max_f - min_f) × (max_f - min_f) × acrf / (max_acrf - min_acrf)
The code rate and the quality are controlled using acrf, where min_acrf is the minimum value of the whole quality quantization factor interval, preferably 1, and max_acrf is the maximum value of the whole quality quantization factor interval, preferably 51. The interval is (min_acrf, max_acrf), and acrf is the currently selected quality gear.
(7) Coding is performed using the code rate corresponding to the acrf obtained in step (6). Overall code rate smoothing control is performed at the frame coding level. Code rate deviation corrections are generated and summarized in a trend table, which serves as a reference for the next control period and for frame-level fine control. The coded audio is scored with PESQ; once the quality deviation of the PESQ score is within 0.15, correction stops, which indicates that the BDrate (which expresses the relationship between code rate consumption and quality improvement) is highest at that code rate and the relative quality is the best within the interval.
(8) According to the global summary trend table, the sampling rate, the original audio code rate, and the number of channels are classified automatically, and the initial code rate gear and the quality gear are matched as quickly as possible to form a quality curve and a code rate table. The initial code rate gear refers to the initial code rate grading table established from the sampling rate in step (1), and the quality gear refers to the quality layering in steps (3) and (4). The overall correspondence is shown in the quality curve of Fig. 2 and the code rate table of Fig. 3.
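Steps (1) to (5) can be illustrated with a short lookup sketch. The table and layer values are copied from the text above; everything else is an assumption made for illustration: the function names are hypothetical, the lookup rule (take the last entry whose first column does not exceed the input) is not stated in the source, and the interval of a layer is taken to run from its own code rate to that of the next layer, generalizing the single acrf_q0 example. Bounds are treated as inclusive for simplicity.

```python
import math
from bisect import bisect_right

# Entries from step (1): (sampling rate, single-channel rate, two-channel rate).
RATE_TABLE = [
    (0, 3700, 5000),
    (12000, 5000, 6400),
    (20000, 6900, 9640),
    (28000, 9600, 13050),
    (40000, 12060, 14260),
    (56000, 13950, 15500),
    (72000, 14200, 16120),
    (96000, 17000, 17000),
    (576001, 17000, 17000),
]

# Quality layers and code rate layers (kbps) from step (3).
QUALITY_LAYERS = ["acrf_q0", "acrf_q1", "acrf_q2", "acrf_q3",
                  "acrf_q4", "acrf_q5", "acrf_q6"]
LAYER_RATES_KBPS = [26, 32, 40, 50, 60, 80, 100]


def select_rate_gear(sampling_rate):
    """Return the last table entry whose first column is <= sampling_rate.

    The lookup rule is an assumption; the source only gives the table.
    """
    if sampling_rate < 0:
        raise ValueError("sampling_rate must be non-negative")
    keys = [row[0] for row in RATE_TABLE]
    return RATE_TABLE[bisect_right(keys, sampling_rate) - 1]


def layer_interval(layer):
    """Controlled code rate interval (kbps) for a quality layer.

    Assumed to run from the layer's rate to the next layer's rate; the top
    layer has no stated upper bound, so infinity is used as a placeholder.
    """
    i = QUALITY_LAYERS.index(layer)
    upper = LAYER_RATES_KBPS[i + 1] if i + 1 < len(LAYER_RATES_KBPS) else math.inf
    return (LAYER_RATES_KBPS[i], upper)


def precoded_range_fits(precoded, interval):
    """True when the pre-coded rate range lies inside the layer interval."""
    return interval[0] <= precoded[0] and precoded[1] <= interval[1]


if __name__ == "__main__":
    print(select_rate_gear(40000))
    print(layer_interval("acrf_q0"))
    print(precoded_range_fits((27, 30), layer_interval("acrf_q0")))
```

With the example from step (5), the pre-coded range (27 kbps, 30 kbps) falls inside the acrf_q0 interval, which is the condition under which that acrf value is accepted for the layer.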
The invention combines the code rate with the quality quantization control factor ACRF. Through a reasonable linear mapping between the two, it achieves the desired quality layering and the corresponding code rate layering, and it can adjust the code rate corresponding to a quality level while balancing quality. The quality curve shown in Fig. 2 and the code rate table shown in Fig. 3 were obtained through simulation experiments; they directly reflect the code rate and quality layering and the mapping between them.
For different audio coding tasks, different quality gears can be selected according to the content, and the code rate is layered according to the quality gear so that it is controlled accurately. Within each quality gear the code rate control interval is small and the corresponding quality fluctuation is small, so the code rate is saved to the greatest extent while quality stays stable. This is of considerable value to audio-related service providers and operators.
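The deviation-correction loop of step (7) can be sketched as follows. This is a minimal illustration under several assumptions that the source does not specify: the quality deviation is measured against a target PESQ score, the code rate is nudged by a fixed step and kept inside the layer's interval, and the encoder and PESQ scorer are supplied by the caller as a single hypothetical callable `encode_and_score`. Only the 0.15 stopping threshold and the trend table come from the text.

```python
def correct_until_stable(encode_and_score, start_kbps, interval, target_pesq,
                         step_kbps=1.0, tolerance=0.15, max_rounds=20):
    """Adjust the code rate until the PESQ deviation is within tolerance.

    encode_and_score(rate_kbps) -> PESQ score; hypothetical, supplied by caller.
    Returns the final rate and the trend table of (rate, score, deviation).
    """
    low, high = interval
    rate = start_kbps
    trend = []
    for _ in range(max_rounds):
        score = encode_and_score(rate)
        deviation = target_pesq - score
        trend.append((rate, score, deviation))
        if abs(deviation) <= tolerance:
            break  # within 0.15: stop correcting
        # Raise the rate when quality is too low, lower it otherwise,
        # and clamp the result to the layer's controlled interval.
        rate += step_kbps if deviation > 0 else -step_kbps
        rate = min(high, max(low, rate))
    return rate, trend


if __name__ == "__main__":
    # Toy scorer for demonstration only: quality rises linearly with rate.
    def fake_scorer(rate_kbps):
        return 3.0 + 0.1 * (rate_kbps - 27.0)

    final_rate, table = correct_until_stable(fake_scorer, 27.0, (26.0, 32.0), 3.3)
    print(final_rate, len(table))
```

The trend table returned here stands in for the summary that the text says is carried into the next control period; how it is consumed there is left open.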
The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (1)

1. A method for audio coding quantization control combining code rate layering and quality layering, characterized by comprising the steps of:
(1) establishing an initial code rate grading table according to the sampling rate of the input audio, wherein the grading table comprises the sampling rate, the single-channel code rate, and the two-channel code rate;
(2) determining which code rate gear the input audio uses according to the frame length and the sampling rate of the input audio and the code rate requirement of each channel;
(3) determining target quality layers and corresponding code rate layers;
(4) finding the boundary of the corresponding controlled code rate interval according to the different quality control factors ACRF and the initial code rate grading table, and determining the boundary points of the quality layering and the code rate layering;
(5) pre-coding using the acrf and the code rate determined in step (4) to obtain a pre-coded code rate range, namely the actual code rate range corresponding to the quality;
(6) calculating a final coding quality quantization factor by linear mapping from the optimal quality quantization factor acrf obtained for each grade, the minimum quality quantization factor min_f, and the maximum quality quantization factor max_f:
acrf = min_f + (max_f - min_f) * (max_f - min_f) * acrf / (max_acrf - min_acrf)
wherein min_acrf is the minimum value of the whole quality quantization factor interval, preferably 1, max_acrf is the maximum value of the whole quality quantization factor interval, preferably 51, the code rate and the quality are controlled using the acrf, the interval is (min_acrf, max_acrf), and the acrf is the currently selected quality gear;
(7) coding using the code rate corresponding to the acrf obtained in step (6): performing overall code rate smoothing control at the frame coding level; generating code rate deviation corrections and summarizing them in a trend table that serves as a reference for the next control period and for frame-level fine control; scoring the coded audio with PESQ; and stopping the deviation correction when the PESQ quality deviation is within 0.15, which indicates that the BDrate is highest at that code rate and the relative quality is the best within the interval;
(8) automatically classifying the sampling rate, the code rate of the original audio, and the number of channels according to a global summary trend table, and matching the initial code rate gear and the quality gear as quickly as possible to form a quality curve and a code rate table; wherein the initial code rate gear refers to the initial code rate grading table established according to the sampling rate in step (1), and the quality gear refers to the quality layering in step (3) and step (4).
CN202011105481.4A 2020-10-15 2020-10-15 Audio coding quantization control method combining code rate layering and quality layering Active CN112420059B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011105481.4A CN112420059B (en) 2020-10-15 2020-10-15 Audio coding quantization control method combining code rate layering and quality layering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011105481.4A CN112420059B (en) 2020-10-15 2020-10-15 Audio coding quantization control method combining code rate layering and quality layering

Publications (2)

Publication Number Publication Date
CN112420059A true CN112420059A (en) 2021-02-26
CN112420059B CN112420059B (en) 2022-04-19

Family

ID=74854821

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011105481.4A Active CN112420059B (en) 2020-10-15 2020-10-15 Audio coding quantization control method combining code rate layering and quality layering

Country Status (1)

Country Link
CN (1) CN112420059B (en)

Also Published As

Publication number Publication date
CN112420059B (en) 2022-04-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant