CN105989853B - Audio quality evaluation method and system - Google Patents

Audio quality evaluation method and system

Info

Publication number
CN105989853B
Authority
CN
China
Prior art keywords
audio
frequency
audio quality
amplitude
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510091491.XA
Other languages
Chinese (zh)
Other versions
CN105989853A (en
Inventor
杨将
章继东
吴维昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN201510091491.XA
Publication of CN105989853A
Application granted
Publication of CN105989853B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses an audio quality evaluation method and system, belonging to the technical field of voice signal processing. The audio quality evaluation method comprises the following steps: receiving audio data input by a user; transcoding the audio data to obtain a plurality of audio sampling point data; respectively calculating the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency of the audio sampling point data; and calculating an audio quality score according to the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency. Because the method integrates a plurality of audio quality parameters to evaluate the audio quality, the evaluation result is broadly applicable and can meet the requirements of most application occasions.

Description

Audio quality evaluation method and system
Technical Field
The invention relates to the technical field of voice signal processing, in particular to an audio quality evaluation method and system.
Background
With the continuous development of digital audio processing technology, the requirement for audio quality is higher and higher. How to determine a suitable evaluation standard to evaluate the audio quality so as to obtain an audio meeting the requirements becomes an important current subject.
In the initial stage of audio technology development, there was no objective evaluation standard, so a large number of audio auditors were usually organized to evaluate and score audio quality by manually auditioning various audio. Because many people must take part, this approach has a high economic cost and a long experimental period. It is also subjective, since it depends on the preferences and standards of each evaluator, which are difficult to unify, so the objectivity and accuracy of the evaluation result cannot be guaranteed.
As the technology developed, the indexes with a large influence on audio quality were summarized into audio quality parameters that serve as objective evaluation standards. Typical audio quality parameters are mainly clipping, loudness, signal-to-noise ratio, and noise energy. In practical application, one of these parameters is selected according to the occasion in which the audio is used, and the audio quality for that occasion is evaluated with it. This approach considers only a single index of audio quality, so the evaluation result has reference value only for the specific occasion and cannot be applied to other occasions; its universality is poor.
Disclosure of Invention
The embodiment of the invention provides an audio quality evaluation method and system that evaluate audio quality by integrating a plurality of audio quality parameters, so that the evaluation result is broadly applicable and can meet the requirements of most application occasions.
The technical scheme provided by the embodiment of the invention is as follows:
in one aspect, a method for evaluating audio quality is provided, including:
receiving audio data input by a user;
transcoding the audio data to obtain a plurality of audio sampling point data;
respectively calculating the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the upper frequency limit of the frequency spectrum of the audio sampling point data;
and calculating an audio quality score according to the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the upper frequency limit of the frequency spectrum.
Preferably, the calculating the upper frequency of the spectrum includes:
performing framing processing on the audio sampling point data;
calculating the upper frequency limit of the frequency spectrum of each frame of audio data;
counting a frequency histogram of the spectrum upper limit frequency over the full frequency band, and selecting from the frequency histogram the frequency band range of preset width in which the spectrum upper limit frequency occurs most often;
determining a central frequency point of each histogram bin in the frequency band range with the preset width, and counting the number of occurrences of the central frequency point;
taking the times as weighting coefficients, and calculating a weighted average value of the central frequency points;
and taking the weighted average value as the upper frequency of the frequency spectrum.
Preferably, the calculating the average loudness includes:
extracting a layer of envelope from the amplitude absolute value of the audio sampling point data to obtain a layer of envelope audio amplitude absolute value data;
extracting a layer of envelope from the one-layer envelope audio amplitude absolute value data to obtain a two-layer envelope audio amplitude absolute value data;
calculating average amplitude values of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data;
and taking the average amplitude value as the average loudness.
Preferably, calculating the audio quality score according to the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency includes:
acquiring a preset initial audio quality score;
determining the cumulative deduction of the abnormal conditions of the five audio quality parameters according to the calculated proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the frequency spectrum upper limit frequency;
calculating the difference value between the initial audio quality score and the accumulated deduction;
and taking the difference value as the audio quality score.
Preferably, the method further comprises:
determining an audio quality grade according to the audio quality score;
and taking the audio quality grade as an evaluation result.
In another aspect, an audio quality evaluation system is provided, including:
the receiving module is used for receiving audio data input by a user;
the transcoding module is used for transcoding the audio data to obtain a plurality of audio sampling point data;
the first calculation module is used for respectively calculating the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency of the audio sampling point data;
and the second calculation module is used for calculating the audio quality score according to the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the frequency spectrum upper limit frequency which are calculated by the first calculation module.
Preferably, the first calculation module comprises a first calculation submodule for calculating the upper limit frequency of the spectrum; the first computation submodule includes:
the framing unit is used for framing the audio sampling point data;
a first calculation unit for calculating an upper limit frequency of a spectrum of each frame of audio data;
the first statistic unit is used for counting a frequency histogram of the upper limit frequency of the frequency spectrum in a full frequency band;
the selection unit is used for selecting, from the frequency histogram counted by the first statistic unit, the frequency band range of preset width in which the spectrum upper limit frequency occurs most often;
the determining unit is used for determining a central frequency point of each histogram bin in the frequency band range with the preset width;
a second counting unit configured to count the number of occurrences of the center frequency point determined by the determining unit;
and the second calculating unit is used for calculating the weighted average value of the central frequency point by taking the frequency of occurrence of the central frequency point counted by the second counting unit as a weighting coefficient, and taking the weighted average value as the upper limit frequency of the frequency spectrum.
Preferably, the first calculation module further comprises a second calculation submodule for calculating the average loudness; the second computation submodule includes:
the first extraction unit is used for extracting a layer of envelope from the amplitude absolute value of the audio sampling point data to obtain a layer of envelope audio amplitude absolute value data;
the second extraction unit is used for extracting a layer of envelope from the one-layer envelope audio amplitude absolute value data to obtain two-layer envelope audio amplitude absolute value data;
a third calculating unit configured to calculate an average amplitude value of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data, and take the average amplitude value as the average loudness.
Preferably, the second calculation module includes:
the acquisition unit is used for acquiring a preset initial audio quality score;
the fourth calculating unit is used for determining the accumulated deduction of the abnormal conditions of the five audio quality parameters according to the calculated proportion of plosive amplitude-cutting points, average loudness, signal-to-noise ratio, noise energy and spectrum upper limit frequency;
and the fifth calculating unit is used for calculating the difference value between the initial audio quality score and the accumulated deduction score and taking the difference value as the audio quality score.
Preferably, the system further comprises: and the determining module is used for determining the audio quality grade according to the audio quality score and taking the audio quality grade as an evaluation result.
The audio quality evaluation method and system provided by the embodiment of the invention transcode the audio data input by a user to obtain a plurality of audio sampling point data, calculate five audio quality parameters of the audio sampling point data, namely the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency, and obtain the audio quality score from these parameters. Because a plurality of audio quality parameters are integrated to evaluate the audio quality, the evaluation result is broadly applicable and can meet the requirements of most application occasions.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained by those skilled in the art according to the drawings.
Fig. 1 is a flowchart of an audio quality evaluation method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for calculating an upper limit frequency of a spectrum according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for calculating average loudness according to an embodiment of the present invention;
Fig. 4 is a flowchart of a method for calculating an audio quality score according to an embodiment of the present invention;
Fig. 5 is a flowchart of another audio quality evaluation method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an audio quality evaluation system according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a second audio quality evaluation system according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a third audio quality evaluation system according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a fourth audio quality evaluation system according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a fifth audio quality evaluation system according to an embodiment of the present invention.
Detailed Description
In order to enable those skilled in the art to better understand the scheme of the embodiment of the invention, the embodiment is described in further detail below with reference to the drawings and the implementation mode.
An embodiment of the present invention provides an audio quality evaluation method, as shown in fig. 1, including:
Step 101: Audio data input by a user is received.
Step 102: The audio data is transcoded to obtain a plurality of audio sampling point data.
Specifically, a transcoding tool can be selected according to the format required for the output audio data and the encoding information carried in the audio data input by the user, and the audio data is then transcoded with that tool. For example, ffmpeg can be selected as the transcoding tool, with the transcoded audio in WAV format with a single channel, a sampling rate of 16 kHz and a sampling precision of 16 bits. Since the ffmpeg transcoding tool automatically analyzes the format information of the original audio, input audio data of any format can be supported. In use, the audio format to be output is determined in advance, and ffmpeg converts the input audio data into that format for output.
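The transcoding step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `build_ffmpeg_command`, `transcode` and `read_samples` are hypothetical, ffmpeg is assumed to be installed and on the PATH, and the output is assumed to be 16-bit little-endian PCM as described above.

```python
import array
import subprocess
import sys
import wave


def build_ffmpeg_command(src_path, dst_path):
    """Build an ffmpeg call producing mono, 16 kHz, 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", src_path,          # ffmpeg detects the input format itself
        "-ac", "1",              # single channel
        "-ar", "16000",          # 16 kHz sampling rate
        "-c:a", "pcm_s16le",     # 16-bit signed little-endian samples
        dst_path,
    ]


def transcode(src_path, dst_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src_path, dst_path), check=True)


def read_samples(wav_path):
    """Return the audio sampling point data of a mono 16-bit WAV as ints."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":   # WAV data is little-endian
        samples.byteswap()
    return samples.tolist()
```

A caller would run `transcode("input.mp3", "out.wav")` and then `read_samples("out.wav")` to obtain the sampling points used in the following steps.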
Step 103: The proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency of the audio sampling point data are calculated respectively.
The proportion of plosive amplitude-cutting points (that is, of clipped sampling points) is the ratio of the number of sampling points in the audio sampling point data whose amplitude value exceeds the amplitude-cutting threshold to the number of all sampling points. The smaller this proportion, the higher the audio quality. An amplitude-cutting threshold η is preset (for example, η = 3000); the number of sampling points in the audio sampling point data obtained in step 102 whose amplitude values exceed η is counted as sum1, and the number of all sampling points as sum, so that the proportion of plosive amplitude-cutting points is α = sum1/sum.
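The ratio α = sum1/sum can be sketched as below. The function name `clipping_ratio` is hypothetical, and comparing the absolute value of each sample with η is an assumption, since the text only says that the amplitude value exceeds the threshold.

```python
def clipping_ratio(samples, eta=3000):
    """Return alpha = sum1 / sum for a sequence of integer samples.

    Assumption: a sample counts as an amplitude-cutting point when its
    absolute value is strictly greater than the threshold eta.
    """
    if not samples:
        return 0.0
    sum1 = sum(1 for s in samples if abs(s) > eta)
    return sum1 / len(samples)
```

For instance, with η = 3000 the samples `[0, 3000, 3001, -4000]` contain two points above the threshold, giving α = 0.5.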
Loudness refers to the degree of sound perceived by the human ear, i.e., how loud the sound is. The loudness is mainly dependent on the sound intensity and also related to the frequency of the sound. In the embodiment of the present invention, the amplitude value is used to represent the loudness size, and more specifically, the average amplitude value is used to represent the average loudness size, wherein the average loudness value can be calculated by extracting an envelope from the absolute value of the amplitude of the audio sample point data.
As shown in fig. 3, the method for calculating average loudness according to an embodiment of the present invention includes the following steps:
Step 301: A layer of envelope is extracted from the amplitude absolute values of the audio sampling point data to obtain one-layer envelope audio amplitude absolute value data.
The one-layer envelope audio amplitude absolute value data is the collection of the maximum values of the sampling points and is denoted, for example, by β1. Where the required calculation accuracy of the average loudness is low, the one-layer envelope audio amplitude absolute value data can be used directly as the average loudness.
Step 302: A layer of envelope is extracted from the one-layer envelope audio amplitude absolute value data to obtain two-layer envelope audio amplitude absolute value data.
To improve the calculation accuracy of the average loudness, a further layer of envelope may be extracted from the one-layer envelope audio amplitude absolute value data to obtain two-layer envelope audio amplitude absolute value data, denoted, for example, by β2.
Step 303: The average amplitude value of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data is calculated and taken as the average loudness.
The average amplitude value β of the one-layer envelope data β1 and the two-layer envelope data β2 is calculated, that is, β = (β1 + β2)/2, and β is taken as the average loudness, which ensures the calculation accuracy of the average loudness.
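Steps 301 to 303 can be sketched as follows. The text does not specify how an envelope is extracted, so this sketch assumes that an envelope is the sequence of local maxima and that β1 and β2 are the mean values of the one-layer and two-layer envelopes; the names `local_maxima` and `average_loudness` are hypothetical.

```python
def local_maxima(values):
    """Return interior points greater than the left neighbour and not
    smaller than the right neighbour (an assumed envelope definition).
    Falls back to the input itself when no such point exists."""
    peaks = [
        values[i]
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]
    return peaks if peaks else list(values)


def average_loudness(samples):
    """Return beta = (beta1 + beta2) / 2 from two envelope layers."""
    if not samples:
        return 0.0
    magnitudes = [abs(s) for s in samples]
    layer1 = local_maxima(magnitudes)   # one-layer envelope
    layer2 = local_maxima(layer1)       # two-layer envelope
    beta1 = sum(layer1) / len(layer1)
    beta2 = sum(layer2) / len(layer2)
    return (beta1 + beta2) / 2
```

With the samples `[0, 3, 0, -5, 0, 3, 0, 1, 0]`, the one-layer envelope is `[3, 5, 3, 1]` (mean 3.0) and the two-layer envelope is `[5]`, so β = 4.0.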
The signal-to-noise ratio (SNR) is a parameter describing the ratio of the effective component to the noise component in a signal and is denoted by γ; the larger the signal-to-noise ratio, the smaller the noise. Since noise exists in the form of waves and carries a certain energy, the noise energy, denoted by E, can also be used as one of the parameters for measuring audio quality. A person skilled in the art can conveniently calculate the signal-to-noise ratio γ and the noise energy E using common calculation tools, such as speedx tools.
The frequency spectrum is the distribution curve of a signal over frequency. The spectrum of an audio signal covers a frequency band of a certain width. The audio signal is not valid in all frequency bands: for a conventional unprocessed audio signal, for example, the signal in the low-frequency band, which generally occupies a narrow range, is valid, while the signal in the high-frequency band, which covers a wide range, is not. That is, the audio spectrum contains a frequency value that is the highest frequency actually valid in the spectrum, referred to here as the spectrum upper limit frequency. The spectrum upper limit frequency divides the audio spectrum into valid audio and invalid audio: the audio signal above the upper limit frequency is invalid, while the audio signal below it is valid. On a spectrogram, the audio signal above the upper limit frequency appears dark and the audio signal below it appears bright. This division does not mean that the audio signal is absolutely valid or invalid. Rather, it reflects the influence on audio quality: the influence of invalid audio is considered small enough to be ignored, whereas valid audio has a large influence and is therefore the focus of audio quality evaluation.
Before the concept of the spectrum upper limit frequency was introduced, evaluating the audio quality of an audio signal required sampling and analyzing the signal over the full frequency band at the same sampling frequency. Because the invalid band often covers a wider range than the valid band, a large part of the sampled data lay in the invalid band, so much of the data analyzed at great effort had essentially no reference value for the actual audio quality; that is, much useless work was done. In addition, because the full frequency band is wide, the sampling precision is low for a given amount of processed data, which affects the overall calculation precision, and processing large amounts of data in the invalid band is error-prone, which introduces errors into the overall evaluation result. After the spectrum upper limit frequency is introduced, the full frequency band is divided into valid audio and invalid audio, and only the valid audio in the audio signal is sampled and analyzed. Since the bandwidth covered by invalid audio is far larger than that of valid audio, the amount of data to be processed is greatly reduced, as is the probability of errors from processing invalid audio. Moreover, because the processed bandwidth is small, the sampling precision can be greatly improved, which makes the evaluation result more accurate.
For example, suppose the frequency band of the audio signal is 0-20 kHz and the spectrum upper limit frequency is 4 kHz, so that 0-4 kHz is the valid band and 4-20 kHz is the invalid band. After the spectrum upper limit frequency is introduced, only the 0-4 kHz band is taken for audio quality evaluation. Compared with evaluating the full 0-20 kHz band, the amount of data to be processed is clearly reduced, so a higher sampling frequency can be used within 0-4 kHz and the processing result is more accurate.
As shown in fig. 2, a flowchart of a method for calculating an upper limit frequency of a spectrum according to an embodiment of the present invention includes:
Step 201: Framing processing is performed on the audio sampling point data.
Specifically, the framing processing may be performed in a conventional manner, for example, the framing processing function provided by Matlab is used to perform framing processing on the audio sampling point data, so as to obtain multi-frame audio data.
Step 202: the upper frequency of the spectrum of each frame of audio data is calculated.
After the audio sampling point data is framed into multiple frames of audio data, the spectrum upper limit frequency f_i of each frame of audio data can be calculated. Assume the frequency range of the spectrum is 0 to P Hz. Then each frame of audio data has a maximum frequency point T Hz such that the spectral values of the frame in the range T to P Hz are all less than or equal to a threshold ξ, where ξ is a value close to 0:
Figure BDA0000676209700000082
According to this definition of the spectrum upper limit frequency, the band center frequencies are examined in order from largest to smallest, and the center frequency of the first band whose spectral amplitude is larger than the set threshold (for example, ξ = 0.3) is the upper limit frequency f_i of that frame of audio data.
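Steps 201 and 202 can be sketched as below. Several points are assumptions rather than details from the text: frames are taken with a fixed length and hop, the spectrum is a plain discrete Fourier transform (written naively here to stay within the standard library, so it is only practical for short frames), and spectral amplitudes are normalized by the frame's peak so that a threshold such as ξ = 0.3 is meaningful. The function names are hypothetical.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames of frame_len, advancing by hop."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0 .. len(frame) // 2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def frame_upper_limit(frame, sample_rate, xi=0.3):
    """Scan bins from high to low frequency and return the frequency of
    the first bin whose peak-normalized magnitude exceeds xi."""
    mags = magnitude_spectrum(frame)
    peak = max(mags)
    if peak == 0:
        return 0.0
    n = len(frame)
    for k in range(len(mags) - 1, -1, -1):
        if mags[k] / peak > xi:
            return k * sample_rate / n
    return 0.0
```

Applying `frame_upper_limit` to every frame returned by `frame_signal` yields the per-frame values f_i used in step 203.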
Step 203: A frequency histogram of the spectrum upper limit frequency is counted over the full frequency band, and the frequency band range of preset width in which the spectrum upper limit frequency occurs most often is selected from the frequency histogram.
Based on the calculated upper limit frequency f_i (i = 0, 1, …, M) of each frame of data, where M is the total number of frames, the number of times C_n (n = 0, 1, …, N) that the upper limit frequency f_i equals F_n is counted over N frequency bands, and a frequency histogram over the full frequency band is created. A frequency band range of preset width is then selected from the histogram where the upper limit frequency occurs most often, for example a band of S Hz selected from 8 kHz downward. Based on parameter tuning on an audio test set, the band width is generally preferably S = 1 kHz.
Step 204: The central frequency point of each histogram bin within the frequency band range of preset width is determined, and the number of occurrences of the central frequency point is counted.
For each histogram bin, the central frequency point F_n is easily determined, so the total number of times T_0 that the upper limit frequency f_i = F_n occurs within the preset-width band is easily counted.
Step 205: With the numbers of occurrences as weighting coefficients, the weighted average of the central frequency points is calculated and taken as the spectrum upper limit frequency.
Smoothing is performed from the high-frequency part toward the low-frequency part, using a band of width S = 1 kHz and a smoothing length of 8k/N Hz. The total number of occurrences T_1 of the upper limit frequencies f_i = F_n within the band is calculated, and T_j (j = 0, 1, …, W) are calculated in turn. The bands begin with 0 to 1 kHz, the band width is kept at 1 kHz, and the band starting frequency increases successively by the smoothing length 8k/N, so that the final band is 7 kHz to 8 kHz; the band starting frequencies form an arithmetic progression with an initial value of 0, a final value of 7k and a step size of 8k/N:
Figure BDA0000676209700000091
T_max = max{T_j | j = 0, 1, …, W} is then found, and for the band corresponding to T_max the weighted average of the center frequencies F_n of all the minimum bands of width 8k/N that it contains is calculated, with C_n as the weighting coefficients. The result is the spectrum upper limit frequency f of the final audio data.
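Steps 203 to 205 can be sketched as follows. The function name `spectrum_upper_limit` and its parameters are hypothetical; the sketch assumes the 0 to 8 kHz range is split into `n_bins` equal bins of width 8k/N, that the 1 kHz band spans a whole number of bins, and that when several band positions tie for the largest total, the highest one is kept (the text does not specify a tie-break).

```python
def spectrum_upper_limit(frame_limits, f_max=8000.0, n_bins=64,
                         band_width=1000.0):
    """Estimate the overall upper limit frequency f from per-frame f_i."""
    bin_width = f_max / n_bins

    # Histogram: counts[n] is C_n, the number of frames whose f_i falls
    # in bin n; centers[n] is the bin's center frequency F_n.
    counts = [0] * n_bins
    for f_i in frame_limits:
        idx = min(int(f_i // bin_width), n_bins - 1)
        counts[idx] += 1
    centers = [(n + 0.5) * bin_width for n in range(n_bins)]

    # Slide a band of band_width from high to low frequency, one bin at
    # a time, and keep the position with the largest total count T_j.
    bins_per_band = int(round(band_width / bin_width))
    best_start, best_total = 0, -1
    for start in range(n_bins - bins_per_band, -1, -1):
        total = sum(counts[start:start + bins_per_band])
        if total > best_total:
            best_start, best_total = start, total
    if best_total <= 0:
        return 0.0

    # Weighted average of the bin centers in the best band, weights C_n.
    window = range(best_start, best_start + bins_per_band)
    return sum(counts[n] * centers[n] for n in window) / best_total
```

With 16 bins of 500 Hz and per-frame values `[3100, 3200, 3700, 6100]`, the band 3000 to 4000 Hz holds three of the four frames, and the weighted average of its bin centers 3250 Hz (weight 2) and 3750 Hz (weight 1) is about 3416.7 Hz.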
Step 104: The audio quality score is calculated according to the proportion of plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency.
As shown in fig. 4, a flowchart of a method for calculating an audio quality score according to an embodiment of the present invention includes:
Step 401: A preset initial audio quality score is acquired.
The initial audio quality score may be set in advance according to a scoring rule, for example, in the case of adopting a percentile scoring rule, the initial audio quality score may be set to 100 points.
Step 402: The accumulated deduction for the abnormal conditions of the five audio quality parameters is determined according to the calculated proportion of plosive amplitude-cutting points, average loudness, signal-to-noise ratio, noise energy and spectrum upper limit frequency.
After the five audio quality parameters, namely the proportion of plosive amplitude-cutting points α, the average loudness β, the signal-to-noise ratio γ, the noise energy E and the spectrum upper limit frequency f, are calculated in step 103, the deduction for each parameter when it is abnormal can be determined according to a preset deduction rule, and the accumulated deduction for the abnormal conditions of the five parameters is then calculated.
The scoring of audio quality in various use occasions can be obtained by collecting a large amount of audio data as samples for statistics and analysis, from which the deduction for the abnormal condition of each audio quality parameter is derived (the deduction when a parameter is normal is 0), so that the scoring result suits the application requirements of most occasions. To make the evaluation result more accurate, the deduction for each audio quality parameter can be set to different values for different ranges of the parameter.
For example: (1) when α ≤ 0.006, the deduction is set to 0; when 0.006 < α < 0.01, the deduction is set to A1; when α ≥ 0.01, the deduction is set to A2, where A1 < A2;
(2) when β > 1200, the deduction is set to 0; when 1000 ≤ β ≤ 1200, the deduction is set to B1; when β < 1000, the deduction is set to B2, where B1 < B2;
(3) when γ > 16.8, the deduction is set to 0; when 15.5 ≤ γ ≤ 16.8, the deduction is set to C1; when 13.5 ≤ γ < 15.5, the deduction is set to C2; when γ < 13.5, the deduction is set to C3, where C1 < C2 < C3;
(4) when E < 43.43, the deduction is set to 0; when E ≥ 43.43 and γ < 52.55, the deduction is set to D1; when E ≥ 43.43 and 52.55 ≤ γ < 54.55, the deduction is set to D2; when E ≥ 43.43 and γ ≥ 54.55, the deduction is set to D3, where D1 < D2 < D3;
(5) when f < 6000 or f ≥ 7000, the deduction is set to 0; when 6000 ≤ f < 7000 and γ ≥ 6000, the deduction is set to E1; when 6000 ≤ f < 7000 and 5000 ≤ γ < 6000, the deduction is set to E2; when 6000 ≤ f < 7000 and γ < 5000, the deduction is set to E3, where E1 < E2 < E3.
The deductions A1, A2, B1, B2, C1, C2, C3, D1, D2, D3, E1, E2 and E3 are obtained from statistical data and set in advance, and the deduction applied is selected according to the value of each audio quality parameter. A2 + B2 + C3 + D3 + E3 is less than or equal to the preset initial audio quality score, which ensures that the finally calculated audio quality score is greater than or equal to 0.
According to the influence of each audio quality parameter on audio quality, the signal-to-noise ratio γ is used in three ways: as an independent evaluation parameter, in combination with the noise energy E, and in combination with the spectrum upper limit frequency f. When the signal-to-noise ratio γ meets the conditions in (3), (4) and (5), each item is deducted separately and independently.
From the five audio quality parameters, namely the proportion of plosive amplitude-cutting points α, the average loudness β, the signal-to-noise ratio γ, the noise energy E and the spectrum upper limit frequency f, the deduction corresponding to each parameter can be calculated, and summing these deductions gives the accumulated deduction for the abnormal conditions of the five audio quality parameters.
Step 403: calculating the difference between the initial audio quality score and the accumulated deduction, and taking the difference as the audio quality score.
In the embodiment of the invention, the difference value of the initial audio quality score and the accumulated deduction under the abnormal condition of each audio quality parameter is used as the audio quality score, and the influence of five audio quality parameters including the proportion alpha of the plosive amplitude-cutting point, the average loudness beta, the signal-to-noise ratio gamma, the noise energy E and the spectrum upper limit frequency f on the audio quality is fully considered, so that the evaluation result can be ensured to meet the application requirements of most occasions.
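The deduction rules (1) to (5) and Step 403 can be sketched in Python as follows. This is only an illustrative sketch: the numeric deduction values, the initial score of 100 and all function names are assumptions chosen to satisfy the stated orderings (A1 < A2, and so on) and the constraint A2 + B2 + C3 + D3 + E3 ≤ initial score. The threshold conditions are transcribed as printed in items (1) to (5).

```python
# Hypothetical deduction values; the text only fixes their ordering and
# requires A2 + B2 + C3 + D3 + E3 <= INITIAL_SCORE.
D = {"A1": 5, "A2": 10, "B1": 5, "B2": 10,
     "C1": 5, "C2": 10, "C3": 20,
     "D1": 5, "D2": 10, "D3": 20,
     "E1": 5, "E2": 10, "E3": 20}
INITIAL_SCORE = 100  # assumed preset initial audio quality score


def clipping_deduction(alpha):
    """Item (1): plosive amplitude-cut point proportion."""
    if alpha <= 0.006:
        return 0
    return D["A1"] if alpha < 0.01 else D["A2"]


def loudness_deduction(beta):
    """Item (2): average loudness."""
    if beta > 1200:
        return 0
    return D["B1"] if beta >= 1000 else D["B2"]


def snr_deduction(gamma):
    """Item (3): signal-to-noise ratio on its own."""
    if gamma > 16.8:
        return 0
    if gamma >= 15.5:
        return D["C1"]
    return D["C2"] if gamma >= 13.5 else D["C3"]


def noise_deduction(energy, gamma):
    """Item (4): noise energy combined with signal-to-noise ratio."""
    if energy < 43.43:
        return 0
    if gamma < 52.55:
        return D["D1"]
    return D["D2"] if gamma < 54.55 else D["D3"]


def bandwidth_deduction(f, gamma):
    """Item (5): spectrum upper limit frequency combined with signal-to-noise ratio."""
    if f < 6000 or f >= 7000:
        return 0
    if gamma >= 6000:
        return D["E1"]
    return D["E2"] if gamma >= 5000 else D["E3"]


def audio_quality_score(alpha, beta, gamma, energy, f):
    """Step 403: initial score minus the accumulated deduction."""
    accumulated = (clipping_deduction(alpha) + loudness_deduction(beta)
                   + snr_deduction(gamma) + noise_deduction(energy, gamma)
                   + bandwidth_deduction(f, gamma))
    return INITIAL_SCORE - accumulated
```

Because the largest deductions sum to at most the initial score, the result cannot fall below 0 under these assumed values.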
As shown in fig. 5, the audio quality evaluation method may further include:
Step 105: determining an audio quality grade according to the audio quality score, and taking the audio quality grade as the evaluation result.
The rating standard can be preset, after the audio quality score is calculated, the audio quality level is further determined according to the audio quality score, and the audio quality level is fed back to the user as a final evaluation result.
For example, the audio quality level is divided into five levels: when 0 ≤ score ≤ 20, the corresponding audio quality level is level one; when 20 < score ≤ 40, level two; when 40 < score ≤ 60, level three; when 60 < score ≤ 80, level four; and when 80 < score ≤ 100, level five. A higher level represents a higher singing level of the user.
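The five-level example can be expressed as a short lookup. The function name `quality_level` is hypothetical, and the sketch assumes scores are limited to the range 0 to 100 as in the example.

```python
def quality_level(score):
    """Map an audio quality score in [0, 100] to a level from 1 to 5."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    # Upper bounds of levels one to five; each interval is closed on the right.
    for level, upper in enumerate((20, 40, 60, 80, 100), start=1):
        if score <= upper:
            return level
```

For instance, a score of 35 falls in the interval 20 < score ≤ 40 and maps to level two.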
The audio quality evaluation method provided by the embodiment of the invention transcodes the audio data input by a user to obtain a plurality of audio sampling point data, calculates five audio quality parameters of the audio sampling point data, namely the plosive amplitude-cut point proportion, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency, and calculates the audio quality score according to these audio quality parameters. Since a plurality of audio quality parameters are integrated to evaluate the audio quality, the evaluation result has strong universality and can meet the requirements of most application occasions.
Correspondingly, an embodiment of the present invention further provides an audio quality evaluation system, as shown in fig. 6, including:
a receiving module 501, configured to receive audio data input by a user;
a transcoding module 502, configured to transcode the audio data to obtain multiple audio sampling point data;
the first calculating module 503 is configured to calculate a plosive amplitude-cut point ratio, an average loudness, a signal-to-noise ratio, noise energy, and a spectrum upper limit frequency of the audio sampling point data, respectively;
the second calculating module 504 is configured to calculate an audio quality score according to the plosive amplitude-cut ratio, the average loudness, the signal-to-noise ratio, the noise energy, and the spectrum upper limit frequency calculated by the first calculating module 503.
As shown in fig. 7, the first calculating module 503 includes a first calculating submodule 601 for calculating the upper limit frequency of the spectrum; one specific structure of the first computing submodule 601 includes:
a framing unit 701, configured to perform framing processing on the audio sample data;
a first calculation unit 702 for calculating an upper limit frequency of a spectrum of each frame of audio data;
a first statistical unit 703, configured to compile a histogram of the number of occurrences of the spectrum upper limit frequency over the full frequency band;
a selecting unit 704, configured to select, from the histogram compiled by the first statistical unit 703, the frequency band range of a preset width in which the spectrum upper limit frequency occurs most often;
a determining unit 705, configured to determine the center frequency point of each histogram bin within the frequency band range of the preset width;
a second counting unit 706 configured to count the number of occurrences of the center frequency point determined by the determining unit 705;
a second calculation unit 707 for calculating a weighted average of the center frequency points with the number of occurrences of the center frequency point counted by the second counting unit 706 as a weighting coefficient, and taking the weighted average as the upper limit frequency of the spectrum.
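The work of units 703 to 707 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the per-frame spectrum upper limit frequencies (the output of units 701 and 702) are taken as given, and the function name `spectrum_upper_limit`, the bin width of 100 Hz and the band width of `band_bins` consecutive bins are hypothetical choices not specified in the text.

```python
from collections import Counter


def spectrum_upper_limit(frame_upper_freqs, bin_width=100.0, band_bins=5):
    """Estimate the spectrum upper limit frequency from per-frame values.

    frame_upper_freqs: per-frame upper limit frequencies in Hz (assumed input).
    bin_width, band_bins: hypothetical histogram resolution and band width.
    """
    # Histogram over the full band: bin index -> number of occurrences.
    counts = Counter(int(f // bin_width) for f in frame_upper_freqs)
    if not counts:
        raise ValueError("no frames supplied")

    # Select the band of preset width holding the most occurrences.
    best_start, best_total = None, -1
    for start in range(min(counts), max(counts) + 1):
        total = sum(counts.get(b, 0) for b in range(start, start + band_bins))
        if total > best_total:
            best_start, best_total = start, total

    # Weighted average of the bin center frequencies, weighted by occurrences.
    weighted_sum = sum((b + 0.5) * bin_width * counts.get(b, 0)
                       for b in range(best_start, best_start + band_bins))
    return weighted_sum / best_total
```

Restricting the average to the most populated band keeps isolated outlier frames, such as silent frames with a very low upper limit, from pulling the estimate away from the dominant value.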
As shown in fig. 8, the first calculating module 503 further includes a second calculating submodule 602 for calculating the average loudness; one specific structure of the second computing submodule 602 includes:
a first extracting unit 708, configured to extract one layer of envelope from the amplitude absolute values of the audio sampling point data to obtain one-layer envelope audio amplitude absolute value data;
a second extracting unit 709, configured to extract one further layer of envelope from the one-layer envelope audio amplitude absolute value data to obtain two-layer envelope audio amplitude absolute value data;
a third calculating unit 710, configured to calculate an average amplitude value of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data, and take the average amplitude value as an average loudness.
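The two-layer envelope extraction of units 708 to 710 might look like the following Python sketch. Two points are assumptions rather than statements of the text: the envelope of a sequence is taken to be its local maxima, and the average amplitude is taken over the pooled values of both envelope layers. The function names are hypothetical.

```python
def local_peaks(values):
    """Assumed envelope rule: keep values not smaller than either neighbour."""
    if len(values) < 3:
        return list(values)
    return [values[i] for i in range(1, len(values) - 1)
            if values[i - 1] <= values[i] >= values[i + 1]]


def average_loudness(samples):
    """Average amplitude over the one-layer and two-layer envelopes."""
    magnitudes = [abs(s) for s in samples]
    layer_one = local_peaks(magnitudes)   # one-layer envelope
    layer_two = local_peaks(layer_one)    # two-layer envelope
    pooled = layer_one + layer_two        # assumption: pool both layers
    if not pooled:
        return 0.0
    return sum(pooled) / len(pooled)
```

Taking peaks of peaks emphasises the loud portions of the signal, so pauses and low-level passages weigh less in the result than they would in a plain mean of absolute amplitudes.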
As shown in fig. 9, the second calculating module 504 includes:
an obtaining unit 801, configured to obtain a preset initial audio quality score;
a fourth calculating unit 802, configured to determine the cumulative deduction for the abnormal conditions of the five audio quality parameters according to the calculated plosive amplitude-cut point ratio, average loudness, signal-to-noise ratio, noise energy, and spectrum upper limit frequency;
a fifth calculating unit 803, configured to calculate the difference between the initial audio quality score and the cumulative deduction, and use the difference as the audio quality score.
As shown in fig. 10, the audio quality evaluation system further includes:
a determining module 505, configured to determine an audio quality grade according to the audio quality score, and use the audio quality grade as the evaluation result.
The audio quality evaluation system provided by the embodiment of the invention transcodes the audio data input by a user to obtain a plurality of audio sampling point data, calculates five audio quality parameters of the audio sampling point data, namely the plosive amplitude-cut point proportion, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency, and calculates the audio quality score according to these audio quality parameters. Since a plurality of audio quality parameters are integrated to evaluate the audio quality, the evaluation result has strong universality and can meet the requirements of most application occasions.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, they are described in a relatively simple manner, and reference may be made to some descriptions of method embodiments for relevant points. The above-described system embodiments are merely illustrative, wherein the modules or units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. An audio quality evaluation method is characterized by comprising the following steps:
receiving audio data input by a user;
transcoding the audio data to obtain a plurality of audio sampling point data;
respectively calculating the proportion of the plosive amplitude-clipping points, the average loudness, the signal-to-noise ratio, the noise energy and the upper frequency limit of the frequency spectrum of the audio sampling point data, wherein the smaller the proportion of the plosive amplitude-clipping points is, the higher the audio quality is; and the average loudness is calculated by extracting envelopes layer by layer from the amplitude absolute values of the audio sampling points, the layer-by-layer extraction comprising two layers;
and calculating an audio quality score according to the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the upper frequency limit of the frequency spectrum.
2. The audio quality assessment method according to claim 1, wherein the calculating the upper frequency of the spectrum comprises:
performing framing processing on the audio sampling point data;
calculating the upper frequency limit of the frequency spectrum of each frame of audio data;
compiling a histogram of the number of occurrences of the upper limit frequency of the frequency spectrum over the full frequency band, and selecting from the histogram the frequency band range of a preset width in which the upper limit frequency of the frequency spectrum occurs most often;
determining a central frequency point of each histogram bin in the frequency band range with the preset width, and counting the number of occurrences of each central frequency point;
taking the numbers of occurrences as weighting coefficients, and calculating a weighted average value of the central frequency points;
and taking the weighted average as the upper frequency of the frequency spectrum of the audio data.
3. The audio quality assessment method according to claim 1, wherein said calculating an average loudness comprises:
extracting a layer of envelope from the amplitude absolute value of the audio sampling point data to obtain a layer of envelope audio amplitude absolute value data;
extracting a layer of envelope from the one-layer envelope audio amplitude absolute value data to obtain a two-layer envelope audio amplitude absolute value data;
calculating average amplitude values of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data;
and taking the average amplitude value as the average loudness.
4. The audio quality evaluation method according to claim 1, wherein the calculating an audio quality score according to the plosive intercept point proportion, the average loudness, the signal-to-noise ratio, the noise energy, and the spectral ceiling frequency comprises:
acquiring a preset initial audio quality score;
determining the cumulative deduction of the abnormal conditions of the five audio quality parameters according to the calculated proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the frequency spectrum upper limit frequency;
calculating the difference value between the initial audio quality score and the accumulated deduction;
and taking the difference value as the audio quality score.
5. The audio quality assessment method according to any one of claims 1 to 4, further comprising:
determining an audio quality grade according to the audio quality score;
and taking the audio quality grade as an evaluation result.
6. An audio quality evaluation system, comprising:
the receiving module is used for receiving audio data input by a user;
the transcoding module is used for transcoding the audio data to obtain a plurality of audio sampling point data;
the first calculation module is used for calculating the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the upper limit frequency of the frequency spectrum of the audio sampling point data respectively, wherein the smaller the proportion of the plosive amplitude-cutting points is, the higher the audio quality is; and the average loudness is calculated by extracting envelopes layer by layer from the amplitude absolute values of the audio sampling points, the layer-by-layer extraction comprising two layers;
and the second calculation module is used for calculating the audio quality score according to the proportion of the plosive amplitude-cutting points, the average loudness, the signal-to-noise ratio, the noise energy and the frequency spectrum upper limit frequency which are calculated by the first calculation module.
7. The audio quality evaluation system according to claim 6, wherein the first calculation module comprises a first calculation submodule for calculating an upper frequency of a spectrum; the first computation submodule includes:
the framing unit is used for framing the audio sampling point data;
a first calculation unit for calculating an upper limit frequency of a spectrum of each frame of audio data;
the first statistic unit is used for compiling a histogram of the number of occurrences of the upper limit frequency of the frequency spectrum over the full frequency band;
the selection unit is used for selecting, from the histogram compiled by the first statistic unit, the frequency band range of a preset width in which the upper limit frequency of the frequency spectrum occurs most often;
the determining unit is used for determining a central frequency point of each histogram bin in the frequency band range with the preset width;
a second counting unit configured to count the number of occurrences of the center frequency point determined by the determining unit;
and the second calculating unit is used for calculating the weighted average value of the central frequency point by taking the frequency of the occurrence of the central frequency point counted by the second counting unit as a weighting coefficient, and taking the weighted average value as the upper limit frequency of the frequency spectrum of the audio data.
8. The audio quality assessment system according to claim 6, wherein said first calculation module further comprises a second calculation submodule for calculating said average loudness; the second computation submodule includes:
the first extraction unit is used for extracting one layer of envelope from the amplitude absolute value of the audio sampling point data to obtain one-layer envelope audio amplitude absolute value data;
the second extraction unit is used for extracting one layer of envelope from the one-layer envelope audio amplitude absolute value data to obtain two-layer envelope audio amplitude absolute value data;
a third calculating unit configured to calculate an average amplitude value of the one-layer envelope audio amplitude absolute value data and the two-layer envelope audio amplitude absolute value data, and take the average amplitude value as the average loudness.
9. The audio quality assessment system according to claim 6, wherein said second calculation module comprises:
the acquisition unit is used for acquiring a preset initial audio quality score;
the fourth calculating unit is used for determining the accumulated deduction of the abnormal conditions of the five audio quality parameters according to the calculated plosive amplitude-interception point proportion, the average loudness, the signal-to-noise ratio, the noise energy and the spectrum upper limit frequency;
and the fifth calculating unit is used for calculating the difference value between the initial audio quality score and the accumulated deduction score and taking the difference value as the audio quality score.
10. The audio quality assessment system according to any one of claims 6 to 9, further comprising: and the determining module is used for determining the audio quality grade according to the audio quality score and taking the audio quality grade as an evaluation result.
CN201510091491.XA 2015-02-28 2015-02-28 Audio quality evaluation method and system Active CN105989853B (en)

Publications (2)

Publication Number Publication Date
CN105989853A (en) 2016-10-05
CN105989853B (en) 2020-08-18

Family

ID=57039274

Country Status (1)

Country Link
CN (1) CN105989853B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Yang Jiang

Inventor after: Zhang Jidong

Inventor after: Wu Weihao

Inventor before: Yang Jiang

Inventor before: Yang Bu

Inventor before: Wu Weihao

GR01 Patent grant
GR01 Patent grant