RU2757860C1

RU2757860C1 - Method for automatically assessing the quality of speech signals with low-rate coding

Info

Publication number: RU2757860C1
Application number: RU2021110011A
Authority: RU
Inventors: Виктор Алексеевич Аладинский; Сергей Владиславович Кузьминский; Павел Леонидович Смирнов
Original assignee: Общество с ограниченной ответственностью "Специальный Технологический Центр"
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2021-10-21

Abstract

FIELD: computer technology.

SUBSTANCE: the invention relates to computer technology for processing audio data. The effect is achieved by comparing the image (m, C) of an input digital stream (DS) carrying low-rate voice encoding (LRVE), formed in accordance with a known j-th protocol, with a single reference image (m_{j ref}, C_{j ref}) of the j-th class, j = 1, 2, …, J, obtained from a training sample with the maximum value e_{j max} of the speech signal quality estimate. The divergence ν_j is used as the measure of difference between them. Here m and m_{j ref} are the mathematical expectation vectors of the analyzed DS and of the j-th reference image, respectively, and C and C_{j ref} are their covariance matrices. Using the functional dependence e_j = ƒ(ν_j), formed at the training stage and described analytically by a power polynomial, and a known value of the divergence ν_j between the image (m, C) and the single reference image (m_{j ref}, C_{j ref}) of the j-th class, the quality estimate of the investigated DS with LRVE, formed according to the j-th protocol, is calculated without conversion to the PCM format.

EFFECT: automatic assessment of the quality of LRVE signals without converting the investigated digital stream (DS) to the pulse-code modulation (PCM) format, with a functional (analytical) relationship established between the divergence values and the selected speech signal quality measures.

1 cl, 7 dwg

Description

The invention relates to the field of automatic assessment of speech signal quality. It can be used in systems for monitoring the state of digital telephone radio links that transmit signals with low-rate voice encoding (LRVE), in vocoder development for assessing the quality of synthesized speech signals, and at the stage of source-data analysis in automatic recognition of speech messages transmitted over digital communication lines.

The claimed technical solution increases the efficiency of means of similar purpose when the original (comparison) signal is unavailable and the decoding of digital streams containing LRVE messages is excluded.

A method of machine assessment of speech transmission quality is known (see RF Patent No. 2435232, IPC G10L 15/14, publ. November 27, 2011, Bull. No. 33). In this method, the audio signal is loaded into the computer's RAM; fragments of the active and inactive phases are identified in the signal; spectra are calculated for each phase and divided into critical bands; spectral parameter values are calculated for each critical band in both the spectral and time domains; fragments of the active phase that correspond to tone dialing are excluded from processing before the division into critical bands; multilevel psychoacoustic filtering of the spectra is performed; the resulting parameters of the processed signal are compared with the associations stored in a database, and the associations closest to the processed signal in all parameters are selected; and the speech quality estimate is determined as the sum of the weighted degrees of proximity. The machine estimate of speech signal quality is thus obtained by comparing the parameters of the processed signal with the parameters of speech models stored in the association base.

A disadvantage of the analogue is the need to convert the signal under study into a digital stream (DS) with pulse-code modulation (PCM). For signals with low-rate coding, this conversion consists of decompressing the compressed DS in the receiving part of the vocoder, which, besides significant time and computational costs, inevitably introduces distortions in the synthesized PCM-format DS that grow as the quality of the radio channel deteriorates. Another disadvantage of the analogue is the finite number of associations (reference descriptions), to one of which the analyzed signal is assigned; this requires comparing the image of the input signal with all reference descriptions. Too few reference descriptions (for example, no more than three) will not allow speech signal quality to be assessed with the required accuracy, while an unjustified increase in their number multiplies the computational costs.

The closest to the claimed method is the method (prototype) for recognizing new low-rate speech coding protocols (see RF Patent No. 2667462, IPC G06K 9/00, H04B 1/06, publ. September 19, 2018, Bull. No. 26). In it, a digital information stream Y is received during a time interval ΔT; a normalized autocorrelation function A is formed from the received stream Y; a decision on the presence of a block structure in the digital information stream Y is made from the extrema of the autocorrelation function A that recur regularly at equal intervals Δτ; the digital information stream Y is divided, according to the intervals between the extrema of A, into information blocks of N_b bits each; the information blocks are assigned sequence numbers k = 1, 2, …, K, starting from the first block; a rectangular information matrix Y_{K×L}, L = N_b, is formed, whose rows are the information blocks placed one under another in the order of their numbers k = 1, 2, …, K; the columns of the matrix Y_{K×L} are extracted one by one; for each column of the information matrix Y_{K×L}, the mathematical expectation of the occurrence of particular pulses is calculated; a vector of the calculated mathematical expectation values is formed by placing the obtained values in the order of their column numbers; from the resulting vector m(0), a set M of mathematical expectation vectors is formed by successive circular shifts of its values by up to L - 1 positions; reference vectors of mathematical expectation values m_{j ref}, j = 1, 2, …, J, are formed for each digital information stream Y_{j ref} corresponding to the j-th known LRVE protocol; each mathematical expectation vector of the LRVE protocol being evaluated is compared in turn with the reference vectors m_{j ref}, j = 1, 2, …, J; the probability of correct recognition of the j-th LRVE protocol is calculated for each mathematical expectation vector; and the decision is made in favor of the j-th LRVE protocol that yields the maximum probability of correct recognition.
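The block-structure steps of the prototype can be sketched as follows. This is a minimal illustration rather than the patented procedure: it assumes NumPy, takes the lag of the single strongest autocorrelation peak as the block length (the text bases the decision on regularly spaced extrema), and assumes a right circular shift; the function names are hypothetical.

```python
import numpy as np

def normalized_autocorrelation(bits, max_lag):
    """Normalized autocorrelation of a bit stream for lags 1..max_lag."""
    x = 2.0 * np.asarray(bits, dtype=float) - 1.0   # map {0, 1} to {-1, +1}
    x = x - x.mean()
    energy = float(np.dot(x, x))                    # assumes a non-constant stream
    return np.array([np.dot(x[:-lag], x[lag:]) / energy
                     for lag in range(1, max_lag + 1)])

def estimate_block_length(bits, max_lag):
    """Simplification: the lag of the strongest peak is taken as N_b."""
    return int(np.argmax(normalized_autocorrelation(bits, max_lag))) + 1

def block_matrix(bits, block_len):
    """Stack consecutive blocks of block_len bits as rows of a K x L matrix."""
    bits = np.asarray(bits, dtype=np.uint8)
    k = bits.size // block_len                      # number of complete blocks
    return bits[: k * block_len].reshape(k, block_len)

def column_means(matrix):
    """Mathematical expectation of a 1 in each column (vector m(0))."""
    return matrix.mean(axis=0)

def circular_shifts(m0):
    """All L circular shifts of m0 (shift 0 .. L-1), one per row."""
    return np.stack([np.roll(m0, s) for s in range(m0.size)])
```

For a stream built by repeating an 8-bit pattern, `estimate_block_length` recovers the period 8, and `column_means(block_matrix(stream, 8))` returns the pattern itself.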

The prototype method improves the accuracy and speed of recognizing LRVE protocols under interference. The quality of speech signals with low-rate coding can then be judged indirectly from the probability of correct recognition of the investigated DS y and its corresponding reference image, provided that both images belong to the same class, that is, the known j-th LRVE protocol.

A disadvantage of the prototype is the absence of an exact correspondence between the measure of difference used between the DS y and the reference description and any measure of speech signal quality.

The aim of the claimed technical solution is to develop a method for automatically assessing the quality of signals with low-rate speech coding without converting the investigated DS y with LRVE to the PCM format, one that establishes a functional (analytical) relationship between the divergence values and the selected speech signal quality measures.

This aim is achieved in the known method for automatically assessing the quality of speech signals with low-rate coding, which consists in the following: a binary information digital stream y of N_DS bits is received during a time interval ΔT; a normalized autocorrelation function a is formed from y; a decision on the presence of a block structure in the information stream y is made from the extrema of the autocorrelation function a that recur regularly at equal intervals Δτ; the information DS y is divided, according to the intervals between the extrema of a, into information blocks of N_b bits each; the information blocks are assigned sequence numbers k = 1, 2, …, K, starting from the first block; a rectangular information matrix Y of size K × Z, Z = N_b, is formed, whose rows are the information blocks placed one under another in the order of their numbers k = 1, 2, …, K; the columns y_z, z = 1, 2, …, Z, are extracted from the matrix Y; the mathematical expectation (ME) value m_z is determined for each column y_z; the vector m of ME values is formed by placing the values m_z in sequence; a training sample {y_jw}_W, w = 1, 2, …, W, is formed, where W is the training sample size, consisting of a set of digital streams y_jw formed according to a given j-th LRVE protocol and corresponding to the maximum possible subjective estimate e_{j max} of speech signal quality; a reference DS y_{j ref} is formed by sequential concatenation of the digital streams y_jw of the training sample; a rectangular reference information matrix Y_{j ref} is formed, whose rows are taken in sequence from the reference DS y_{j ref} and placed one under another; the reference vector of ME values m_{j ref} is calculated from the reference matrix Y_{j ref}; the ME vector m of the LRVE stream being evaluated is compared in turn with the reference ME vectors m_{j ref}; a decision is made on the most probable class membership of the LRVE stream being evaluated; if it is decided that an unknown LRVE protocol was used to form the DS y, the speech signal quality assessment is terminated; otherwise, the covariance matrix C is calculated from the matrix Y; the image of the information DS y is formed as the pair (m, C); the square reference covariance matrix C_{j ref} is calculated; the reference image is formed as the pair (m_{j ref}, C_{j ref}); the reference DS y_{j ref} is distorted by introducing a fixed number of erroneous bits in proportion to the bit error probability P_err(g), g = 1, 2, …, G, where G is the number of bit error probability levels ranging from the minimum value P_err(1) to the maximum value P_err(G); a set of reference digital streams distorted by bit errors is formed; for each distorted DS a matrix is composed in the same way, and together these matrices form a set of distorted reference matrices; the corresponding vectors of ME values and covariance matrices are calculated; the images distorted by bit errors are composed; the reference image (m_{j ref}, C_{j ref}) is compared in turn with each of the G images distorted by bit errors by calculating the divergence ν_j(g) between them; a correspondence is established between the divergence values and the speech signal quality estimates obtained by experimental or experimental-analytical approaches; the correspondence between the two resulting sets of values is brought to a continuous functional dependence of the form e_j = ƒ(ν_j) by interpolation with a power polynomial; the image (m, C) of the investigated information DS y with LRVE is compared with the reference image (m_{j ref}, C_{j ref}) by calculating the divergence ν_j; the quality estimate e_j of the LRVE signal is obtained by evaluating the power polynomial e_j = ƒ(ν_j) at the previously calculated value ν_j; if ν_j < ν_j(1), the quality estimate of the LRVE signal corresponds to the maximum speech signal quality estimate e_{j max}; and if ν_j > ν_j(G), it corresponds to the minimum speech signal quality estimate e_{j min}.
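The image formation and bit-error distortion steps described above can be sketched as follows. This is an illustrative sketch only: it assumes NumPy, an unbiased covariance estimate (the text does not state the normalization), uniformly random error positions, and a fixed block length known in advance; the function names and the `seed` parameter are hypothetical.

```python
import numpy as np

def image_of_stream(bits, block_len):
    """Return the image (m, C) of a bit stream.

    m is the vector of column means of the K x Z block matrix,
    C is the Z x Z covariance matrix of its columns.
    """
    bits = np.asarray(bits, dtype=float)
    k = bits.size // block_len                      # number of complete blocks
    y = bits[: k * block_len].reshape(k, block_len)
    return y.mean(axis=0), np.cov(y, rowvar=False)

def flip_bits(bits, p_err, rng):
    """Invert a fixed number of bits, round(p_err * N), at random positions."""
    bits = np.asarray(bits, dtype=np.uint8).copy()
    n_err = int(round(p_err * bits.size))
    idx = rng.choice(bits.size, size=n_err, replace=False)
    bits[idx] ^= 1
    return bits

def distorted_images(ref_bits, block_len, p_levels, seed=0):
    """Images of the reference stream distorted at each of the G error levels."""
    rng = np.random.default_rng(seed)
    return [image_of_stream(flip_bits(ref_bits, p, rng), block_len)
            for p in p_levels]
```

Calling `distorted_images(ref_bits, block_len, p_levels)` with G values in `p_levels` yields G pairs (m, C), one per bit error level, ready to be compared with the undistorted reference image.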

Owing to the new set of essential features, the claimed method provides automatic assessment of the quality of speech signals with low-rate coding by comparing the image (m, C) of the input realization y with the reference image (m_{j ref}, C_{j ref}) using a measure of difference, namely the divergence ν_j.
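The text at this point does not spell out the divergence formula. One common choice, assumed here purely for illustration, is the symmetric Kullback divergence between two Gaussian models described by (m, C) pairs. The sketch below uses NumPy; the `ridge` term is an added assumption that keeps the covariance matrices invertible and is not part of the source.

```python
import numpy as np

def gaussian_divergence(m1, c1, m2, c2, ridge=1e-6):
    """Symmetric Kullback divergence between N(m1, C1) and N(m2, C2).

    J = 1/2 tr[(C1 - C2)(C2^-1 - C1^-1)]
      + 1/2 (m1 - m2)^T (C1^-1 + C2^-1) (m1 - m2)
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    d = m1.size
    # Assumption: a small diagonal load guards against singular covariances.
    c1 = np.asarray(c1, dtype=float) + ridge * np.eye(d)
    c2 = np.asarray(c2, dtype=float) + ridge * np.eye(d)
    c1_inv = np.linalg.inv(c1)
    c2_inv = np.linalg.inv(c2)
    delta = m1 - m2
    cov_term = 0.5 * np.trace((c1 - c2) @ (c2_inv - c1_inv))
    mean_term = 0.5 * delta @ (c1_inv + c2_inv) @ delta
    return float(cov_term + mean_term)
```

The measure is zero for identical images, symmetric in its arguments, and grows as the mean vectors or covariance matrices move apart, which is the behavior a measure of difference between an input image and a reference image needs.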

The claimed method is illustrated by the drawings, which show:

Fig. 1 - the procedure for forming the reference DS y_{j ref};

Fig. 2 - the procedure for forming the rectangular reference information matrix Y_{j ref};

Fig. 3 - the algorithm for assessing the quality of speech signals with low-rate coding;

Fig. 4 - the dependence of the speech sound intelligibility e_{zj} on the bit error probability P_err in speech signals with low-rate coding formed according to the j-th protocol LPC-10-2400 (STANAG 4197);

Fig. 5 - the tabular form of correspondence (3);

Fig. 6 - the results of forming correspondence (3) for the j-th protocol LPC-10-2400 (STANAG 4197) at G = 8;

Fig. 7 - the graph of the speech sound intelligibility e_{zj} versus the divergence ν_j between the investigated and reference images corresponding to the j-th protocol LPC-10-2400 (STANAG 4197).

Speech quality is a value that characterizes the subjective assessment of how speech sounds in the channel under test, on a five-point scale, in comparison with a reference channel. One important indicator of speech quality is intelligibility: the relative number of correctly received speech elements (sounds, syllables, words, phrases), expressed as a percentage of the total number of transmitted elements. Functional relationships have been established between the various speech quality indicators, which makes it possible to obtain the values of the required indicators, including subjective assessments of how speech sounds.

Speech signal quality is assessed when analyzing the properties of a speech signal and its source, and also to determine the effectiveness of a speech transmission system as a whole or the properties of its individual elements, including LRVE equipment (see GOST R 51061-97. Low-rate speech transmission systems over digital channels. Speech quality parameters and measurement methods. - Moscow: Gosstandart of Russia, 1997. - 24 p.). Subjective methods of speech quality assessment require groups of trained experts; automation in them is limited to data entry and statistical processing of articulation test results. Objective methods are based on analyzing parameters and characteristics of the process under study, which allows the assessment of speech signal quality to be fully automated. Known methods of automatic speech quality assessment determine the parameters of a speech signal represented in the PCM format and select (with some accuracy) the corresponding speech quality estimates. In this case, a psychophysical scale is constructed and described mathematically or graphically at the training stage, and this description is then used when examining received speech signals (see Mikhailov V.G., Zlatoustova L.V. Measurement of speech parameters / Ed. by M.A. Sapozhkov. - Moscow: Radio i Svyaz, 1987. - 168 p.; Shelukhin O.I., Lukyantsev N.F. Digital processing and transmission of speech / Ed. by O.I. Shelukhin. - Moscow: Radio i Svyaz, 2000. - 456 p.).

When studying the quality of speech signals transmitted through HF/VHF radio communication systems using LRVE, normal acoustic conditions during speech signal formation are assumed in accordance with GOST 7153 (see GOST R 51061-97. Low-rate speech transmission systems over digital channels. Speech quality parameters and measurement methods. - Moscow: Gosstandart of Russia, 1997. - 24 p.). The main attention is paid to the quality of the signals formed at the decoder output. The results are presented as the dependence of the speech signal quality estimate on the bit error probability P_err in the DS transmitted through the communication channel.

Thus, automatic assessment of the quality of speech signals with low-rate coding transmitted by radio communication means reduces computational costs, which determines the need to solve this technical problem.

The positive effect of the proposed method is achieved by comparing the image (m, C) of the input DS y with LRVE, formed in accordance with a known j-th protocol, with the single reference image (m_{j ref}, C_{j ref}) of the j-th class, obtained from a training sample with the maximum speech signal quality estimate e_{j max}, using the divergence as the measure of difference between them.

The functional dependence $e_j = f(\nu_j)$ is formed at the training stage and described analytically by a power polynomial. Given the divergence $\nu_j$ between the image $(m, C)$ of the DS $y$ under study and the single reference image $(m_{j,\text{ref}}, C_{j,\text{ref}})$ of the $j$-th class, this dependence yields the quality estimate of the DS $y$ with NSCR, formed according to the $j$-th protocol, without conversion to the pulse-code modulation (PCM) format.

The claimed method can be implemented as follows (see Fig. 3). Before the initial data are entered, it is advisable to determine the parameters $\Delta T$ and $N_m$ of the information DS $y$; to set the duration of its analysis interval, defined by $K$, the number of rows in the rectangular information matrix $Y$; to set the set of values $\{N_b\}$; to determine the value of $J$; to select the value of $G$; and to form the corresponding number of training samples $\{y_{jw}\}_W$.

Next, at the preparatory stage (the "Training" mode), the training samples $\{y_{jw}\}$ ($j = 1, 2, \ldots, J$) are used to form the reference digital streams $\{y_{j,\text{ref}}\}_J$ by sequentially concatenating the digital streams $\{y_{jw}\}$ (see Fig. 1). These streams are produced according to the $j$-th NSCR protocols and have the maximum speech signal quality estimate $e_{j\max}$. The reference matrices $\{Y_{j,\text{ref}}\}_J$ are then composed, together with the corresponding reference descriptions of the $J$ known NSCR protocols of the form $(m_{j,\text{ref}}, C_{j,\text{ref}})$ (see Fig. 2).
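The concatenation and matrix-forming step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `to_block_matrix` and `build_reference_matrix` are hypothetical, and the sketch assumes that every training stream is already block-aligned and that the block length `n_b` is known.

```python
import numpy as np

def to_block_matrix(bits, n_b):
    """Arrange a flat bit sequence into a K x Z matrix with Z = n_b columns.

    Trailing bits that do not fill a whole block are dropped (an assumption).
    """
    bits = np.asarray(bits, dtype=np.uint8)
    k = bits.size // n_b                      # number of complete blocks (rows)
    return bits[: k * n_b].reshape(k, n_b)

def build_reference_matrix(training_streams, n_b):
    """Concatenate the training streams into one reference stream and
    return it together with the reference matrix built from it."""
    y_ref = np.concatenate(
        [np.asarray(s, dtype=np.uint8) for s in training_streams]
    )
    return y_ref, to_block_matrix(y_ref, n_b)
```

For example, two 4-bit training streams with `n_b = 4` give a reference matrix with two rows and four columns.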

The range of the bit error probability values $P_{\text{err}}(g)$, $g = 1, 2, \ldots, G$, is set from the minimum value $P_{\text{err}}(1)$, which corresponds to the smallest perceptible degradation of the speech message, to the maximum value $P_{\text{err}}(G)$, at which the speech message can no longer be recovered. The number $G$ of bit error probability levels determines the accuracy of the psychophysical scale and, as a consequence, the accuracy of the quality assessment of speech signals with NSCR. At the same time, with the experimental or experimental-analytical approach to quality assessment, a sufficient number of levels, $6 \le G \le 10$, must be chosen, for example by the uniform approximation method, which minimizes the largest absolute interpolation error (see Korn G., Korn T. Handbook of Mathematics for Scientists and Engineers: Transl. from English. Moscow: Nauka, 1970. 720 p.).

In accordance with the values $P_{\text{err}}(g)$ of the bit error probability, a fixed number $n_{\text{err}}$ of bit errors is introduced into the reference DS $y_{j,\text{ref}}$, determined by the expression

$$n_{\text{err}}(g) = P_{\text{err}}(g)\, N_{\text{ref}}, \tag{1}$$

where $N_{\text{ref}}$ is the number of symbols (bits) in the reference DS $y_{j,\text{ref}}$.

$G$ digital error streams $\varepsilon(g)$, $g = 1, 2, \ldots, G$, of $N_{\text{ref}}$ bits each are formed, in which the erroneous symbols take the value "1" and all other symbols the value "0". The erroneous symbols are distributed equiprobably within each error stream. $G$ digital streams $\tilde{y}_{j,\text{ref}}(g)$ distorted by bit errors are then formed by element-wise modulo-2 addition of the reference DS $y_{j,\text{ref}}$ with each of the $G$ error streams:

$$\tilde{y}_{j,\text{ref}}(g) = y_{j,\text{ref}} \oplus \varepsilon(g), \tag{2}$$

where $\oplus$ is the modulo-2 addition operation.
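The error-injection step described by expressions (1) and (2) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the number of errors is rounded to the nearest integer (the source does not state the rounding rule), and "equiprobable" is read as error positions drawn uniformly without repetition.

```python
import numpy as np

def make_error_stream(n_ref, p_err, rng):
    """Build an error stream of n_ref bits with exactly round(p_err * n_ref) ones."""
    n_err = int(round(p_err * n_ref))          # fixed number of bit errors
    e = np.zeros(n_ref, dtype=np.uint8)
    pos = rng.choice(n_ref, size=n_err, replace=False)  # uniform error positions
    e[pos] = 1
    return e

def distort(y_ref, p_err_levels, seed=0):
    """Return one distorted copy of y_ref per bit error probability level."""
    rng = np.random.default_rng(seed)
    y_ref = np.asarray(y_ref, dtype=np.uint8)
    return [
        np.bitwise_xor(y_ref, make_error_stream(y_ref.size, p, rng))  # modulo-2 sum
        for p in p_err_levels
    ]
```

Fixing the seed makes the distorted streams reproducible between training runs, which is a convenience of this sketch rather than a requirement stated in the source.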

From the digital streams distorted by bit errors, a set of $G$ distorted reference matrices is formed. The corresponding mean (MO) vectors and covariance matrices are then calculated (see Aladinsky V.A., Kuzminsky S.V. Method for forming features for recognizing low-rate speech coding protocols // Naukoemkie Tekhnologii. Moscow: Radiotekhnika. No. 12, 2015. P. 20-25; RF Patent No. 2667462, IPC G10L 19/008, H03M 13/03, publ. 19.09.2018, Bull. 26). From these, the information images distorted by bit errors are composed. The reference image $(m_{j,\text{ref}}, C_{j,\text{ref}})$ is then compared in turn with each of the distorted images by calculating the divergence $\nu_j(g)$ between them according to expression (3) (not reproduced here), where $\operatorname{tr} A_j = \sum_{n=1}^{N} a_{nn}(j)$ and $\operatorname{tr} B_j = \sum_{n=1}^{N} b_{nn}(j)$ are the traces of the matrices $A_j$, $B_j$ of dimension $N$; $a_{nn}(j)$, $b_{nn}(j)$ are the corresponding diagonal elements of the matrices $A_j$, $B_j$; and $(\cdot)^T$ is the transposition operation.
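The exact form of expression (3) is not available in this text. As an illustration only, the sketch below assumes the Kullback (symmetric) divergence between two Gaussian descriptions, which is consistent with the two trace terms and the transposition mentioned above but is not confirmed by the source. The function name `divergence` is hypothetical, and the pseudo-inverse is used as a precaution because covariance matrices of binary columns may be singular.

```python
import numpy as np

def divergence(m1, c1, m2, c2):
    """Symmetric Kullback divergence between images (m1, c1) and (m2, c2).

    ASSUMPTION: this form is a stand-in for expression (3), not taken from the source.
    """
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    c1, c2 = np.asarray(c1, dtype=float), np.asarray(c2, dtype=float)
    c1_inv, c2_inv = np.linalg.pinv(c1), np.linalg.pinv(c2)
    d = (m1 - m2).reshape(-1, 1)               # column vector of mean differences
    a = (c1 - c2) @ (c2_inv - c1_inv)          # covariance-mismatch term
    b = (c1_inv + c2_inv) @ (d @ d.T)          # mean-mismatch term
    return 0.5 * np.trace(a) + 0.5 * np.trace(b)
```

Under this assumption the divergence is zero for identical images and grows as the means or covariances move apart, which matches the role the text assigns to it.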

Using the experimental (direct) or the experimental-analytical (indirect) approach, a correspondence is established between the speech quality estimates $e_j(g)$ and the divergence values $\nu_j(g)$:

$$e_j(g) \leftrightarrow \nu_j(g), \quad g = 1, 2, \ldots, G. \tag{4}$$

In the experimental approach, the correspondence (4) is formed by dividing each distorted reference matrix into its elements, decoding the resulting digital streams to the PCM format with the synthesizer of the receiving part of the vocoder, and assessing the quality of these speech signals, which have been distorted in the communication channel. The assessment uses any known subjective or objective method, for example PESQ (see Recommendation ITU-T P.862. Perceptual Evaluation of Speech Quality. Geneva, 2001. 30 p.).

The experimental-analytical (indirect) approach is used when a priori information is available on the functional dependence $e_j = f_j(P_{\text{err}})$ of the quality estimates of speech signals, synthesized in the receiving part of the vocoder according to the $j$-th NSCR protocol, on the bit error probability $P_{\text{err}}$. Such dependences are given in standards (see Recommendation ITU-R F.1112-1. Digitized speech transmissions for systems operating below about 30 MHz. Radiocommunication Study Group 8, Question ITU-R 164/9. 1995. 15 p.) or published by developers (see Babkin V.V. Error protection and packet loss interpolation in low-rate speech codecs // Elektrosvyaz, No. 11, 2009. P. 47-49). The dependence $e_j = f_j(P_{\text{err}})$ can be presented in graphical or analytical form, which yields $G$ values $e_j(g)$ of the speech quality estimate at the known values $P_{\text{err}}(g)$.

The correspondence (4), which consists of $G$ matched pairs of divergence values and speech quality estimates, is then interpolated by a polynomial of the lowest degree. The result, obtained by one of the known methods, is an interpolation formula of degree at most $(G-1)$ of the form

$$e_j = a_{0j} + \sum_{g=1}^{G-1} a_{gj}\, \nu_j^{\,g}, \tag{5}$$

where $a_{0j}$ is the constant term and $a_{gj}$ is the coefficient of the $g$-th power term of the polynomial for the $j$-th NSCR protocol.
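The interpolation of the $G$ matched pairs by a polynomial of degree at most $G-1$ can be sketched with NumPy as follows. The function names are hypothetical, and the sample values in the usage note are made up for illustration; they are not data from the patent. Coefficients are returned lowest power first, so the first entry plays the role of the constant term $a_{0j}$.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_quality_polynomial(nu, e):
    """Fit the polynomial of degree G-1 through G (divergence, quality) pairs.

    Returns coefficients ordered from the constant term upward.
    """
    nu = np.asarray(nu, dtype=float)
    e = np.asarray(e, dtype=float)
    return P.polyfit(nu, e, deg=nu.size - 1)

def evaluate_quality(coeffs, nu):
    """Evaluate the fitted polynomial at one or more divergence values."""
    return P.polyval(nu, coeffs)
```

With the illustrative pairs `nu = [0, 1, 2]` and `e = [1, 3, 7]`, the fit recovers the polynomial 1 + ν + ν². For larger `G` the fit can become ill-conditioned, so rescaling the divergence values before fitting may be worth considering.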

The results are recorded and the message "Training completed" is displayed.

In the "Speech signal quality assessment" mode, the matrix $Y$ is treated as a system of random variables (the approach also applies to the similar matrices $Y_{j,\text{ref}}$ and to the distorted reference matrices). The current symbol $y_{kz}(i)$, where $i = 1, 2, \ldots, I$ is the index of the element (symbol) of the alphabet of size $I$, is a discrete random variable with alphabet size $I = 2$ that takes the value 1 for $i = 1$ or 0 for $i = 2$.

The numerical characteristics of the system of random variables $Y = \{y_1, y_2, \ldots, y_n, \ldots, y_z, \ldots, y_Z\}$ are determined, where $y_z = \{y_{1z}, y_{2z}, \ldots, y_{kz}, \ldots, y_{Kz}\}$ is a column of the matrix $Y$ under study. These characteristics are the mean (MO) vector $m$ and the covariance matrix $C$.

The mean $m_z$ of the column $y_z$, which consists of $K$ binary symbols $y_{kz}$, is calculated by the following formula (see Ventzel E.S., Ovcharov L.A. Probability Theory and Its Engineering Applications. Moscow: Nauka, 1988. 480 p. ISBN 5-02-013748-0):

$$m_z = \sum_{i=1}^{I} y_{kz}(i)\, p_z(i), \tag{6}$$

where $p_z(i)$ is the probability of the $i$-th value $y_{kz}(i)$ occurring in the column $y_z$.

Since $y_{kz}(2) = 0$, it follows that $m_z = 1 \cdot p_z(1) + 0 \cdot p_z(2) = p_z(1)$. The probability $p_z(1)$ of the value 1 occurring in $y_z$ is calculated by the formula

$$p_z(1) = \frac{S_z(1)}{K}, \tag{7}$$

where $S_z(1)$ is the number of symbols $y_{kz}$ equal to 1 in $y_z$.

The mean values are calculated in sequence over the columns $y_z$ of the matrix $Y$, forming the set

$$m = (m_1, m_2, \ldots, m_z, \ldots, m_Z). \tag{8}$$

The covariance matrix $C$ of dimension $Z$ is determined; its elements are the covariance coefficients

$$c_{nz} = M\left[\mathring{y}_n\, \mathring{y}_z\right], \tag{9}$$

where $M[\cdot]$ is the mathematical operation of computing the expectation, and $\mathring{y}_n$, $\mathring{y}_z$ are columns (vectors) containing the centered random variables:

$$\mathring{y}_n = y_n - \mathbf{m}_n, \quad \mathring{y}_z = y_z - \mathbf{m}_z; \tag{10}$$

$n$, $z$ are the indices of the columns $y_n$ and $y_z$ of $Y$; $\mathbf{m}_n$, $\mathbf{m}_z$ are columns (vectors) of dimension $K$ whose entries all equal $m_n$, $m_z$ respectively.
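The computation of the image $(m, C)$ from a binary information matrix can be sketched as follows. The function name `image_of_stream` is hypothetical. The sketch assumes that the expectation in the covariance is estimated as an average over the $K$ rows (normalization by $K$ rather than $K-1$), which the source does not state explicitly.

```python
import numpy as np

def image_of_stream(Y):
    """Return the mean vector m and covariance matrix C of a K x Z binary matrix."""
    Y = np.asarray(Y, dtype=float)
    k = Y.shape[0]
    m = Y.sum(axis=0) / k        # m_z = p_z(1) = S_z(1) / K for binary symbols
    Yc = Y - m                   # centered columns
    C = (Yc.T @ Yc) / k          # c_nz averaged over the K rows (assumed normalization)
    return m, C
```

For a matrix with rows `[1, 0]`, `[0, 0]`, `[1, 1]`, `[0, 1]`, both column means are 0.5, both variances are 0.25, and the covariance between the two columns is 0.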

The image of the input information DS $y$ with NSCR is composed; it is described by the set $(m, C)$.

Hypotheses are tested that the input information DS $y$ was formed according to one of the $J$ specified (known) NSCR protocols. If one of the hypotheses is confirmed, the $j$-th NSCR protocol is taken to have been used in forming the DS $y$. Otherwise the message "NSCR protocol: not identified, quality estimate: none" is displayed and the quality assessment of the speech signal is stopped.

The divergence $\nu_j$ between the image of the DS $y$, given by the set $(m, C)$, and the reference image $(m_{j,\text{ref}}, C_{j,\text{ref}})$ is calculated by formula (3).

If the condition $\nu_j < \nu_j(1)$ holds, the quality estimate of the signal with NSCR is taken to equal the maximum speech signal quality estimate $e_{j\max}$. If the condition $\nu_j > \nu_j(G)$ holds, the quality estimate is taken to equal the minimum value $e_{j\min}$. If neither condition holds, the quality estimate $e_j$ is calculated from the found divergence $\nu_j$ according to (5).
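The decision rule above can be sketched as a clamped polynomial evaluation. The function and parameter names are hypothetical; `coeffs` is assumed to hold the polynomial coefficients of (5) ordered from the constant term upward, and `nu_first`, `nu_last` stand for $\nu_j(1)$ and $\nu_j(G)$.

```python
from numpy.polynomial import polynomial as P

def estimate_quality(nu, coeffs, nu_first, nu_last, e_max, e_min):
    """Map a divergence value to a quality estimate with saturation at both ends."""
    if nu < nu_first:            # closer to the reference than the mildest distortion
        return e_max
    if nu > nu_last:             # farther than the strongest distortion
        return e_min
    return float(P.polyval(nu, coeffs))   # inside the trained range: use the polynomial
```

Clamping keeps the polynomial from being evaluated outside the range of divergence values seen during training, where a high-degree interpolant can behave erratically.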

The message "NSCR protocol: $j$-th, quality estimate: $e_j$" is displayed. The quality assessment of the low-rate-coded signal is then complete.

The claimed method for automatic quality assessment of speech signals with low-rate coding was simulated on signals formed according to the well-known NSCR protocol LPC-10-2400 (STANAG 4197), which is widely used on high-frequency radio links. From the available training sample $\{y_{jw}\}_W$, the reference DS $y_{j,\text{ref}}$ was formed, and the reference matrix $Y_{j,\text{ref}}$ and the corresponding reference description of the form $(m_{j,\text{ref}}, C_{j,\text{ref}})$ were composed.

Data are currently available on the dependence of the speech sound intelligibility values $e_{sj}$ on the bit error probability $P_{\text{err}}$ in low-rate-coded speech signals formed according to the LPC-10-2400 protocol (STANAG 4197) (see Recommendation ITU-R F.1112-1. Digitized speech transmissions for systems operating below about 30 MHz. Radiocommunication Study Group 8, Question ITU-R 164/9. 1995. 15 p.). These data are given in graphical form (see Fig. 4). In the "Training" mode, the experimental-analytical (indirect) approach was implemented for $G = 8$, which yielded the speech quality estimates, in percent, at the selected bit error probability values (see Fig. 4).

To form a correspondence of the form (4), digital streams distorted by bit errors were formed from the reference DS $y_{j,\text{ref}}$ on the basis of expressions (1) and (2) with $G = 8$, together with the corresponding matrices. From these matrices, the distorted images were composed, and the corresponding divergence values between the reference and distorted images were calculated by expression (3).

The obtained correspondence (4), represented by rows 2 and 3 (see Fig. 4), was interpolated. The calculations yielded the interpolation formula (11) (not reproduced here), in which the terms with powers higher than 3 are omitted because the corresponding values are small. This indirectly indicates that the chosen value $G = 8$ is redundant. The dependence (11) is shown graphically in Fig. 7. The interpolation formula allows the quality of speech signals with low-rate coding according to the LPC-10-2400 protocol (STANAG 4197) to be estimated with an accuracy no worse than 5%. This was established by examining digital streams formed according to the LPC-10-2400 protocol (STANAG 4197) with known values $e_{sj}$ of speech signal quality.

Claims

A method for automatically assessing the quality of speech signals with low-rate coding, comprising the steps of: receiving a binary information digital stream (DS) $y$ of $N_m$ bits during a time interval $\Delta T$; forming from it a normalized autocorrelation function $a$ and deciding whether a block structure is present in the information stream $y$ from the regular extrema of the autocorrelation function $a$ spaced at equal intervals $\Delta\tau$; dividing the information DS $y$ into information blocks of $N_b$ bits each according to the intervals between the extrema of the autocorrelation function $a$; assigning the information blocks consecutive serial numbers $k = 1, 2, \ldots, K$, starting from the first information block; forming a rectangular information matrix $Y$ of size $K \times Z$, $Z = N_b$, whose rows are the information blocks placed one under another in the order of their serial numbers $k = 1, 2, \ldots, K$; extracting from the matrix $Y$ the columns $y_z$, $z = 1, 2, \ldots, Z$; determining the mathematical expectation (MO) $m_z$ of each column $y_z$; forming the MO vector $m = (m_1, m_2, \ldots, m_z, \ldots, m_Z)$ by placing the values $m_z$ in sequence; forming a training sample $\{y_{jw}\}_W$, $w = 1, 2, \ldots, W$, where $W$ is the size of the training sample, which consists of a set of digital streams $y_{jw}$ formed according to a given $j$-th NSCR protocol and corresponding to the maximum possible subjective estimate $e_{j\max}$ of speech signal quality; forming a reference DS $y_{j,\text{ref}}$ by sequentially concatenating the digital streams $y_{jw}$ from the training sample; forming a rectangular reference information matrix $Y_{j,\text{ref}}$ whose rows are taken in sequence from the DS $y_{j,\text{ref}}$ and placed one under another; calculating from the reference matrix $Y_{j,\text{ref}}$ the reference MO vector $m_{j,\text{ref}}$; sequentially comparing the MO vector $m$ of the low-rate speech coding (NSCR) stream under assessment with the reference MO vectors $m_{j,\text{ref}}$; deciding on the most probable membership of the NSCR stream under assessment; stopping the assessment of speech signal quality if it is decided that an unknown NSCR protocol was used in forming the DS $y$; otherwise calculating the covariance matrix $C$ from the matrix $Y$; forming the image of the information DS $y$ as the set $(m, C)$; calculating the square reference covariance matrix $C_{j,\text{ref}}$; forming the reference image as the set $(m_{j,\text{ref}}, C_{j,\text{ref}})$; distorting the reference DS $y_{j,\text{ref}}$ by introducing a fixed number of erroneous bits in proportion to the bit error probability $P_{\text{err}}(g)$, $g = 1, 2, \ldots, G$, where $G$ is the number of bit error probability levels in the range from the minimum value $P_{\text{err}}(1)$ to the maximum value $P_{\text{err}}(G)$; forming a set of $G$ reference digital streams distorted by bit errors; composing in the same way a matrix for each distorted DS, these matrices together forming a set of distorted reference matrices; calculating the corresponding MO vectors and covariance matrices; composing the images distorted by bit errors; sequentially comparing the reference image $(m_{j,\text{ref}}, C_{j,\text{ref}})$ with each of the $G$ images distorted by bit errors by calculating the divergence $\nu_j(g)$ between them; establishing a correspondence between the divergence values and the speech quality estimates and reducing the matched values to a continuous functional dependence of the form $e_j = f(\nu_j)$ by power-polynomial interpolation; comparing the image $(m, C)$ of the information DS $y$ with NSCR under study with the reference image $(m_{j,\text{ref}}, C_{j,\text{ref}})$ by calculating the divergence $\nu_j$; and obtaining the quality estimate $e_j$ of the signal with NSCR by evaluating the power polynomial of the form $e_j = f(\nu_j)$ with the computed value $\nu_j$ substituted; wherein, when the condition $\nu_j < \nu_j(1)$ holds, the quality estimate of the signal with NSCR equals the maximum speech signal quality estimate $e_{j\max}$, and when the condition $\nu_j > \nu_j(G)$ holds, the quality estimate of the signal with NSCR equals the minimum speech signal quality estimate $e_{j\min}$.