RU2765108C1

RU2765108C1 - Method and hardware-software complex for pre-medical preliminary classifying multifactorial assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population

Info

Publication number: RU2765108C1
Application number: RU2021114556A
Authority: RU
Inventors: Валерий Степанович Сироткин; Владимир Владимирович Ханыков
Original assignee: Общество С Ограниченной Ответственностью "Центр Коррекции Слуха И Речи "Мелфон" (Ооо "Цкср "Мелфон")
Priority date: 2021-05-24
Filing date: 2021-05-24
Publication date: 2022-01-25

Abstract

FIELD: medicine.

SUBSTANCE: the invention relates to means and methods for automated execution of diagnostic procedures, specifically a pre-medical preliminary classifying multifactorial assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population. A method for such an assessment is proposed, performed using a computing device connected to audio signal playback devices and comprising stages in which: using the computing device, a primary test speech sequence (TSS) is formed, consisting of sentences of a first number of words based on a matrix test; a competing noise sound is formed for the primary TSS; the primary TSS is reproduced using the audio signal playback devices, made in the form of air and bone conduction headphones, the TSS being reproduced simultaneously with the competing noise sound at a first signal-to-noise ratio using speech simulation based on a deep machine learning model; an oral response is received from the user; the user's oral response is automatically analyzed for recognition of the TSS by converting it into text form and checking the correctness of the response using a machine learning model; on the basis of this analysis of the user's oral responses, the complexity of the assessment is changed dynamically, so that after each automatic analysis the number of words in the sentences forming the TSS and/or the signal-to-noise ratio of the reproduced signal is changed; the capabilities of the user's auditory analyzer are assessed on the basis of the responses to the reproduced test speech sequence.

EFFECT: the invention provides an automated pre-medical preliminary classifying assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population.

6 cl, 14 dwg, 3 tbl

Description

FIELD OF TECHNOLOGY

This technical solution relates to means and methods for automated execution of diagnostic procedures, specifically a pre-medical preliminary classifying multifactorial assessment of the capabilities of a person's auditory analyzer during mass preventive examinations of the population.

BACKGROUND OF THE INVENTION

The main characteristic of any speech transmission channel, including the auditory tract of human speech perception, is speech intelligibility. In technical communication systems this characteristic is determined by a statistical method involving a large number of listeners and speakers.

Speech intelligibility is understood as the relative or percentage number of received (understood) speech elements out of the total number transmitted over the communication channel. The elements of speech are syllables, sounds, words, phrases and numbers. Correspondingly, syllable, sound, word, semantic and numerical intelligibility are distinguished. To measure intelligibility, articulation tables of syllables, sound combinations and words have been developed, taking into account their frequency of occurrence in Russian speech.
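The definition above (understood elements as a percentage of transmitted elements) can be illustrated with a minimal sketch. The function name and the position-by-position comparison are assumptions made for illustration; the source does not prescribe a scoring procedure.

```python
def intelligibility_percent(transmitted, received):
    """Percentage of transmitted speech elements that were received correctly.

    Elements are compared position by position (an assumption); elements
    missing from `received` are counted as not understood.
    """
    if not transmitted:
        raise ValueError("transmitted must contain at least one element")
    correct = sum(1 for sent, got in zip(transmitted, received) if sent == got)
    return 100.0 * correct / len(transmitted)


# Hypothetical syllable list: one of four syllables is misheard.
score = intelligibility_percent(["ba", "ko", "mi", "su"], ["ba", "go", "mi", "su"])
```

The same function applies to any element type named in the text (syllables, sounds, words, phrases, numbers), which is why the elements are passed as plain lists.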

The term "social adequacy of hearing" refers to a person's ability to perceive sound stimuli of varying complexity (including speech) and to take part in a dialogue. People whose hearing is below the "socially adequate" level have difficulty communicating with others and run into problems at work and at home. The so-called stigma of hearing loss makes people hide their communication problem for decades!

The auditory function is studied using two groups of methods.

Subjective (psychoacoustic):

- hearing test using speech in noise;

- hearing test using tuning forks;

- subjective audiometry.

Objective:

- objective (computer) audiometry;

- acoustic reflexometry;

- tympanometry;

- otoacoustic emission;

- unconditioned reflex reactions;

- conditioned responses to sound.

In all subjective methods of hearing testing, the subject himself judges whether or not he hears the sound and reports this to the specialist in some way.

In objective methods of examination, the results do not depend on the patient's will; in most cases they are recorded using special equipment.

Unfortunately, almost all types of hearing diagnostics, except direct assessment of speech intelligibility, report their results in specialized terms (decibels, audiograms, curve peak numbers, etc.) and do not give the patient objective information about his "real degree of social adequacy". This multitude of scientific terms does not give a person a direct answer to how well or how badly he hears and understands the "ordinary" speech of an interlocutor in the real noise that surrounds him every day. The numerous results of modern high-precision examinations are needed by specialists; a person with a hearing impairment needs to know only one thing: how well he understands an interlocutor's speech under ordinary conditions and whether he needs to seek further medical help.

The simplest and most accessible method is testing hearing with speech in a noise signal. The advantage of this method is that it corresponds to the main role of the auditory function in humans, which is to serve as a means of verbal communication.

Whispered and loud speech are used when testing hearing with speech. Neither concept implies a precise dosage of sound intensity and pitch; nevertheless, some indicators characterizing the dynamic (intensity) and frequency characteristics of whispered and loud speech do exist.

When hearing is tested with speech, all speech material is pronounced on reserve air (inhale, exhale, speak). This helps equalize the loudness of all the speech material presented across different persons.

An important point in hearing testing is the "masking" of the ear that is not being tested. There are several ways to mask it: put cotton wool with petroleum jelly into the ear canal, insert a finger moistened with water into the ear canal, press the tragus into the ear canal, or rub the back of the hand covering the ear with the other hand!

The main advantage of testing hearing with speech is its "physiological comprehensibility" for the subject. The main obstacles to its widespread use are:

- the impossibility of ensuring reproducible results, both between different examiners and for the same examiner at different times;

- the relatively long duration and labor intensity of the tests;

- the need for the test to be performed by specially trained medical personnel.

Various approaches to assessing speech intelligibility are known from the prior art. The DIRAC software (htip://asm-tm.ru/7841-izmerenie-razborchivosti-rechi-v-po-dirac.html) is known, which makes it possible to evaluate the acoustic environment of a room in terms of speech intelligibility inside it. However, this approach is not applicable to testing the auditory capabilities of patients.

A method for diagnosing the level of hearing is known (patent RU2467691 C1, 27.11.2012), which uses the speech table of V. Voyachek, recorded in the memory of a digital device in the mp3 audio format. After the patient repeats the words, the percentage of correctly repeated words out of the total number of words in the table is determined. The keyboard of the device is used to record the patient's response at sound signal volumes of 10, 20 and 30% of the maximum headphone power, respectively. A decrease in the percentage of speech intelligibility is then detected. If the percentage of speech intelligibility falls below 95% at any of the headphone power levels, hearing loss is considered to be present.
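The decision rule of this prior-art method can be sketched as follows. This is only an illustration of the rule as described in the paragraph above (three volume levels, a 95% threshold); the function name, the data layout and the sample numbers are assumptions and are not taken from patent RU2467691.

```python
def hearing_loss_suspected(results, threshold_percent=95.0):
    """Apply the 95% rule to per-volume word-repetition results.

    `results` maps a playback volume (percent of maximum headphone power)
    to a pair (correctly repeated words, total words in the table).
    Returns True if intelligibility is below the threshold at any volume.
    """
    for volume, (correct, total) in results.items():
        if total <= 0:
            raise ValueError(f"no words presented at volume {volume}%")
        if 100.0 * correct / total < threshold_percent:
            return True
    return False


# Hypothetical session: 30-word table played at 10, 20 and 30% volume.
session = {10: (28, 30), 20: (30, 30), 30: (30, 30)}
flagged = hearing_loss_suspected(session)
```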

The disadvantage of this solution is the insufficient accuracy of assessing the patient's speech intelligibility, which stems from the lack of automated recognition of the patient's answers, that is, conversion of the answers from voice to text so that the correctness of the words heard can be analyzed using artificial intelligence.

As the state of the art advances in technologies used for diagnostic purposes of various kinds, including testing hearing quality, speech recognition and audiometry, there is an increasingly pressing need to automate all the key functions so that they are available to the end user without resorting to the services of specialists, and to simplify obtaining a primary picture of the state of the hearing organs. This helps solve the problem of providing assistance to the population in regions where there is no possibility of consulting a specialist at all.

The 2021 WHO report (https://www.who.int/ru/news-room/fact-sheets/detail/deafness-and-hearing-loss) notes:

“More than 5% of the world's population, or 430 million people, need rehabilitation to address 'disabling' hearing loss (432 million adults and 34 million children). It is estimated that by 2050 more than 700 million people, or one in ten, will have disabling hearing loss.

'Disabling' hearing loss is hearing loss in the better-hearing ear that exceeds 35 decibels (dB). Nearly 80% of these people live in low- and middle-income countries. Hearing loss is more common among older people: more than 25% of people over the age of 60 are affected.

Effective measures to reduce hearing loss at different stages of a person's life include the following:

Early detection of hearing loss and ear disease is critical to effective patient management.

This requires systematic screening to identify ear diseases and associated hearing loss among the following categories of people most at risk:

- newborns and infants;

- children of preschool and school age;

- people exposed to noise or chemicals at work;

- people taking ototoxic drugs;

- elderly people.”

WHO experts have proposed a recommended scheme for conducting mass preventive examinations. The procedure is based on sequential hearing testing in three ways: whispered speech, screening air conduction audiometry, or assessment of speech intelligibility by recognition of two-digit numbers in noise.

Whispered-speech testing requires a specially trained person capable of maintaining the same loudness when pronouncing the test words over a long time. In mass testing, when the number of subjects is around 30-70 people per session, this is ruled out. The results of such testing obtained by different specialists are often difficult to compare.

The labor intensity of the process is unacceptably high for practical use.

Screening audiometry in a limited frequency range (500 Hz to 4000 Hz) by air conduction alone assesses only the level of "audibility" in a relatively quiet environment. It does not reveal possible impairment of speech recognition in noise, and it also requires the mandatory personal participation of a medical worker.

The most "advanced" technique, automated recognition of speech in noise using only two-digit numbers, also has a significant drawback: a considerable effect of "recognizing" well-known combinations by the duration of their sound. It is used when elderly people show signs of dementia.

The need for a specially trained specialist, the high labor intensity, and the low informativeness of the technique for subsequent diagnostic tests, which are carried out by specialists to make a medical diagnosis and develop hearing-correction recommendations, sharply reduce the practical value of this development.

To solve the existing problems in the field of automated hearing assessment and speech recognition, the authors previously proposed a method and a hardware-software complex (HSC) implementing it, which provide a pre-medical assessment of speech recognition quality and screening audiometry (RF patent No. 2743049).

That solution formed the basis of the claimed invention, which offers a new testing principle using a dynamically adaptable scenario for multifactorial testing of the patient. This makes it possible to assess the capabilities of a person's auditory analyzer quickly and efficiently during mass preventive examinations of the population.

SUMMARY OF THE INVENTION

The claimed invention solves the technical problem of providing a fast and efficient method for automated pre-medical preliminary classifying assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population.

The technical result is the provision of an automated pre-medical preliminary classifying assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population.

An additional effect is increased reliability in determining the presence or absence of hearing impairment and its degree, achieved by using a dynamically adaptable scenario when testing the patient.

The claimed technical result is achieved by a method of pre-medical preliminary classifying multifactorial assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population, performed using a computing device connected to audio signal playback devices and comprising steps in which:

using the computing device

- a primary test speech sequence (TSS) is formed, consisting of sentences of a first number of words based on a matrix test;

- a competing noise sound is formed for the primary TSS;

- the primary TSS is reproduced using the audio signal playback devices, made in the form of air and bone conduction headphones, the TSS being reproduced simultaneously with the competing noise sound at a first signal-to-noise ratio using speech simulation based on a deep machine learning model;

- an oral response is received from the user;

- the user's oral response is automatically analyzed for recognition of the TSS by converting it into text form and checking the correctness of the response using a machine learning model;

wherein

- on the basis of the analysis of the user's oral responses, the complexity of the assessment is changed dynamically, so that after each automatic analysis the number of words in the sentences forming the TSS and/or the signal-to-noise ratio of the reproduced signal is changed;

- the capabilities of the user's auditory analyzer are assessed on the basis of the responses to the reproduced test speech sequence.
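Two of the steps above, forming a matrix-test sentence of a given length and checking a transcribed response against it, can be sketched as follows. This is a minimal illustration under stated assumptions: the word matrix is a hypothetical English placeholder (the source does not list its words), one word is drawn from each column as in typical matrix sentence tests, and matching is a simple order-independent word lookup rather than the machine learning model the method refers to. Speech synthesis and speech-to-text are outside the sketch.

```python
import random
import re

# Hypothetical 5-column word matrix; the real test material is not given in the source.
MATRIX = [
    ["Peter", "Anna", "Nina"],
    ["buys", "sees", "gives"],
    ["two", "five", "nine"],
    ["red", "old", "big"],
    ["chairs", "tables", "rings"],
]


def make_sentence(n_words, rng):
    """Form a sentence by picking one word from each of the first n_words columns."""
    if not 1 <= n_words <= len(MATRIX):
        raise ValueError("n_words must be between 1 and the number of columns")
    return [rng.choice(column) for column in MATRIX[:n_words]]


def count_matches(target_words, transcript):
    """Count target words that occur in the transcribed response (case-insensitive)."""
    heard = set(re.findall(r"\w+", transcript.lower()))
    return sum(1 for word in target_words if word.lower() in heard)


sentence = make_sentence(5, random.Random(0))
# Hypothetical transcript in which the fourth word was misheard.
matches = count_matches(["Peter", "buys", "two", "red", "chairs"],
                        "peter buys TWO green chairs")
```

Passing the random generator explicitly keeps a test session reproducible, which matters when results from different sessions are compared.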

In one particular embodiment of the method, the number of correctly matched words in the primary TSS is determined when analyzing the user's oral responses.

In another particular embodiment of the method, the number of words in the TSS is reduced if the number of correct matches is below a set threshold value.

In another particular embodiment of the method, the volume of the TSS is reduced if the number of correct matches is above a set threshold value.

In another particular embodiment of the method, the noise signal accompanying the TSS is amplified if the number of correct matches is above a set threshold value.
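The adjustment rules in these embodiments can be summarized in a short sketch. Lowering the speech volume and amplifying the noise both lower the signal-to-noise ratio, so the sketch models them as a single SNR step. The class and function names, the 2 dB step, the one-word decrement and the behavior when the count equals the threshold are all assumptions; the source states only the direction of each change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difficulty:
    n_words: int    # number of words per sentence in the TSS
    snr_db: float   # signal-to-noise ratio of the reproduced signal, in dB


def next_difficulty(current, n_correct, threshold, min_words=1, snr_step_db=2.0):
    """Return the test settings for the next presentation.

    Below the threshold: shorten the sentences (easier task).
    Above the threshold: lower the SNR (harder task), which corresponds to
    reducing the speech volume or amplifying the noise.
    Equal to the threshold: keep the settings (an assumption).
    """
    if n_correct < threshold:
        return Difficulty(max(min_words, current.n_words - 1), current.snr_db)
    if n_correct > threshold:
        return Difficulty(current.n_words, current.snr_db - snr_step_db)
    return current


start = Difficulty(n_words=5, snr_db=0.0)
easier = next_difficulty(start, n_correct=2, threshold=3)
harder = next_difficulty(start, n_correct=5, threshold=3)
```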

The claimed technical result is also achieved by a hardware-software complex (HSC) for pre-medical preliminary classifying multifactorial assessment of the capabilities of the human auditory analyzer during mass preventive examinations of the population, which comprises a computing device and audio signal playback devices, in which:

the computing device is configured for

- forming a test speech sequence in the form of sentences consisting of a first number of words, at a given volume level and in a given language, based on a matrix test that supports the selected language, the test speech sequence being reproduced using speech simulation based on a deep machine learning model;

- selecting a competing noise sound for said test speech sequence of words;

- receiving an oral response from the user;

- automatically analyzing the user's oral response for recognition of each word and/or phrase of the test sequence of words, with conversion of the response into text form and analysis of its correctness;

- changing the generated sentences so that they consist of a second number of words, smaller than the first number, when the number of recognized words in the reproduced sentences is below a specified threshold value;

- assessing the capabilities of the user's auditory analyzer on the basis of the responses to the reproduced test speech sequence;

the audio signal playback devices are made in the form of air and bone conduction headphones, by means of which the test speech sequence is reproduced, the test speech sequence being reproduced as phrases from the matrix test or as individual words accompanied by a competing noise signal.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 illustrates the general layout of the interactive hardware-software complex.

Fig. 2A illustrates a flowchart of the claimed method.

Fig. 2B illustrates a flowchart of the adaptive change in test difficulty.

Figs. 3A-3K illustrate an example of a test protocol.

Fig. 4 illustrates a general view of the computing device.

IMPLEMENTATION OF THE INVENTION

Fig. 1 shows the general layout of the interactive hardware-software complex (100). It includes a computing device (110), which is a tablet or smartphone running the iOS, Android or Windows operating system. The device (110) implements all the functionality needed to interact with the user (101) in order to assess the quality of speech recognition and to perform screening audiometry.

The computing device (110) is built on standard hardware, with the difference that its audio path first undergoes metrological calibration and graduation of the loudness scale in dB so that it complies with existing audiometric standards, for example to ensure that it operates in accordance with GOST R ISO 8253-3-2014 "Acoustics. Audiometric test methods".

The computing device (110) has connected to it, via a data transmission channel, a speech input device (111) and audio playback devices (112, 113).

The speech input device (111) may be a built-in or an external microphone. An external device (111) may be connected by any suitable means, for example a USB cable, a Lightning connector, Bluetooth, etc.

The audio playback devices (112, 113) are two types of headphones, air conduction (for example, Sennheiser hd 400s) and bone conduction (for example, Aftershock trekz titanium), which allow the sound generated by the computing device (110) to be delivered to the user (101) selectively over two channels. The headphones (112, 113) are calibrated and verified, for example with the Tester application, so that the output parameters of the product (loudness, frequency) match those of audiometric equipment. The calibration parameters must comply with international and/or national standards, for example GOST R IEC 60645-1-2017. The audio playback devices (112, 113) are calibrated using an artificial ear, for example one manufactured by B&K.

The computing device (110) may also be connected via a data network (120), for example the Internet, to a remote server (130), which may store various information, including settings, user data, update packages, parameters and information for performing the tests, etc.

The computing device (110) generates all the signals, sounds and graphic information needed to test the user (101) in order to assess speech recognition and obtain screening audiometry data.

Fig. 2A shows the process of carrying out the method (200) of preliminary classifying multifactorial assessment of the capability of the human auditory analyzer using the complex (100). The claimed complex makes possible a pre-medical, objectified and specific assessment of the capability of a particular person's auditory analyzer; the complex and the algorithms used in it are fully automated and easy to use.

At the first stage (201), a test speech sequence (hereinafter TSS) is formed using the device (110), for which the user (101) logs into a specialized software application on the device (110). The language of the test sequence is selected either by the user (101), for example through the graphical interface of the application, or automatically, when the user speaks a phrase suggested by the application so that the language of the user (101) can be recognized. The application used for testing with the device (110) is based on artificial intelligence, in particular on one or more machine learning models, for example an artificial neural network trained to recognize the speech of the user (101).

Once the language required for forming the TSS has been recognized, the matrix test corresponding to that language is selected. Tests of this type are standardized and make it possible to determine how well speech is recognized in a noise signal (see, for example, Nuesse et al. Measuring Speech Recognition With a Matrix Test Using Synthetic Speech // Trends Hear. 2019 Jan-Dec; 23: 2331216519862982. Published online 2019 Jul 19. doi: 10.1177/2331216519862982). The TSS is formed from sentences consisting of a fixed number of words. For example, the primary TSS is formed from five-word sentences, and the number of words is subsequently changed on the basis of the responses of the users being tested.
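The formation of a TSS from a matrix test can be sketched as follows. This is only an illustration: the word matrix, the function names and the rule that a shortened sentence uses the first columns of the matrix are assumptions, not the test material or logic of the claimed complex.

```python
import random

# Hypothetical 5-column word matrix (name, verb, numeral, adjective, noun).
# The words are placeholders, not the actual test material.
WORD_MATRIX = [
    ["Peter", "Anna", "Thomas"],
    ["buys", "sees", "gives"],
    ["two", "five", "nine"],
    ["red", "large", "old"],
    ["chairs", "knives", "rings"],
]


def make_sentence(matrix, n_words, rng):
    """Draw one word from each of the first n_words columns (assumed rule)."""
    if not 1 <= n_words <= len(matrix):
        raise ValueError("n_words must be between 1 and the number of columns")
    return [rng.choice(column) for column in matrix[:n_words]]


def make_tss(matrix, n_words, n_sentences, seed=None):
    """Return a test speech sequence as a list of sentences (lists of words)."""
    rng = random.Random(seed)
    return [make_sentence(matrix, n_words, rng) for _ in range(n_sentences)]


if __name__ == "__main__":
    for sentence in make_tss(WORD_MATRIX, n_words=5, n_sentences=3, seed=1):
        print(" ".join(sentence))
```

Because every sentence has the same syntactic frame, changing `n_words` is enough to produce the shorter or longer sentences that the adaptive procedure calls for.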

Next, at stage (202), the competing noise sound that will be played simultaneously with the TSS of the matrix test is set up. Before the test is started, the loudness of the speech signal, the loudness of the noise and the number of words in the test can be set, and the left or right channel can be switched off.

For the first time in the practice of mass hearing screening, it is proposed to make testing of the sound-conducting pathways by bone conduction mandatory, using as the transducer wireless stereo bone conduction headphones with a frequency range extended to 20,000 Hz, specially calibrated together with the tablet in use.

The speech sequence uses unique sound files (speech and noise) developed in-house. The list of these files can be changed and extended depending on the specific goals of the hearing examination.

At stage (203), the generated sequence of words and phrases of the matrix test is reproduced by a speech synthesizer built on machine learning models. For more accurate testing, an appropriate voice type (female, male, child, etc.) can be selected for each user (101), and testing can be carried out in several languages at once with the ability to switch between them, which is important for people living in a country with several official languages (for example, Switzerland).

The speaker's speech is synthesized automatically at a given loudness level and speech rate. The full TSS matrix makes it possible to assess objectively not only the level of speech recognition in noise but also the cognitive capabilities of the person (101).

Before the test is started, some test modes can be set, in particular:

- select the ear to be tested (both, right or left);

- set the counter of test phrases;

- choose a voice;

- set the loudness ratio of the useful signal to the noise, either to one of the typical values or manually.

An important feature of the claimed solution is that speech is reproduced by a software synthesizer with specified parameters (male, female, child, hoarse, whispered, etc.) and with a specified loudness level, rate and clarity of pronunciation of phonemes, words and sentences, observing the intonation of the given national language.

An important distinguishing feature of the complex (100) is the ability to assess speech intelligibility by bone conduction using headphones of the appropriate type, in particular stereo bone conduction headphones (113) with an extended frequency range that have passed metrological calibration for compliance with GOST requirements. For the first time, this makes it possible to assess speech intelligibility in people with disorders of the sound conduction system (various forms of otitis) by delivering speech signals through bone conduction directly to the cochlear system.

The complex (100) includes a subsystem of the computing device (110) for calibrating the audio equipment. The audio paths of devices (110) of different kinds and models, and different air and bone conduction headphones, have different characteristics. In addition, the device (110) allows the output loudness to be adjusted only in relative units from 0.0 to 1.0.

In audiology, the level of a sound signal is customarily measured in decibels (dB). This level depends logarithmically on the sound pressure. To convert the relative loudness units of the output signal of the device (110) into dB, a calibration procedure for the audio equipment was developed; it must be carried out for each specific model of the device (110), air conduction headphones (112) and bone conduction headphones (113).
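One way to turn such a calibration into a lookup is sketched below. The calibration points are invented for illustration, and linear interpolation between measured points is an assumption; the source does not describe how the calibration data is stored or applied.

```python
from bisect import bisect_left

# Hypothetical calibration points for one device/headphone pair:
# (volume setting in relative units 0.0..1.0, level measured in dB).
CALIBRATION = [(0.05, 34.0), (0.1, 40.0), (0.25, 48.0), (0.5, 54.0), (1.0, 60.0)]


def volume_for_level(target_db, points=CALIBRATION):
    """Return the volume setting that yields target_db.

    Uses linear interpolation between neighbouring calibration points
    (an assumed scheme); points must be sorted by ascending level.
    """
    levels = [db for _, db in points]
    if not levels[0] <= target_db <= levels[-1]:
        raise ValueError("target level is outside the calibrated range")
    i = bisect_left(levels, target_db)
    if levels[i] == target_db:
        return points[i][0]
    (v0, d0), (v1, d1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (target_db - d0) / (d1 - d0)


if __name__ == "__main__":
    print(volume_for_level(51.0))
```

A separate table would be needed for every combination of device model and headphone model, which matches the per-model calibration requirement stated above.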

An automatic system is provided to protect the auditory system of the user (101) from acoustic overload. The maximum listening level is limited to 95 dB at 1000-3000 Hz, and the sound can be muted immediately through the graphical interface of the test application running on the device (110).

The very need to present listening levels above 95 dB means that the person has severe hearing loss, which calls for nothing other than urgent professional examination by a specialist. This alert information is generated automatically.

The machine learning algorithms used to implement the software speech synthesizer may be, for example, Google Cloud AI & Machine Learning Products Speech-to-Text (https://cloud.google.com/speech-to-text), Google Cloud AI & Machine Learning Products Text-to-Speech (https://cloud.google.com/text-to-speech), cloud services for speech processing and analysis (STC Cloud, speech synthesis and recognition technologies (speechpro.com)), or any other algorithms that can be trained for the purposes of implementing this technical solution as part of the interactive complex.

At stage (203), the application on the device (110) generates the words that form the TSS sentences of the corresponding matrix test and plays them through the air conduction headphones (112). After the words and phrases have been spoken by the speech synthesizer, the user responds (stage 204) either by interacting with the interface of the device (110) or by speaking each word and/or phrase of the test. The user's spoken response is captured by the microphone (111) and converted into text so that the correctness of the spoken phrase can be analyzed. During the test, the lower part of the screen of the device (110) may display a list of all the responses as the recognition system understood them.

The response of the user (101) is evaluated by the device (110) using a software module based on a machine learning model, which converts the response of the user (101) into text and compares it with the word or phrase reproduced in the TSS.

The speech sequence is played in parallel with the competing noise sound in order to assess more accurately how well the user (101) understands speech in the emulated situation. The speech sound files and the corresponding text files may be stored on the computing device (110), which makes it possible to combine any speech sound file with any noise file without changing the testing program, and to extend the list of files with arbitrary signals and noises.
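Combining a speech file with a noise file at a chosen signal-to-noise ratio can be sketched as follows, for instance speech at 52 dB against noise at 42 dB, which is a 10 dB ratio. The function names are hypothetical, the signals are plain lists of samples, and defining the ratio through RMS amplitudes is an assumption rather than the method of the claimed complex.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that 20*log10(rms(speech)/rms(noise)) equals snr_db,
    then add it to the speech sample by sample."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    speech = [math.sin(2 * math.pi * i / 50) for i in range(1000)]
    noise = [0.3 * (-1) ** i for i in range(1000)]
    mixed = mix_at_snr(speech, noise, snr_db=10.0)
    print(len(mixed))
```

Keeping speech and noise as separate inputs mirrors the point made above: any speech file can be paired with any noise file without changing the test program.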

After one synthesized sentence has been played, the user (101) repeats the sentence as he or she heard and understood it. At stage (205), the start of the spoken response is then detected automatically and the response is compared word by word with the test sentence, and the number of incorrectly pronounced or missing words is counted.
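The word-by-word comparison can be sketched with the standard-library `difflib` module. The alignment method and the function names are assumptions; the source only states that incorrectly pronounced and missing words are counted after the response has been converted to text.

```python
import difflib


def normalize(text):
    """Lower-case the text and strip punctuation from each word."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]


def score_response(target, recognized):
    """Return (correct, wrong_or_missing) for a recognized response.

    Words are aligned with difflib.SequenceMatcher so that a skipped word
    does not shift every later word out of position.
    """
    target_words = normalize(target)
    heard = normalize(recognized)
    matcher = difflib.SequenceMatcher(None, target_words, heard, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct, len(target_words) - correct


if __name__ == "__main__":
    print(score_response("Peter buys five red chairs", "peter five red chairs"))
```

Sequence alignment is used here instead of a position-by-position check because a single omitted word would otherwise cause every following word to be counted as wrong.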

Several tests of varying difficulty are provided for all age groups:

1. Screening assessment of hearing intelligibility in children:

- age group from 7 to 14 years.

2. Balanced tests (two-digit numbers).

3. Tables of words of varying syllable count containing all the phonemes of the Russian language (Grinberg G.I., Zinder L.R.).

4. Tables of phonemically balanced words (Neiman).

When a speech-in-noise intelligibility test is carried out, it is usually recommended to use white or pink noise as the interference, imitating the speech spectrum of the voices of a large group of people standing in a large open space. The noise sound can also be modeled by choosing, from a wide range of situations, the one that best fits the situation of the user being tested, for example a meeting, workshop noise, a construction site, crowd noise in an enclosed space, etc.

The complex (100) allows the user (101) to choose the test mode that matches the situation causing the greatest discomfort in everyday life, in particular the voice type (male, female, child, etc.) and the loudness of the conversation in commonly used terms: whispered speech, normal conversational loudness, loud conversation in a group. The device (110) automatically replaces the selected verbal loudness description with the corresponding metrologically verified listening level in dB (whispered speech - 35 dB, normal speech - 50 dB, etc.).
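The replacement of a verbal loudness description by a level in dB can be sketched as a simple lookup. Only the two values given in the text (35 dB and 50 dB) and the 95 dB ceiling of the protection system are taken from the source; the preset names, the function name and the handling of manually entered levels are assumptions.

```python
MAX_LEVEL_DB = 95.0  # ceiling of the overload protection described in the source

# Only these two presets have values stated in the source.
PRESET_LEVELS_DB = {"whisper": 35.0, "normal": 50.0}


def presentation_level(choice):
    """Map a preset name or a manually entered level to a level in dB,
    capped at MAX_LEVEL_DB."""
    if isinstance(choice, str):
        if choice not in PRESET_LEVELS_DB:
            raise KeyError(f"unknown loudness preset: {choice}")
        level = PRESET_LEVELS_DB[choice]
    else:
        level = float(choice)
    return min(level, MAX_LEVEL_DB)


if __name__ == "__main__":
    print(presentation_level("whisper"), presentation_level(110))
```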

When the responses of the tested users (101) are analyzed at stage (205), the level of their cognitive capabilities is also assessed, and this is subsequently taken into account in the adaptive adjustment of the test at stage (206). The responses of the users (101) are evaluated by the device (110) on the international STI scale using the corresponding software logic based on a machine learning model.

As shown in Fig. 2B, the course of testing changes according to the count of recognized responses, i.e. words, obtained while the primary TSS is played. If the tested user (101) scores below the threshold value when recognizing the words of the TSS at stage (2061), the program logic of the device (110) forms a simplified TSS (stage (2062)), in which the number of words in the sentences is reduced, for example three instead of five, or the noise signal becomes quieter and the overall TSS playback signal becomes louder. If at stage (2061) the tested user (101) correctly recognizes all the words of the primary TSS, or enough of them to obtain a score above the threshold value, a more difficult test scenario is formed at stage (2063), in which the number of words in the generated TSS may be increased or the signal-to-noise ratio is made more demanding (the noise becomes louder relative to the speech).
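The branching of Fig. 2B can be sketched as a single adjustment function. The threshold, the step size in dB, the word limits and the order in which sentence length and loudness are changed are all assumptions chosen for illustration; the source leaves these choices open.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    n_words: int      # words per sentence in the TSS
    speech_db: float  # playback level of the speech signal
    noise_db: float   # playback level of the competing noise


def adapt(settings, correct, total, threshold=0.5, step_db=5.0,
          min_words=3, max_words=5):
    """Return the settings for the next test run (assumed policy).

    Below the threshold the test is simplified, above it the test is made
    harder; a score exactly at the threshold leaves the settings unchanged.
    """
    score = correct / total
    if score < threshold:
        if settings.n_words > min_words:
            return replace(settings, n_words=min_words)
        # Sentences are already short: quieter noise, louder speech.
        return replace(settings, speech_db=settings.speech_db + step_db,
                       noise_db=settings.noise_db - step_db)
    if score > threshold:
        if settings.n_words < max_words:
            return replace(settings, n_words=max_words)
        # Sentences are already long: raise the noise level.
        return replace(settings, noise_db=settings.noise_db + step_db)
    return settings


if __name__ == "__main__":
    print(adapt(Settings(5, 50.0, 40.0), correct=1, total=5))
```

Changing the sentence length before the loudness is only one possible ordering; the source allows either or both parameters to change after each analysis.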

The adaptive adjustment of testing at stage (206) also makes it possible to assess the next level of cognitive complexity of the auditory analyzer of the user (101). In addition, when the test scenario is made more difficult, the following test sets may be used:

- two-digit numbers;

- the set of phonemically balanced Neiman words;

- a set of test words containing all the phonemes of the Russian language (Grinberg tables).

The example above applies to Russian. The English-language version may use the test material Word recognition performance for Northwestern University Auditory Test No. 6 word. The German-language version uses the German speech test standards DIN 45626-1-1995, 45621-1-1995, 45621-3-1985, etc.

Consider an example of dynamic adaptation of the test at stage (206).

The first test (primary TSS) starts playback of a shortened phrase test at the "personal conversation" loudness level (normal loudness level). The level of the competing speech signal is -10 dB. The test is repeated 5 times with automatic recognition. The result of the initial test is determined according to the speech perception quality rating (Table 1).

One of three possible outcomes is then considered automatically:

- "excellent" - hearing may be good. Further tests should proceed with more demanding acoustic and cognitive settings;

- "very bad" - severe hearing impairment is possible. Further tests should be performed with simplified acoustic and cognitive settings;

- intermediate ratings from "bad" to "good" are treated as possible correctable hearing impairments, and the most detailed testing is carried out.

Step-by-step testing can then proceed according to the following scenario:

Step 1

Matrix test, three-word phrases, 52 dB;

Five phrases;

Pink noise, 42 dB.

0 correct answers - go to step 2

1-4 correct answers - go to step 3

5 correct answers - go to step 4

Step 2

Two-digit numbers, 52 dB;

Five numbers;

Pink noise, 42 dB.

0 correct answers - go to step 6

1-4 correct answers - go to step 6

Step 3

Matrix test, three-word phrases, 52 dB;

Five phrases;

Pink noise, 42 dB.

0-9 correct answers in total with step 1 - go to step 5

10 correct answers - "No significant hearing loss detected", end of scenario

Step 4

Matrix test, five-word phrases, 47 dB;

Three phrases;

Pink noise, 42 dB.

1-2 correct answers - go to step 3

3 correct answers - "No hearing loss detected", end of scenario

Step 5

Neumann test, 52 dB;

Ten words;

No noise.

0-8 correct answers - go to step 6

9-10 correct answers - "No significant hearing loss detected", end of scenario

Step 6

Pure-tone threshold audiometry by air conduction (125-8000 Hz, automatic masking);

Pure-tone threshold audiometry by bone conduction (250-8000 Hz, without masking).
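The branching in the scenario above can be read as a small state machine. The following Python sketch is illustrative only: the names `next_state` and `run_scenario` and the `ask` callback (which stands in for playing a test and counting the correct answers) are assumptions, not part of the described complex. Outcomes for which the scenario gives no transition (for example, five correct answers at step 2) are deliberately left undefined and return `None`.

```python
from typing import Callable, List, Optional, Union

END_NO_SIGNIFICANT_LOSS = "No significant hearing loss detected"
END_NO_LOSS = "No hearing loss detected"

State = Union[int, str, None]


def next_state(step: int, correct: int, step1_correct: int = 0) -> State:
    """Return the next step number, a final verdict, or None when the
    scenario text defines no transition for this outcome."""
    if step == 1:
        if correct == 0:
            return 2
        if 1 <= correct <= 4:
            return 3
        if correct == 5:
            return 4
    elif step == 2:
        if 0 <= correct <= 4:
            return 6
    elif step == 3:
        # Step 3 is judged on the sum of its own score and the step 1 score.
        total = correct + step1_correct
        if 0 <= total <= 9:
            return 5
        if total == 10:
            return END_NO_SIGNIFICANT_LOSS
    elif step == 4:
        if 1 <= correct <= 2:
            return 3
        if correct == 3:
            return END_NO_LOSS
    elif step == 5:
        if 0 <= correct <= 8:
            return 6
        if 9 <= correct <= 10:
            return END_NO_SIGNIFICANT_LOSS
    return None


def run_scenario(ask: Callable[[int], int]) -> List[Union[int, str]]:
    """Walk the scenario; `ask(step)` is a hypothetical callback that runs
    the test for `step` and returns the number of correct answers."""
    path: List[Union[int, str]] = []
    state: State = 1
    step1_correct = 0
    while isinstance(state, int):
        path.append(state)
        if state == 6:  # step 6 is pure-tone audiometry, no further branching
            break
        correct = ask(state)
        if state == 1:
            step1_correct = correct
        state = next_state(state, correct, step1_correct)
    if isinstance(state, str):
        path.append(state)  # final verdict
    return path


if __name__ == "__main__":
    scores = {1: 3, 3: 4, 5: 9}  # made-up scores for illustration
    print(run_scenario(lambda step: scores[step]))
```

Keeping the transitions in one pure function makes it easy to check each branch against the scenario text without involving audio playback or speech recognition.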

The analysis follows the scale for rating the quality of speech communication, which ensures that the listener interprets speech messages correctly, based on ISO 9921:2003 "Ergonomics - Assessment of speech communication" and the identical (IDT) national standard GOST R ISO 9921-2013.

Testing also regulates the permissible error in recognizing individual words, expressed as the percentage of correct word-recognition responses and the corresponding score (Table 2).

Table 3 lists the regulated and agreed typical speech loudness levels.

Following the adaptive change in test complexity at step (206), the users (101) are classified at step (207) using a machine learning model; the classification characterizes the capabilities of the auditory analyzer of the person (101).

Preliminary classification of the users (101) identifies, on a pre-medical basis, three main groups:

1) "Green zone": participants in this group showed speech intelligibility (phrases and test words) in noise rated no lower than "good".

Participants in this group do not need specialized medical care at the time of the examination. A repeat express examination is recommended in one year.

2) "Yellow zone": participants in this group showed average speech intelligibility (phrases and test words), which indicates insufficient speech adaptation. The possible hearing loss at the main speech frequencies by air conduction exceeds 25 dB, which means a possibly significant hearing loss. For the first time, air and bone conduction audiograms are analyzed jointly and automatically to identify the nature of the impairment: conductive, sensorineural, or mixed. The analysis covers three ranges:

low frequency (125 Hz - 1500 Hz);

main speech frequencies (500 Hz - 4000 Hz);

high frequency (3000 Hz - 8000 Hz).

Participants in this group should be examined in more detail for subsequent hearing correction and rehabilitation.

Participants in this group need a scheduled visit to a specialist to refine the results (if necessary), establish a diagnosis, and draw up a plan of measures to correct their hearing.
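The joint analysis of air and bone conduction audiograms described for the yellow zone can be illustrated with a rule-based sketch. It is not the algorithm of the claimed solution: averaging the thresholds within a band, the 15 dB air-bone gap, and all function names are assumptions made for illustration. Only the three frequency ranges and the 25 dB figure come from the text.

```python
from statistics import mean
from typing import Dict, Optional

# Frequency ranges taken from the text above (Hz).
BANDS_HZ = {"low": (125, 1500), "speech": (500, 4000), "high": (3000, 8000)}

LOSS_THRESHOLD_DB = 25  # from the text: loss of more than 25 dB is significant
AIR_BONE_GAP_DB = 15    # ASSUMPTION: gap treated as a conductive component

Audiogram = Dict[int, float]  # frequency in Hz -> hearing threshold in dB


def band_mean(audiogram: Audiogram, band: str) -> Optional[float]:
    """Average threshold over one band, or None if no point falls in it."""
    lo, hi = BANDS_HZ[band]
    values = [db for hz, db in audiogram.items() if lo <= hz <= hi]
    return mean(values) if values else None


def loss_type(air: Audiogram, bone: Audiogram, band: str = "speech") -> str:
    """Classify one ear as normal, conductive, sensorineural or mixed."""
    air_avg = band_mean(air, band)
    bone_avg = band_mean(bone, band)
    if air_avg is None or bone_avg is None:
        return "no data"
    if air_avg <= LOSS_THRESHOLD_DB:
        return "normal"
    gap = air_avg - bone_avg
    if gap >= AIR_BONE_GAP_DB:
        # Bone conduction within the norm: purely conductive loss;
        # bone conduction also elevated: mixed loss.
        return "conductive" if bone_avg <= LOSS_THRESHOLD_DB else "mixed"
    return "sensorineural"


if __name__ == "__main__":
    air = {500: 50, 1000: 55, 2000: 50, 4000: 45}   # made-up thresholds
    bone = {500: 10, 1000: 10, 2000: 15, 4000: 15}
    print(loss_type(air, bone))
```

The same function can be applied to the "low" and "high" bands by passing a different `band` argument, which mirrors the three ranges listed above.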

3) "Red zone": participants in this group showed speech intelligibility (test words) in noise rated "bad" or "very bad", together with screening audiometry results that indicate severe hearing loss or deafness.

These participants need an urgent examination by an otorhinolaryngologist to determine the direction of a more detailed medical and diagnostic examination.
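The three zones can be summarized as a simple decision rule. In the described solution the classification at step (207) is made with a machine learning model, so the sketch below is only a readable approximation. The five-level grade list (with "average" as the middle grade), the function name, and the fallback of every remaining case to the yellow zone are assumptions.

```python
# Speech intelligibility grades, worst to best. The middle grade "average"
# is an ASSUMPTION; the text names only the other four explicitly.
GRADES = ["very bad", "bad", "average", "good", "excellent"]


def urgency_zone(speech_grade: str, severe_loss_on_audiometry: bool) -> str:
    """Map a speech-in-noise grade and a screening audiometry flag to a zone.

    Raises ValueError if the grade is not one of GRADES.
    """
    rank = GRADES.index(speech_grade)
    if rank >= GRADES.index("good"):
        return "green"   # intelligibility rated no lower than "good"
    if rank <= GRADES.index("bad") and severe_loss_on_audiometry:
        return "red"     # poor intelligibility plus severe loss or deafness
    return "yellow"      # ASSUMPTION: everything else needs a planned visit


if __name__ == "__main__":
    print(urgency_zone("average", severe_loss_on_audiometry=False))
```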

During examinations, the proposed hardware-software complex (HSC) (100) compiles a general list of completed sessions in a single file, which can be sent to a PC by e-mail and opened in Excel. A row is written only if every stage of the session has been completed.

Each row in the file contains:

- Identifier of the computing device (110);

- Date and time when the full protocol file was saved;

- Name of the full protocol file;

- Postal code of the session location;

- Sex of the patient;

- Age;

- Urgency group (green, yellow or red);

- Intelligibility of phrases;

- Intelligibility of test words;

- Screening audiometry:

  - Degree of hearing loss by air conduction for the right and left ear, from the list: no data; normal; grade 1; grade 2; grade 3; grade 4; deafness;

  - Type of hearing loss at speech frequencies for the right and left ear, from the list: no data; normal; conductive; sensorineural; mixed.
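The row layout listed above maps naturally onto a flat record that a spreadsheet can open. The sketch below uses Python's standard `csv` and `dataclasses` modules. All field names, the CSV format itself, and the use of plain strings for the intelligibility values are assumptions, since the text does not specify the file format beyond it being loadable into Excel.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import TextIO


@dataclass
class SessionRecord:
    """One row of the session list; field names are hypothetical."""
    device_id: str
    saved_at: str                # date and time the full protocol was saved
    protocol_file: str
    postal_code: str
    sex: str
    age: int
    urgency_group: str           # "green", "yellow" or "red"
    phrase_intelligibility: str
    word_intelligibility: str
    air_loss_right: str          # "no data", "normal", "grade 1" ... "deafness"
    air_loss_left: str
    loss_type_right: str         # "no data", "normal", "conductive", ...
    loss_type_left: str


def append_row(out: TextIO, record: SessionRecord,
               session_complete: bool, write_header: bool = False) -> bool:
    """Write the record as one CSV row, but only for a completed session."""
    if not session_complete:
        return False
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(SessionRecord)])
    if write_header:
        writer.writeheader()
    writer.writerow(asdict(record))
    return True


if __name__ == "__main__":
    buffer = io.StringIO()
    record = SessionRecord("dev-001", "2021-05-24T10:00:00", "protocol_0001.pdf",
                           "101000", "F", 54, "yellow", "average", "average",
                           "grade 1", "normal", "conductive", "normal")
    append_row(buffer, record, session_complete=True, write_header=True)
    print(buffer.getvalue())
```

The `session_complete` flag reflects the rule that a row is produced only when every stage of the session has been completed.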

Linking the results to a geographic location (postal code, address or navigation coordinates) makes it possible to combine results obtained in neighboring regions for subsequent analytical processing.

Given a representative sample, it is possible to forecast staffing and financial needs both by region and for the country as a whole, and to evaluate the effectiveness of organizational and technical measures taken to improve the communication capabilities of the population of specific regions.

According to WHO international statistics on the prevalence of hearing loss, at least 75-80% of the participants in a screening examination can be expected to fall into the "green zone", which will significantly reduce the load on the available specialists.

The claimed solution makes it practical to compare speech intelligibility results obtained over two physiologically different paths by which the sound signal reaches the cochlea: air conduction and bone conduction.

The difference between speech audiometry results for air and bone conduction may make it possible to differentiate conductive and sensorineural hearing loss without resorting to complex audiometric studies.

Given the "qualification" specifics of the examination (a pre-medical method that does not involve specialists in audiology and otorhinolaryngology), it is extremely important to ensure adequate reliability of the measured hearing thresholds for the test signals. The ambient noise level is automatically checked and recorded alongside every test.

This solution develops and implements an algorithm for the semantic analysis of audiograms that have passed a preliminary check for reliability and consistency. Combined with the results of the speech intelligibility tests, it makes it possible to assign each subject more reliably to one of three classes: "green", "yellow", or "red".

After testing, a detailed protocol is prepared automatically for each patient. It records the test conditions, describes the test materials used, and lists the patient's actual responses together with an automatic assessment of their correctness made using machine learning algorithms. Examples of protocols are shown in Figs. 3A-3K. The set of protocols generated for all examined patients can be sent in digital form, for example, to a regional telemedicine system or another organization.

Fig. 4 shows a general example of a computer device (300) that can be used to implement the devices included in the HSC (100), for example, the computing device (110). In general, the device (300) contains components such as: one or more processors (301), at least one random access memory (302), persistent data storage (303), input/output interfaces (304), I/O means (305), and networking means (306).

The processor (301) of the device performs the basic computing operations necessary for the operation of the device (300) or of one or more of its components. The processor (301) executes the necessary machine-readable instructions contained in the random access memory (302).

The memory (302) is typically implemented as RAM and contains the program logic that provides the required functionality. The data storage (303) can be implemented as HDDs, SSDs, a RAID array, network storage, flash memory, optical storage media (CD, DVD, MD, Blu-ray discs), etc. The storage (303) provides long-term storage of various kinds of information, such as request processing history (logs), user identifiers, audio files, etc.

The interfaces (304) are standard means for connecting and operating various kinds of devices with the device (300), such as USB, RS232, RJ45, LPT, COM, HDMI, PS/2, Lightning, FireWire, etc. The choice of interfaces (304) depends on the specific implementation of the device (300), which can be a personal computer, mainframe, server cluster, thin client, smartphone, laptop, etc.

The I/O means (305) can include: a keyboard, a joystick, a display (touch display), a projector, a touchpad, a mouse, a trackball, a light pen, speakers, a microphone, etc.

The networking means (306) are selected from devices that provide network data reception and transmission, for example, an Ethernet card, WLAN/Wi-Fi module, Bluetooth module, BLE module, NFC module, IrDA, RFID module, GSM modem, etc. The means (306) provide data exchange over a wired or wireless data channel, for example, WAN, PAN, LAN, Intranet, Internet, WLAN, WMAN or GSM.

The components of the device (300) are typically connected via a common data bus or any other type of connection that allows the elements of the device (300) to interact.

These application materials present a preferred embodiment of the claimed technical solution, which should not be construed as limiting other, particular embodiments that do not go beyond the scope of the legal protection sought and that are obvious to those skilled in the relevant art.

Claims

1. A method for pre-medical preliminary classifying multifactorial assessment of the capabilities of a human auditory analyzer during mass preventive examinations of the population, performed using a computing device connected to audio signal playback devices, and comprising steps in which,

using the computing device:

- a primary test speech sequence (TSS) is formed, consisting of sentences of a first number of words based on a matrix test;

- a competing noise sound is formed for the primary TSS;

- the primary TSS is reproduced using audio signal playback devices made in the form of air and bone conduction headphones, the TSS being reproduced simultaneously with the competing noise sound at a first signal-to-noise ratio using speech simulation based on a deep machine learning model;

- an oral response is received from the user;

- the user's oral response is automatically analyzed for recognition of the TSS by converting it into text form and analyzing the correctness of the response using a machine learning model;

and

- based on the analysis of the user's oral responses, the complexity of the assessment is changed dynamically, whereby, according to the result of each automatic analysis, the number of words in the sentences forming the TSS and/or the signal-to-noise ratio of the reproduced signal is changed;

- the capabilities of the user's auditory analyzer are assessed based on the responses given when the test speech sequence is reproduced.

2. The method according to claim 1, characterized in that, when the user's oral responses are analyzed, the number of correct word matches in the primary TSS is determined.

3. The method according to claim 2, characterized in that the number of words in the TSS is reduced if the number of correct matches is below a predetermined threshold value.

4. The method according to claim 2, characterized in that the volume of the TSS is reduced if the number of correct matches is above a predetermined threshold value.

5. The method according to claim 2, characterized in that the noise signal in the TSS is amplified if the number of correct matches is above a predetermined threshold value.

6. A hardware-software complex (HSC) for pre-medical preliminary classifying multifactorial assessment of the capabilities of a human auditory analyzer during mass preventive examinations of the population, containing a computing device and audio signal playback devices, in which:

the computing device is configured for

- forming a test speech sequence in the form of sentences consisting of a first number of words, with a given volume level and in a given language, based on a matrix test that supports the selected language, the test speech sequence being reproduced using speech simulation based on a deep machine learning model;

- selecting a competing noise sound for said test speech sequence;

- receiving an oral response from the user;

- automatically analyzing the user's oral response by recognizing each word and/or phrase of the test speech sequence, converting it into text form, and analyzing the correctness of the response;

- changing the generated sentences to sentences consisting of a second number of words, smaller than the first number, when the number of recognized words in the reproduced sentences is below a specified threshold value;

- assessing the capabilities of the user's auditory analyzer based on the responses given when the test speech sequence is reproduced;

the audio signal playback devices are made in the form of air and bone conduction headphones, by means of which the test speech sequence is reproduced, the test speech sequence being reproduced in the form of phrases from the matrix test or individual words, accompanied by a competing noise signal.