RU2717911C1 - Method of training a convolution neural network to carry out teleroentgenogram markings in a direct and lateral projections

Info

Publication number: RU2717911C1
Application number: RU2019124849A
Authority: RU
Inventors: Александр Александрович Мураев; Илья Алексеевич Кибардин; Николай Юрьевич Оборотистов; Полина Александровна Мураева
Original assignee: Общество с Ограниченной Ответственностью "Цифровые Технологии в Хирургии"
Priority date: 2019-08-06
Filing date: 2019-08-06
Publication date: 2020-03-26
Also published as: WO2021025581A1

Abstract

FIELD: computer equipment.

SUBSTANCE: invention relates to computer engineering and medicine. The method comprises the steps of: creating a database (DB) of teleroentgenograms in direct and lateral projections (TRGD and TRGL) with pre-analyzed, marked cephalometric points and their coordinates; based on the prepared DB, training at least one convolutional neural network (CNN) to mark teleroentgenograms in direct and lateral projections (TRGD and TRGL) with cephalometric points, wherein: images from the DB with N points marked on them, and multilayer masks M for said points, are fed to the CNN input; the CNN analyzes the multilayer masks M and splits them into N masks M_1…M_N, one per point marked on the input images; connected components are found for each mask layer, each component giving the predicted coordinates of a point on the image; the predicted coordinates are compared with the coordinates of the marked points on the input images from the DB, and the root-mean-square deviation is calculated; the at least one trained CNN is then used to mark teleroentgenograms in direct and lateral projections with cephalometric points.

EFFECT: training of a convolutional neural network to mark teleroentgenograms in direct and lateral projections.

5 cl, 2 dwg

Description

FIELD OF TECHNOLOGY

The present technical solution relates to the field of computer engineering and medicine, in particular to a method for training a convolutional neural network to mark teleroentgenograms in direct and lateral projections.

BACKGROUND

Anthropometric analysis of the facial skull and the soft tissues of the face is widely used in orthodontics and maxillofacial surgery, where it is one of the main diagnostic tools for making a diagnosis and choosing a treatment plan [1][2][3]. In addition, anthropometric approaches to the study of soft and hard tissues are used in traumatology, anthropology, archaeology and forensic medicine [4].

Teleroentgenography (TRG) is an important examination method in orthodontics and maxillofacial surgery: it provides the diagnostic information about the structure of the cerebral and facial parts of the skull that is needed to plan treatment. Many methods of analysis require cephalometric points to be placed on the teleroentgenogram and then processed, which takes a considerable amount of the doctor's time.

Solutions describing various methods of placing cephalometric points for anthropometric analysis are known from the prior art: US20180189421A1, US10117727B2.

However, the solutions known from the prior art have limited functionality. In particular, they do not use artificial neural networks to mark teleroentgenograms in direct and lateral projections.

Artificial neural networks are currently an important tool for solving many applied problems. They have already made it possible to handle a number of difficult problems and promise new inventions capable of solving tasks that so far only humans can perform. Artificial neural networks, like biological ones, are systems made up of a huge number of working processor-neurons, each of which performs some small amount of work assigned to it while having a large number of connections to the others, which is what determines the computational power of the network.

The use of artificial intelligence makes it possible to improve the examination method and considerably simplify the doctor's work. In the claimed solution, an artificial neural network (ANN) configuration was developed that places cephalometric points on teleroentgenograms in direct and lateral projections with high accuracy. The ANN error in the claimed solution was only 0.5%.

The proposed approach is 2-5 times faster than the traditional manual method of placing cephalometric points, depending on the number of points and the complexity of the cephalometric analysis.

SUMMARY OF THE INVENTION

This technical solution is aimed at eliminating the shortcomings of the existing prior-art solutions.

The technical problem addressed by the claimed technical solution is the creation of a computer-implemented method for training a convolutional neural network to mark teleroentgenograms in direct and lateral projections. Additional embodiments of the present invention are presented in the dependent claims.

The technical result of the present solution is enabling the training of a convolutional neural network to mark teleroentgenograms in direct and lateral projections.

An additional technical result achieved in solving the above problem is improved quality and accuracy of the marking of a patient's teleroentgenograms in direct and lateral projections by the trained convolutional neural network.

In a preferred embodiment, a computer-implemented method is claimed for training a convolutional neural network (CNN) to mark teleroentgenograms in direct and lateral projections, comprising the steps of:

- using a computing device, creating a database (DB) of teleroentgenograms in direct and lateral projections (TRGD and TRGL) with pre-analyzed, marked cephalometric points and their coordinates;

- based on the prepared DB, training at least one CNN to mark teleroentgenograms in direct and lateral projections (TRGD and TRGL) with cephalometric points, wherein during training:

• images from the DB with N points marked on them, and multilayer probability maps (masks) M for these points, are fed to the CNN input;

• the CNN analyzes the multilayer masks M and splits them into N masks M_1…M_N, one per point marked on the input images;

• connected components are found for each mask layer, each component giving the predicted coordinates of a point on the image;

• the predicted coordinates are compared with the coordinates of the marked points on the input images from the DB, and the root-mean-square deviation is calculated;

- using the at least one trained CNN for subsequent marking of teleroentgenograms in direct and lateral projections with cephalometric points.

In a particular embodiment, the teleroentgenograms are stored in the DB as an XML file.

In another particular embodiment, the teleroentgenogram images are preprocessed by augmentation before being fed to the CNN.

In another particular embodiment, the connected components are found using the skimage.morphology.label function.

In another particular embodiment, the cephalometric points are placed on such locations as soft tissues, bone surfaces and dental tissues.

DESCRIPTION OF DRAWINGS

The implementation of the invention is described below with reference to the accompanying drawings, which are provided to explain the essence of the invention and in no way limit its scope. The following drawings are attached to the application:

FIG. 1 illustrates a diagram of the claimed solution;

FIG. 2 illustrates an example of a general scheme of a computer device.

DETAILED DESCRIPTION OF THE INVENTION

The detailed description below sets out numerous implementation details intended to provide a clear understanding of the present invention. However, it will be obvious to a person skilled in the art how the present invention can be used both with and without these implementation details. In other instances, well-known methods, procedures and components are not described in detail so as not to obscure the features of the present invention.

In addition, it will be clear from the above that the invention is not limited to the implementation described. Numerous possible modifications, changes, variations and substitutions that preserve the essence and form of the present invention will be apparent to those skilled in the art.

The present invention is directed to providing a computer-implemented method for training a convolutional neural network, with subsequent use of the trained CNN to mark teleroentgenograms in direct and lateral projections.

An artificial neural network (hereinafter ANN) is a computational or logical circuit built from homogeneous processing elements that are simplified functional models of neurons.

A convolutional neural network (CNN) is a special architecture of artificial neural networks aimed at efficient pattern recognition; it is one of the deep learning technologies.

The number of layers of the artificial neural network is not limited by the embodiments. The trained neural network may be, without limitation, a fully connected neural network, a convolutional neural network, a recurrent neural network, or a combination thereof.

A layer of a neural network is a set of the network's neurons grouped by the features of their functioning.

As shown in FIG. 1, the claimed computer-implemented method (100) for training a convolutional neural network to mark teleroentgenograms in direct and lateral projections is implemented as follows:

At step (101), using a computing device, a database (DB) is created of teleroentgenograms in direct and lateral projections (TRGD and TRGL) with pre-analyzed, marked cephalometric points and their coordinates.

Next, at step (102), based on the prepared DB, at least one CNN is trained to mark teleroentgenograms in direct and lateral projections (TRGD and TRGL) with cephalometric points, wherein during training:

At step (103), images from the DB with N points marked on them, and multilayer masks M for these points, are fed to the CNN input.

At step (104), the CNN analyzes the multilayer masks M and splits them into N masks M_1…M_N, one per point marked on the input images. Next, at step (105), connected components are found for each mask layer, each component giving the predicted coordinates of a point on the image.

Then, at step (106), the predicted coordinates are compared with the coordinates of the marked points on the input images from the DB, and the root-mean-square deviation is calculated. The at least one trained CNN is then used to mark teleroentgenograms in direct and lateral projections with cephalometric points (107).
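The comparison at step (106) can be illustrated with a short sketch. The description does not give the exact formula, so the definition used here (square root of the mean squared Euclidean distance between matching points) is an assumption, as are the function name `rms_deviation` and the sample coordinates.

```python
import math

def rms_deviation(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y) points.

    Assumed definition: sqrt(mean of squared Euclidean distances).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(predicted, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical coordinates: the first point matches, the second is off by (3, 4).
predicted = [(10.0, 20.0), (33.0, 44.0)]
reference = [(10.0, 20.0), (30.0, 40.0)]
rmsd = rms_deviation(predicted, reference)
```

Here the squared distances are 0 and 25, so the result is the square root of 12.5.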

In the claimed solution, the teleroentgenograms are stored in the DB as an XML file.

The CNN can be trained on a set of objects, which are, for example, images.

To train the CNN on TRGL, 64 primary (main) and 98 secondary (additional) points were used. To train the CNN on TRGD, 49 primary (main) and 12 secondary (additional) points were used. The proposed training method places no limit on the number of cephalometric points. For the initial training of the ANN, 130 images of each type (TRGD and TRGL) were taken, and the remaining 25 were held out for validation. The images were obtained from different sources, which makes the trained algorithm adaptable to new data.

Using the initially trained CNN, a DB of at least 1000 teleroentgenograms of each type is created, with marked cephalometric points and their coordinates on the teleroentgenograms in direct and lateral projections.

On the basis of the prepared DBs, CNNs are trained to place points using a method based on multiclass image segmentation.

Images from the DB with N points marked on them, and multilayer masks M for these points, are fed to the CNN input. The masks are built following the OpenPose method [5]; specifically, the density of a two-dimensional normal distribution is taken whose mean is the point itself. The covariance matrix is chosen so that the points do not overlap heavily when rendered but remain clearly visible.

This density is normalized so that its maximum equals one, which gives an analog of a probability. Probability thresholds are then selected: mask values above the threshold are set to 1, and those below to 0. When generating 512x512 masks, diag(4, 4) was taken as the covariance matrix. This is a 2x2 diagonal matrix with fours on the diagonal, that is, the matrix:

$$\begin{pmatrix} 4 & 0 \\ 0 & 4 \end{pmatrix}.$$
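The mask construction can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' code: with covariance diag(4, 4), the peak-normalized density reduces to exp(-d²/(2·4)), where d is the distance to the point. The function name `gaussian_mask`, the 32x32 grid and the binarization threshold of 0.5 are assumptions; the description does not state the threshold used for the masks.

```python
import math

def gaussian_mask(height, width, cx, cy, variance=4.0, threshold=None):
    """Render one mask layer: an isotropic 2-D Gaussian centered on (cx, cy).

    With covariance diag(variance, variance), the density rescaled to a peak
    of 1 is exp(-((x - cx)^2 + (y - cy)^2) / (2 * variance)).
    If threshold is given, values >= threshold become 1.0 and the rest 0.0.
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            value = math.exp(-d2 / (2.0 * variance))
            if threshold is not None:
                value = 1.0 if value >= threshold else 0.0
            row.append(value)
        mask.append(row)
    return mask

# Small hypothetical grid; the description mentions 512x512 masks.
soft = gaussian_mask(32, 32, cx=10, cy=20)
binary = gaussian_mask(32, 32, cx=10, cy=20, threshold=0.5)  # threshold assumed
```

A multilayer mask M would hold one such layer per cephalometric point.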

The CNN then learns to predict these masks. Next, all predictions greater than or equal to 0.5 are set to 1 and the rest are set to zero. The skimage.morphology.label function is used to find the connected components of each mask layer. The largest component that does not consist of zeros is then selected and its center is found. This center is the position of the sought point.
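The post-processing of one predicted layer can be sketched as follows. To stay dependency-free, this sketch replaces the skimage labeling call named in the text with a plain breadth-first flood fill; the function name `largest_component_center`, the use of 4-connectivity and the choice of the centroid as the "center" are assumptions, since the description does not specify them.

```python
from collections import deque

def largest_component_center(layer, threshold=0.5):
    """Return the (x, y) centroid of the largest foreground component, or None."""
    height, width = len(layer), len(layer[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or layer[y0][x0] < threshold:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            component, queue = [], deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                component.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and layer[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if len(component) > len(best):
                best = component
    if not best:
        return None  # nothing in this layer reached the threshold
    n = len(best)
    return (sum(x for x, _ in best) / n, sum(y for _, y in best) / n)

# Hypothetical 4x5 prediction: one stray pixel and one 2x2 blob.
layer = [
    [0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.8, 0.0],
    [0.0, 0.0, 0.6, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.1],
]
center = largest_component_center(layer)
```

The stray pixel at the top left forms a one-pixel component and is ignored in favor of the four-pixel blob.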

The CNN architecture is described below.

The CNN implementation is based entirely on PyTorch. It is the most flexible and convenient framework for working with neural networks.

Neural networks with the same architecture, up to the number of classes, were used to place the primary and secondary points. In each case, two neural networks were used together. The first network was SE_ResNeXt-50 [6], but to increase prediction accuracy an FPN modified for multiclass segmentation was used [7]. This choice was made because there are very many classes, whereas networks whose decoders are based on the Unet architecture were originally designed for binary segmentation. Unet networks also work with a small number of classes. The CNN encoder was pretrained on the ImageNet dataset.

Initially, the images and masks were 384x384 in size. During training, a random 256x256 window was cropped from them, since this network architecture calls for strong augmentations. The following random augmentations were also used:

1. Random brightness change;

2. Grid distortion;

3. Random contrast change;

4. Random gamma change;

5. Random scaling (from 0.8 to 1.2);

6. Random rotation by 8 degrees.

The borders were filled with black. The data were then collected into batches of 64 images and sent to training.
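The random-window step can be sketched as below. The key point is that the same window must be cut from the image and from its mask so that the marked points stay aligned. The function name `random_crop_pair`, the nested-list image representation and the single-layer mask are simplifications assumed for illustration.

```python
import random

def random_crop_pair(image, mask, size=256, rng=random):
    """Cut the same random size x size window out of an image and its mask."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("crop window is larger than the input")
    top = rng.randint(0, height - size)    # inclusive bounds
    left = rng.randint(0, width - size)

    def crop(grid):
        return [row[left:left + size] for row in grid[top:top + size]]

    return crop(image), crop(mask), (top, left)

# Hypothetical 384x384 inputs whose values encode their own position.
image = [[(y, x) for x in range(384)] for y in range(384)]
mask = [[y * 384 + x for x in range(384)] for y in range(384)]
image_crop, mask_crop, (top, left) = random_crop_pair(
    image, mask, size=256, rng=random.Random(0)
)
```

Passing a seeded `random.Random` instance makes the crop reproducible, which helps when debugging an augmentation pipeline.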

For the regression CNN, 240x240 images were used, which during training were cropped with a random window to 224x224, the standard input image size for ResNet-type networks. The augmentations were as follows:

1. Random blur;

2. Random brightness change;

3. Random scaling (from 0.95 to 1.11);

4. Random rotation by 10 degrees.

The images were then fed to the network input in batches of 64.

Training of the ANN.

For the segmentation CNN, the Intersection over Union (IoU) metric was computed for each channel and averaged, and BCE*0.3+Jaccard*0.7 was taken as the loss function. This ratio is the one most often used in segmentation tasks.
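The weighted loss and the averaged metric can be written out in plain Python to make the arithmetic explicit. This is a sketch under assumptions: the Jaccard term is taken as one minus a soft IoU computed on probabilities, the metric thresholds predictions at 0.5, and all function names and the smoothing constant `eps` are illustrative rather than taken from the source.

```python
import math

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)

def soft_jaccard_loss(pred, target, eps=1e-7):
    """1 - soft IoU, computed on probabilities rather than thresholded values."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return 1.0 - (intersection + eps) / (union + eps)

def combined_loss(pred, target):
    """Weighted sum from the text: BCE * 0.3 + Jaccard * 0.7."""
    return 0.3 * bce(pred, target) + 0.7 * soft_jaccard_loss(pred, target)

def mean_iou(pred_channels, target_channels, threshold=0.5):
    """Hard IoU per channel after thresholding, averaged over channels."""
    scores = []
    for pred, target in zip(pred_channels, target_channels):
        hard = [1 if p >= threshold else 0 for p in pred]
        inter = sum(h * t for h, t in zip(hard, target))
        union = sum(hard) + sum(target) - inter
        scores.append(inter / union if union else 1.0)
    return sum(scores) / len(scores)

# Two hypothetical channels: one half-right, one perfect.
preds = [[0.9, 0.4, 0.2, 0.1], [0.8, 0.1]]
targets = [[1, 1, 0, 0], [1, 0]]
metric = mean_iou(preds, targets)
loss = combined_loss(preds[0], targets[0])
```

In a PyTorch training loop the same expressions would be applied to tensors so that gradients can flow through the loss.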

Training proceeded in three stages. In the first two, Adam was used as the optimizer. In the first stage the encoder was frozen and only the decoder was trained. The learning rate (LR) was initially 0.003 and was halved whenever the metric failed to improve for 5 epochs. If there was no improvement for 11 epochs, training moved on to the second stage. In the second stage all weights were unfrozen and the whole network was trained on the same principle, but with LR = 0.001; the LR reduction and the stage transition occurred at 6 and 16 epochs, respectively. In the third stage the optimizer was RMSProp, and the LR followed a cosine curve from 0.001 down to 0.000001 with a period of 24 epochs. The CNN went through three such cycles.
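The learning-rate logic of the three stages can be sketched without any framework. Several details are assumptions: that the LR is halved again after every further 5 stale epochs, that the cosine curve restarts at the maximum at the start of each 24-epoch cycle, and that the phase is computed as epoch/period. The names `PlateauStage` and `cosine_lr` are hypothetical.

```python
import math

class PlateauStage:
    """Track a metric; halve the LR on a short plateau, end the stage on a long one."""

    def __init__(self, lr, reduce_after, stop_after, factor=0.5):
        self.lr = lr
        self.reduce_after = reduce_after
        self.stop_after = stop_after
        self.factor = factor
        self.best = -math.inf
        self.stale = 0  # epochs since the metric last improved

    def step(self, metric):
        """Record one epoch's metric; return True when the stage should end."""
        if metric > self.best:
            self.best, self.stale = metric, 0
            return False
        self.stale += 1
        if self.stale % self.reduce_after == 0:
            self.lr *= self.factor
        return self.stale >= self.stop_after

def cosine_lr(epoch, lr_max=1e-3, lr_min=1e-6, period=24):
    """Cosine curve from lr_max towards lr_min, restarting every `period` epochs."""
    t = (epoch % period) / period
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

# Stage 1 settings from the text: LR 0.003, halve after 5, move on after 11.
stage = PlateauStage(lr=0.003, reduce_after=5, stop_after=11)
history = [0.50] + [0.49] * 11          # hypothetical metric values
finished = [stage.step(m) for m in history]

# Stage 3: three cycles of 24 epochs each.
schedule = [cosine_lr(e) for e in range(3 * 24)]
```

For stage 2 the same class would be created with `lr=0.001, reduce_after=6, stop_after=16`.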

The positioning accuracy of the trained CNNs for cephalometric points was as follows, given as mean squared error (MSE), also called mean squared deviation (MSD): 0.0000752-0.000273; Avg misses: 1.4-11.75; old MSE: 0.00148-0.00155.

Training of SE_ResNeXt-50 took place in two stages.

Images of size 224x224 are fed to the pretrained SE-ResNeXt-50 neural network.

The network extracts a vector of 2048 features from each image. The extracted features pass through a block of four fully connected layers that uses Batch Normalization, the ReLU activation function, and Dropout. The output is a vector of 124 predicted coordinates belonging to 62 cephalometric points.
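The relation between the 124-element output and the 62 points can be made concrete with a small helper. The interleaved layout (x1, y1, x2, y2, ...) is an assumption; the description does not say how the coordinates are ordered in the vector, and the name `to_points` is illustrative.

```python
def to_points(flat):
    """Group a flat coordinate vector into (x, y) pairs (interleaved layout assumed)."""
    if len(flat) % 2:
        raise ValueError("expected an even number of coordinates")
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]

output = [float(i) for i in range(124)]  # stand-in for the network output
points = to_points(output)
```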

Overfitting of a CNN is the phenomenon in which the constructed model fits the examples of the training set well but performs relatively poorly on test-set examples that did not take part in training. It arises because, during training, the CNN picks up random regularities in the sample that are absent from the general population.

To overcome this problem, Dropout, proposed by Srivastava N. [8], was used in the last layers of the CNN. The idea of the method is that, each time the network is trained, the outputs of a fraction of the neurons before the fully connected layer are randomly set to zero. When predictions are obtained for the test set of images, instead of zeroing a random fraction of neuron outputs, all outputs are halved. This fairly simple algorithm has proved effective against overfitting.
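The two modes of Dropout described above can be sketched as follows. The drop probability of 0.5 is inferred from the statement that outputs are halved at test time; the function name `dropout` and the list-based activations are illustrative. Note that PyTorch's `nn.Dropout` uses the equivalent "inverted" form, which rescales the kept activations during training and leaves them unchanged at inference.

```python
import random

def dropout(activations, drop_prob=0.5, training=True, rng=random):
    """Classic dropout: zero random units in training, scale all units at test time."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if training:
        # Each unit is dropped independently with probability drop_prob.
        return [0.0 if rng.random() < drop_prob else a for a in activations]
    # At test time every output is scaled by the keep probability (0.5 here).
    return [a * (1.0 - drop_prob) for a in activations]

activations = [2.0, 4.0, 6.0, 8.0]  # hypothetical layer outputs
train_out = dropout(activations, training=True, rng=random.Random(0))
test_out = dropout(activations, training=False)
```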

The resulting model is trained by backpropagation. The training regime is described in more detail below.

Preparation of source images. To prevent overfitting and to improve the generalization ability of the model, strong augmentations were applied to the source images. Augmentations are deliberate distortions and transformations of the source images that artificially increase the size of the training set, which acts as a strong regularizer for the model. In other words, a CNN that sees more varied images during training generalizes better and overfits less.

First, all images were resized to 256x256 pixels while preserving the aspect ratio, so as not to distort the shape of the patients' skulls. To preserve the proportions, the images were padded to a square with a gray fill. Then all images underwent preliminary random transformations:
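The resize-and-pad step implies a simple mapping between original and resized coordinates, sketched below. The centered placement of the image inside the gray square is an assumption (the text only says the image is padded to a square), and the function names are hypothetical.

```python
def letterbox_params(width, height, target=256):
    """Scale factor and padding that fit an image into a target square."""
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) // 2  # gray columns on the left (assumed centered)
    pad_y = (target - new_h) // 2  # gray rows on the top (assumed centered)
    return scale, pad_x, pad_y

def to_square(point, scale, pad_x, pad_y):
    """Map a landmark from the original image into the padded square."""
    x, y = point
    return x * scale + pad_x, y * scale + pad_y

def from_square(point, scale, pad_x, pad_y):
    """Map a landmark from the padded square back to the original image."""
    x, y = point
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, pad_x, pad_y = letterbox_params(2000, 1600)
landmark = to_square((1000.0, 800.0), scale, pad_x, pad_y)
restored = from_square(landmark, scale, pad_x, pad_y)
```

Because both axes share one scale factor, distances between landmarks keep their proportions, which is the purpose of preserving the aspect ratio.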

1. Random blurring of the image.

2. Random brightness correction.

3. Reflection of the image about the vertical axis.

4. Rescaling of the image by a factor drawn uniformly from the interval from 0.6 to 1.4.

5. Rotation by a random angle from -30° to +30°.

After that, a random window of 224x224 pixels was cropped from each image, and the coordinates of the cephalometric points were recalculated accordingly so that they matched the new images.
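The recalculation of landmark coordinates under the geometric augmentations can be sketched as follows. Blur and brightness changes do not move the points, so only reflection, rescaling, rotation and cropping are modeled. The flip probability of 0.5, scaling and rotation about the image center, the pixel-center convention and the sign of the rotation are assumptions of this example, not details given in the text.

```python
import math
import random

def sample_params(rng, size=256, crop=224):
    """Draw one set of random augmentation parameters."""
    return {
        "flip": rng.random() < 0.5,          # assumed probability
        "scale": rng.uniform(0.6, 1.4),
        "angle": rng.uniform(-30.0, 30.0),   # degrees
        "crop_x": rng.randint(0, size - crop),
        "crop_y": rng.randint(0, size - crop),
    }

def transform_point(point, p, size=256):
    """Apply flip, scale, rotation and crop to one landmark (x, y)."""
    x, y = point
    c = (size - 1) / 2.0                     # image center
    if p["flip"]:
        x = (size - 1) - x                   # reflect about the vertical axis
    dx, dy = (x - c) * p["scale"], (y - c) * p["scale"]
    a = math.radians(p["angle"])
    x = c + dx * math.cos(a) - dy * math.sin(a)
    y = c + dx * math.sin(a) + dy * math.cos(a)
    return x - p["crop_x"], y - p["crop_y"]  # shift into the cropped window

params = sample_params(random.Random(0))
moved = transform_point((100.0, 150.0), params)
```

The same parameter set must be applied to the image and to every landmark of that image so that the annotations stay aligned with the pixels.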

After these transformations, the images were assembled into large batches and fed to the ANN for training.

CNN training. The CNN was trained to minimize the MSE (Mean Squared Error) loss function, which is standard for keypoint detection tasks and is defined as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2,$$

where $\hat{y}_i$ are the predicted coordinates of the cephalometric points, $y_i$ are their true values, and $n$ is the number of coordinates.
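As a worked illustration of the loss, the sketch below computes the mean of the squared differences between predicted and true coordinates, together with its square root (RMSE). Averaging over all coordinates is the usual convention and is assumed here; the sample values are invented for the example.

```python
import math

def mse(predicted, target):
    """Mean squared error between two equally long coordinate lists."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

def rmse(predicted, target):
    """Root mean squared error, the square root of MSE."""
    return math.sqrt(mse(predicted, target))

predicted = [0.5, 0.25, 0.75, 0.5]   # two points as (x, y) pairs
target = [0.5, 0.25, 0.25, 0.5]
loss = mse(predicted, target)        # one coordinate is off by 0.5
error = rmse(predicted, target)
```

RMSE is expressed in the same units as the coordinates, which makes it easier to interpret than MSE.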

Adam [9] was used as the optimizer.

Adam (adaptive moment estimation) is an optimization algorithm that extends stochastic gradient descent.
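For orientation, one Adam update can be sketched in plain Python as below. The update rule follows Kingma and Ba [9]; the values beta1 = 0.9, beta2 = 0.999 and eps = 1e-8 are the defaults suggested in that paper and are assumed here, since the text only states the learning rate. The toy objective f(x) = (x - 3)^2 is purely illustrative.

```python
import math

def adam_step(params, grads, state, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """Perform one Adam update and return the new parameter list."""
    state["t"] += 1
    t = state["t"]
    updated = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        updated.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return updated

# Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
state = {"t": 0, "m": [0.0], "v": [0.0]}
x = [0.0]
first = adam_step(x, [2 * (x[0] - 3)], {"t": 0, "m": [0.0], "v": [0.0]}, lr=0.1)
for _ in range(500):
    x = adam_step(x, [2 * (x[0] - 3)], state, lr=0.1)
```

The first step has a magnitude of about the learning rate regardless of the gradient scale, which is the sense in which Adam adapts the step size per parameter.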

The CNN was trained in two stages. At the first stage, the pretrained SE-ResNeXt-50 was frozen and only the last four fully connected layers were trained. At this stage, images were fed to the neural network in batches of 800. After each epoch (an epoch here means 100 optimizer steps, at each of which the CNN processes a batch of 800 images), MSE was measured on a validation set of 10 images to monitor the training of the CNN. The claimed solution used the following training regime: the network was trained with an initial learning rate of 0.0001; each time the MSE on the validation set of 10 images failed to decrease for 10 epochs, the learning rate was halved and training continued. The procedure was repeated until the error on the validation set stopped improving.
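The learning-rate rule of this regime can be sketched as a small function that replays a sequence of per-epoch validation errors. Comparing each error with the best value seen so far, and resetting the counter after each halving, are assumptions of this example; the function name is hypothetical.

```python
def halve_on_plateau(val_errors, initial_lr=1e-4, patience=10):
    """Return the learning rate used in each epoch for the given validation errors."""
    lr = initial_lr
    best = float("inf")
    stale = 0            # epochs since the validation error last decreased
    history = []
    for err in val_errors:
        history.append(lr)       # rate in effect during this epoch
        if err < best:
            best, stale = err, 0
        else:
            stale += 1
            if stale >= patience:
                lr /= 2          # halve the rate and keep training
                stale = 0
    return history

errors = [1.0, 0.5] + [0.5] * 10 + [0.4]
rates = halve_on_plateau(errors)
```

In this example the error stalls for ten epochs after the second one, so the last epoch runs with half the initial rate.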

At the second stage, the CNN was trained as a whole (including the SE-ResNeXt-50 that had been frozen at the first stage) with a similar training regime. At this stage, however, images were fed to the CNN in batches of 185. This is because the error has to be propagated through the deep SE-ResNeXt-50 network, which limits the batch size to what fits in the memory of the graphics cards. Accordingly, at this stage an epoch means 100 optimizer steps, at each of which the CNN processes 185 images.

To place cephalometric points on the test data set, the central 224x224-pixel part of the image is cropped and fed to the CNN. The coordinates predicted by the neural network are then mapped back to match the original image.

To refine the predictions, the claimed solution applies Test Time Augmentation: five 224x224-pixel windows are cropped from the original image, one in the center and one at each corner, and each window is also mirrored about the vertical axis. The resulting 10 images are fed to the CNN. The predicted coordinates are mapped back accordingly and averaged with the arithmetic mean.
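The bookkeeping behind this procedure can be sketched as follows. The callable `predict(offset, flipped)` is a hypothetical stand-in for cropping the window, mirroring it if requested and running the CNN; it is assumed to return landmark coordinates in the coordinate system of the 224x224 window. The 256-pixel image size is also an assumption carried over from the preprocessing step.

```python
def five_crop_offsets(size=256, crop=224):
    """Top-left corners of the center window and the four corner windows."""
    m = size - crop
    return [(m // 2, m // 2), (0, 0), (m, 0), (0, m), (m, m)]

def tta_predict(predict, size=256, crop=224):
    """Average landmark predictions over five windows and their mirror images."""
    views = [(off, flip) for off in five_crop_offsets(size, crop)
             for flip in (False, True)]
    sums = None
    for (ox, oy), flipped in views:
        points = predict((ox, oy), flipped)   # hypothetical model call
        if sums is None:
            sums = [[0.0, 0.0] for _ in points]
        for acc, (x, y) in zip(sums, points):
            if flipped:
                x = (crop - 1) - x            # undo the mirror reflection
            acc[0] += x + ox                  # shift back to image coordinates
            acc[1] += y + oy
    return [(sx / len(views), sy / len(views)) for sx, sy in sums]

def oracle(offset, flipped):
    """Toy predictor that always 'finds' the point (100, 120) of the full image."""
    x, y = 100.0 - offset[0], 120.0 - offset[1]
    return [((223 - x) if flipped else x, y)]

averaged = tta_predict(oracle)
```

With a real network the ten predictions differ slightly, and averaging them reduces the variance of the final coordinates.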

3. Results

As the metric, the claimed solution used RMSE = sqrt(MSE) = …. Different CNNs were used for the primary and secondary points. The results are evaluated as follows:

Primary points: MSE: 0.0000752; without misses: 0.0000705; average misses: 1.4; old MSE: 0.00148.

Secondary points: MSE: 0.00150; without misses: 0.000273; average misses: 11.75; old MSE: 0.00155.

All points together: MSE: 0.0009371160494; without misses: 0.0001878370843; average misses: 13.15; old MSE: 0.001522345679.

In the claimed solution, the user can move and correct the positions of the cephalometric points, program cephalometric calculations, and save them as a separate file in .pdf format.

FIG. 2 shows a general diagram of a computing device (200) that performs the data processing required to implement the claimed solution.

In general, the device (200) comprises such components as: one or more processors (201), at least one memory (202), a data storage means (203), input/output interfaces (204), I/O means (205), and networking means (206).

The processor (201) of the device performs the basic computing operations required for the operation of the device (200) or of one or more of its components. The processor (201) executes the necessary machine-readable instructions contained in the random access memory (202).

The memory (202) is typically implemented as RAM and contains the program logic that provides the required functionality.

The data storage means (203) can be implemented as HDD or SSD drives, a RAID array, network storage, flash memory, optical storage media (CD, DVD, MD, Blu-ray discs), etc. The means (203) provides long-term storage of various kinds of information, for example, the aforementioned files with user data sets, databases containing records of time intervals measured for each user, user identifiers, etc.

The interfaces (204) are standard means for connecting to and working with a computer device, for example, USB, RS232, RJ45, LPT, COM, HDMI, PS/2, Lightning, FireWire, etc.

The choice of interfaces (204) depends on the specific implementation of the device (200), which can be a personal computer, mainframe, server cluster, thin client, smartphone, or laptop, or can be part of a bank terminal, ATM, etc.

The I/O means (205) can include a mouse, joystick, display (touch display), projector, touchpad, keyboard, trackball, light pen, speakers, microphone, etc.

The networking means (206) are selected from devices that provide network reception and transmission of data, for example, an Ethernet card, WLAN/Wi-Fi module, Bluetooth module, BLE module, NFC module, IrDA, RFID module, GSM modem, etc. The means (206) enable data exchange over a wired or wireless data channel, for example, WAN, PAN, LAN, Intranet, Internet, WLAN, WMAN or GSM.

The components of the device (200) are connected via a common data bus (210).

The present application materials disclose a preferred embodiment of the claimed technical solution, which should not be construed as limiting other, particular embodiments that do not go beyond the requested scope of legal protection and are obvious to those skilled in the relevant art.

References

[1] Lin H.H., Chuang Y.F., Weng J.L., et al. Comparative validity and reproducibility study of various landmark-oriented reference planes in 3-dimensional computed tomographic analysis for patients receiving orthognathic surgery. PLoS One. 2015;10:e0117604.

[2] van Vlijmen O.J., Maal T., Berge S.J., et al. A comparison between 2D and 3D cephalometry on CBCT scans of human skulls. Int. J. Oral. Maxillofac. Surg. 2010;39:156-160.

[3] Farronato G., Garagiola U., Dominici A., et al. “Ten-point” 3D cephalometric analysis using low-dosage cone beam computed tomography. Prog Orthod. 2010;11:2-12.

[4] Kuehne H. et al. HMDB51: A large video database for human motion recognition // High Performance Computing in Science and Engineering '12. Springer, Berlin, Heidelberg, 2013. pp. 571-582.

[5] Zhe Cao, et al. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields. (Submitted on 18 Dec 2018 (v1), last revised 30 May 2019 (this version, v2)).

[6] Hu J., Shen L., Sun G. Squeeze-and-excitation networks // arXiv preprint arXiv:1709.01507. 2017. Vol. 7.

[7] Selim S. Seferbekov, et al. Feature Pyramid Network for Multi-Class Land Segmentation. (Submitted on 9 Jun 2018 (v1), last revised 19 Jun 2018 (this version, v2)).

[8] Srivastava N. et al. Dropout: a simple way to prevent neural networks from overfitting // The Journal of Machine Learning Research. 2014. Vol. 15. No. 1. pp. 1929-1958.

[9] Kingma D. P., Ba J. Adam: A method for stochastic optimization // arXiv preprint arXiv:1412.6980. 2014.

Claims

1. A computer-implemented method for training a convolutional neural network to mark teleroentgenograms in direct and lateral projections, comprising the steps of:

- using a computing device, creating a database (DB) with pre-analyzed and marked cephalometric points and their coordinates on teleroentgenograms in the direct and lateral projections (TRGD and TRGL);

- on the basis of the prepared DB, training at least one convolutional neural network (CNN) to mark cephalometric points on teleroentgenograms in the direct and lateral projections (TRGD and TRGL), wherein during training:

• images from the DB with N points marked on them and multilayer masks M for these points are fed to the CNN input;

• the CNN analyzes the multilayer masks M and decomposes them into N masks M_1 ... M_N according to the number of marked points on the input images;

• connected components are found for each mask layer, each component being the predicted coordinates of a point position in the images;

• the predicted coordinates are compared with the coordinates of the marked points on the input images from the DB, and the root-mean-square deviation is calculated;

- applying the trained at least one CNN for subsequent marking of teleroentgenograms in the direct and lateral projections with cephalometric points.

2. The method according to claim 1, characterized in that the roentgenograms in the database are stored as an xml file.

3. The method according to claim 1, characterized in that the teleroentgenogram images are preprocessed by augmentation before they are fed to the CNN.

4. The method according to claim 1, characterized in that the connected components are found using the function skimage.morphology.label.

5. The method according to claim 1, characterized in that the cephalometric points are placed on such localizations as soft tissue, bone, and dental tissues.