RU2654125C1

RU2654125C1 - Statistical estimation method of a multifactor trend of conditional probability of the occurrence of a studied undesired event in cohort study

Info

Publication number: RU2654125C1
Application number: RU2016148559A
Authority: RU
Inventors: Валерий Федорович Обеснюк
Original assignee: Валерий Федорович Обеснюк
Priority date: 2016-12-09
Filing date: 2016-12-09
Publication date: 2018-05-16

Abstract

FIELD: data processing.

SUBSTANCE: the invention relates to the field of statistical analysis of large amounts of individual data for administrative, commercial, financial, managerial, supervisory and predictive purposes. The essence of the method is to apply the maximum likelihood method, as used in the estimation functional of logistic regression; in addition, unlike logistic regression, the model linking the probability indicator to risk factors is generated by the neural network itself in the process of its optimal tuning and is not postulated by the analyst before the study begins.

EFFECT: the technical result is a flexible method for estimating the main trends of the probabilistic indicators of cohort risk with respect to the studied factors, taking into account the statistical significance of the findings and allowing selection among competing hypotheses (regression models) with a priori information taken into account.

1 cl, 1 dwg

Description

The invention relates to the field of statistical analysis of large arrays of individual data for administrative, commercial, financial, managerial, supervisory and prognostic purposes.

Analogs and prototype. Similar methods of data analysis are known whose results can be used to predict the risk of adverse events. Particularly many such methods are known in medical statistics. For example, an artificial neural network (ANN) was used in patent RU 2567038 [Narezkin D.V. et al.] and in the CyberDoctor software package [Taranov Yu.A., registration certificate No. 2015615066, RF].

In contrast to the proposed invention, these forecasting methods and tools train the ANN using heuristic measures of its quality, so no probabilistic assessment of the statistical significance of the conclusions is made. The predicted risk estimates are therefore statistically biased by an uncontrolled amount, and in some cases the forecast of an adverse event may be wrong, which shows up as an increased number of false-positive or false-negative conclusions. The closest to the proposed method in technical essence and achieved effect is the use of a layered Rumelhart ANN with artificial neurons having a logistic (sigmoid) activation function [Narezkin D.V. et al., patent RU 2567038].

However, that patent protects a network with a rigidly specified architecture of interneuron connections of fixed strength, so the method cannot be recommended for a sample that differs substantially from the one described in the patent. For example, a method for predicting the course of the postoperative period after surgical treatment of rectal cancer cannot be transferred to predicting the outcome of surgical treatment of lung cancer. Another drawback of the chosen forecasting method is that it cannot be compared with competing hypotheses (forecasts). In addition, the algorithm for tuning the optimal parameters of interneuron interaction is not described, although their numerical values are protected by the text of the patent, and the sample of 15 people on which the ANN was trained could not possibly have been representative.

Patents RU 2456608 [Polonikov A.V. et al.] and RU 2492804 [Akimova E.V. et al.], which address the estimation of the risk of hypertension and of the cardiovascular risk of death, serve a similar purpose. Both patents use some form of regression on sample personal data. Patent RU 2456608 relies on logistic regression. A common drawback of both methods is that a linear dependence of the expected-effect indicator on the set of factors is postulated before the statistical study begins. Real data, however, almost never follow simple, insufficiently flexible models with low redundancy. This leads to an excessive number of false-negative and false-positive predictions. At the same time, it is well known in statistics that increasing the redundancy of a model instead of increasing its flexibility usually reduces its generalizing ability, since formal models linking risk indicators with factors (for example, a linear model) often ignore typical a priori properties of probabilistic quantities.

This problem is well known in the statistics of biological and medical objects. Attempts to address it are made in part in such subject areas as "Statistics of dependent variables. Contingency tables" (UDC 519.235), "Estimation of mortality. Mortality rates. Mortality statistics" (UDC 314.48), and "Statistical analysis of complex shifts: Structural changes" (UDC 311.175). The greatest successes of biological and medical statistics are associated with classical contingency tables, hybrids of contingency tables with life tables (Poisson regression), and logistic regression. However, contingency tables are difficult to adapt to multifactor risk studies; Poisson regression cannot be applied to estimating the risk of relatively frequent events (cancer incidence, circulatory system diseases, occupational radiation and chemical risks); and logistic regression faces the problem of formulating and selecting hypotheses, that is, regression models.

Objective of the invention. To propose a flexible method for estimating the main trends of the probabilistic indicators of cohort risk with respect to the studied factors, taking into account the statistical significance of the findings and allowing selection among competing hypotheses (regression models) with a priori information taken into account.

Essence of the proposed method. Instead of the conventional training of a neural network, the invention improves the network's generalizing ability by applying the maximum likelihood method, as used in the estimation functional of logistic regression; in addition, unlike logistic regression, the model linking the probabilistic indicator to the risk factors is generated by the neural network itself in the process of its optimal tuning and is not postulated by the analyst before the study begins.

Technical implementation of the proposed invention. To achieve the objective of the invention, the following are used: 1) a database with the results of individual observations of outcomes in the cohort, matched with an individual randomized list of factors that presumably influenced the outcomes; 2) a computer program simulating the forward operation of a multilayer ANN; 3) a computer program for optimizing the parameters of the interneuron connections of the ANN, in which the traditional evaluation functional, which numerically minimizes a norm of the deviation of the network output from the training set of examples, is replaced by the binomial-logistic regression functional, which allows tuning by the maximum likelihood method (1).

where i is the index of an individual observation; I_i is the individual outcome indicator (1: the adverse event occurred; 0: the adverse event did not occur); F_i is the vector of individual factors; α(F_i, β) is the predicted individual response of the ANN to an individual combination of factors; β is the set of all tunable connection coefficients of the neurons. The indeterminacy in individual terms of the functional is resolved by the rule 0⋅ln(0)=0. One example of a two-layer artificial neural network is shown in FIG. 1.
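
Formula (1) is not reproduced in this text. A form consistent with the definitions above, assuming the standard binomial log-likelihood used in logistic regression (the sign convention and any constant factor are assumptions), is:

```latex
\ln L(\beta) \;=\; \sum_{i} \Bigl[\, I_i \,\ln \alpha(F_i, \beta)
  \;+\; (1 - I_i)\,\ln\bigl(1 - \alpha(F_i, \beta)\bigr) \Bigr]
  \;\longrightarrow\; \max_{\beta}
\tag{1}
```

Under this assumed form, the rule 0⋅ln(0)=0 applies to terms in which the outcome indicator I_i (or 1 - I_i) is zero while the corresponding predicted probability is also zero.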

The evaluation functional, being a statistical sum over the list of observations, makes it possible to estimate the statistical gain from applying a given model (statistical hypothesis H1) compared with the initial model with zero settings (hypothesis H0 that the studied factors have no influence on the result observed in the cohort). Such estimates allow quantitative selection of models (competing hypotheses) by the well-known likelihood ratio test [Wilks]. Similar estimates also make it possible to establish the statistical significance of accounting for the influence of the analyzed factors from the value of the G² statistic [Wilks] and the attained error probability.
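
The likelihood ratio comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the invention: the function names (`log_likelihood`, `chi2_sf`, `g2_test`) are hypothetical, the H0 model is assumed to assign the cohort-wide event frequency to every individual, the number of degrees of freedom is assumed to be supplied by the analyst (the text does not state how it is chosen), and the data are made up.

```python
import math


def log_likelihood(outcomes, probs):
    """Binomial log-likelihood of 0/1 outcomes given predicted probabilities.

    Only the term matching the observed outcome is evaluated, so terms with a
    zero multiplier never reach ln(0) (the 0*ln(0)=0 rule).
    """
    total = 0.0
    for y, p in zip(outcomes, probs):
        q = p if y == 1 else 1.0 - p
        if q <= 0.0:
            return float("-inf")  # an observed outcome was given zero probability
        total += math.log(q)
    return total


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, q = 2, math.exp(-half)
    else:
        k, q = 1, math.erfc(math.sqrt(half))
    while k < df:
        # Recurrence: Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


def g2_test(outcomes, probs_h1, df):
    """Return G^2 = 2 (ln L1 - ln L0) and its asymptotic chi-square p-value."""
    p0 = sum(outcomes) / len(outcomes)  # assumed H0: same probability for all
    ll0 = log_likelihood(outcomes, [p0] * len(outcomes))
    ll1 = log_likelihood(outcomes, probs_h1)
    g2 = 2.0 * (ll1 - ll0)
    return g2, chi2_sf(g2, df)


# Toy data, invented for illustration only.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
probs_h1 = [0.1, 0.2, 0.2, 0.6, 0.3, 0.7, 0.8, 0.9]
g2, p_value = g2_test(outcomes, probs_h1, df=1)
print(f"G2 = {g2:.3f}, p-value = {p_value:.4f}")
```

The same `g2_test` call can compare two nested candidate models by passing the richer model's predictions and replacing the H0 likelihood with that of the simpler model; that extension is not shown here.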

The specific features of the proposed method (in addition to the distinguishing features listed above) are:

a) centering and normalization of the individual observations of the acting factors at the input of the artificial neural network, so that the input signals lie in the range [-1; +1];

b) owing to the specific constraints on the value of the conditional probability, 0≤P≤1, and on the response of each neuron, the operation of all neurons with a sigmoid activation function may be centered, except for the neuron at the network output. Centering is performed about the mean of the range of possible responses of the neuron at zero values of the connection coefficients at its input. The output neuron is not centered. The offset of its output signal is chosen so that, at zero settings of the connection coefficients of the network neurons, the statistical sum (the binomial evaluation functional) takes the value characteristic of hypothesis H0 that the factors have no influence;

c) inclusion in the evaluation functional used to tune the network of an additional penalty (stabilizing, regularizing) term that prevents unbounded growth of the network's tuning parameters during optimization. Limiting the growth of the connection coefficients improves the generalizing ability of the ANN and leads to only minor biases in the risk estimates, provided the statistical sum remains the leading term of the functional. To control the strength of the penalty, the penalty term can be interpreted as a Bayesian correction based on a priori information about the nature of the "factors - risk" trends;

d) the traditional error back-propagation algorithm typical of multilayer ANNs is not used to tune the network. Instead, efficient algorithms for searching for an extremum in the multidimensional space of factors are applied. For example, a stochastic search for the global extremum can be combined with fine tuning by the gradient method of conjugate directions in a small neighborhood of the extremum.
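
Features a) to d) can be illustrated with the following minimal sketch. It is not the implementation of the invention, and every name in it is hypothetical. The assumptions are: inputs are scaled by midpoint and half-range; hidden neurons have no bias terms; the penalty is a simple sum of squared coefficients with weight `lam`; and a plain random-perturbation search stands in for the combination of stochastic global search and conjugate-direction refinement mentioned in d). The data are made up.

```python
import math
import random


def scale_inputs(column):
    """Feature a): map one factor column to [-1, +1] (midpoint/half-range)."""
    lo, hi = min(column), max(column)
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    return [0.0 if half == 0.0 else (v - mid) / half for v in column]


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, beta, n_hidden, bias_out):
    """Two-layer net: len(x) inputs, n_hidden centered neurons, one output."""
    n_in = len(x)
    hidden = []
    for h in range(n_hidden):
        w = beta[h * n_in:(h + 1) * n_in]
        # Feature b): centered neuron, zero response at zero coefficients.
        hidden.append(sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - 0.5)
    v = beta[n_hidden * n_in:]
    # The output neuron is not centered; it carries a fixed offset instead.
    return sigmoid(bias_out + sum(vi * hi for vi, hi in zip(v, hidden)))


def penalized_nll(beta, X, y, n_hidden, bias_out, lam):
    """Negative binomial log-likelihood plus an assumed quadratic penalty (c)."""
    nll = 0.0
    for xi, yi in zip(X, y):
        p = min(max(forward(xi, beta, n_hidden, bias_out), 1e-12), 1.0 - 1e-12)
        nll -= math.log(p) if yi == 1 else math.log(1.0 - p)
    return nll + lam * sum(b * b for b in beta)


def random_search(objective, dim, n_iter=500, step=0.3, seed=0):
    """Feature d) stand-in: gradient-free random perturbation search."""
    rng = random.Random(seed)
    best = [0.0] * dim  # start from the H0 model (all coefficients zero)
    best_val = objective(best)
    for _ in range(n_iter):
        cand = [b + rng.gauss(0.0, step) for b in best]
        val = objective(cand)
        if val < best_val:
            best, best_val = cand, val
    return best, best_val


# Toy cohort, invented for illustration: two factors and a binary outcome.
raw = [[20, 0.1], [30, 0.5], [40, 1.0], [50, 2.0], [60, 4.0], [35, 6.0]]
y = [0, 0, 0, 1, 0, 1]
cols = [scale_inputs(list(c)) for c in zip(*raw)]
X = [list(r) for r in zip(*cols)]
p0 = sum(y) / len(y)
bias_out = math.log(p0 / (1.0 - p0))  # at beta = 0 the net reproduces H0
n_hidden = 2
dim = n_hidden * len(X[0]) + n_hidden


def objective(b):
    return penalized_nll(b, X, y, n_hidden, bias_out, lam=0.01)


beta, value = random_search(objective, dim)
print("H0 objective:", objective([0.0] * dim), "tuned objective:", value)
```

Starting the search from all-zero coefficients means the first evaluated model is exactly H0, so any accepted step is an improvement over the no-influence hypothesis in terms of the penalized functional.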

The technical result of the proposed hybrid method of statistical analysis is an increase in the reliability of the multifactor estimation of trends of the probabilistic indicator of the studied risk in a representative sample (cohort), expressed as a reduction in the statistical bias of the estimates, accounting for the role of confounding factors, and a reduction in the number of false-positive and false-negative prognostic conclusions relative to the size of the studied cohort. When the invention is applied to a representative sample, it becomes possible to speak of measuring the probabilistic indicators of cohort risk.

Example of use. The operability of the invention is illustrated by a published epidemiological estimate of the probability of death from bone cancer among workers of a specialized enterprise who were occupationally exposed to various sources of ionizing radiation [Koshurnikova N.A.]. That publication is marked by a contradiction between the prescribed objective requirements of radiation protection [ICRP Publication 103; the current NRB-99/2009] and the negative trends of cancer risk with external exposure dose reported by the authors of the study [Koshurnikova N.A.]. The contradiction may be caused by the inadequacy of the estimation method used. There are several reasons that bias the trend estimate: 1) regulatory documents prescribe estimating the lifetime risk; instead, both the authors and most researchers of radiation-induced cancer risk [UNSCEAR] estimate the rate at which the risk is realized; 2) existing Poisson regression algorithms [Preston D., Epicure] cannot work with a list of individual observations and require grouping of the data, which reduces the statistical power of the study; 3) the models of risk trends with respect to the factors are postulated by the researchers before the estimates are made, which leads to uncontrolled systematic distortions.

The proposed invention eliminates the listed drawbacks by providing a direct estimate of the conditional lifetime risk and by moving to binomial-logistic regression, which does not require preliminary grouping of the data. Using an artificial neural network as a generator of models makes it possible to abandon the usually assumed linear "dose - effect" relationships, which are known to distort the estimates in the region of high doses (high probabilities).

In particular, the trends were re-estimated on the published material [Koshurnikova N.A.]. The total sample comprised 3155 men and 1019 women. Among them there were only 17 deaths from bone cancer (osteosarcoma). All individuals in the sample were hired at the enterprise after 1948 and died, from one cause or another, before 2008. The influencing factors included age at hire, age at realization of the effect, sex, absorbed dose of external exposure (up to 6.4 Gy) and absorbed dose of internal exposure (up to 107.4 Gy to the bone surface). The predicted quantity is the conditional lifetime risk, that is, the cumulative probability of death from bone cancer. The following was established:

- a statistically significant difference of the observed trends from the null hypothesis of their absence (P-value=0.021), even for an ANN with the simplest "5+2+1" architecture;

- the trends turned out to be increasing and weakly nonlinear both in the dose of external exposure and in the dose of internal exposure, which refutes the conclusions of the analog but agrees with the widely held view that ionizing radiation is harmful;

- for women, the cohort radiosensitivity for bone cancer was higher than for men, which is consistent with the general biological pattern;

- the central estimate of the nominal risk coefficient was about 1.0%⋅Gy^-1, which is comparable with the risk coefficients for the same disease among the survivors of the atomic bombings of Hiroshima and Nagasaki.

References

1. Narezkin D.V., Kuzmenkov A.Yu., Nedzimovskaya D.V. A method for predicting the course of the early postoperative period in patients with complications of rectal cancer and a means for its implementation. - Patent RU 2567038, registration date 24.06.2014.

2. Taranov Yu.A. CyberDoctor software package - A program for diagnosing thyroid diseases ("CyberDoctor: Neuronet-thyroid"). - Registration certificate No. 2015615066 of 07.05.2015.

3. Polonikov A.V., Solodilova M.A., Ivanov V.P. et al. A method for predicting the risk of hypertension in men. - Patent RU 2456608, registration date 15.03.2011.

4. Akimova E.V., Pushkarev G.S., Gakova E.I. et al. A method for determining the total cardiovascular risk of death in men. - Patent RU 2492804, registration date 12.05.2012.

5. Wilks S.S. The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. - The Annals of Mathematical Statistics, 1938, 9, pp. 60-62.

6. Koshurnikova, N.A. Bone Cancers in MAYAK Workers. / N.A. Koshurnikova, E.S. Gilbert, M. Sokolnikov [et al] // Radiation Research. - 2000. - 154. - P. 237-245.

7. Publication 103 of the International Commission on Radiological Protection (ICRP). Translated from English. / Ed. by M.F. Kiselev and N.K. Shandala. - Moscow: PKF Alana LLC, 2009.

8. Radiation safety standards NRB-99/2009. - Sanitary rules and regulations SanPiN 2.6.1.2523-09. - Approved by Resolution No. 47 of the Chief State Sanitary Doctor of the Russian Federation of July 7, 2009.

9. Effects of Ionizing Radiation. UNSCEAR 2006 Report, Vol. 1A. - NY: United Nations Publication, 2008. - 383 p.

10. Preston D., Lubin J., Pierce D. Epicure User's Guide. Release 2. - Hirosoft I.C., 1998. - 344 p.

Claims

A method for statistical estimation of a multifactorial trend of the conditional probability of the occurrence of a studied adverse event in a cohort study, which includes the steps of analyzing the observations of a representative sample with recorded individual outcomes and lists of quantitative and qualitative factors:

(1) the stage of creating or selecting for statistical processing a database containing individual information about the members of the cohort with a list of influencing factors, levels or other quantitative / qualitative characteristics of their impact, as well as the results of individual outcomes;

(2) the step of selecting a computer simulator of a multilayer artificial neural network;

(3) the stage of selecting the functional of statistical estimation and computer optimization of the quality of approximation of observations by a mathematical model of the desired trend;

(4) the decision-making phase of completing a data trend assessment along with an assessment of the quality of its prediction;

(5) the step of assessing the uncertainty of the result using statistical tests,

characterized in that

- a model for the connection of a probabilistic indicator with risk factors is generated by a multilayer artificial neural network in the process of its optimal tuning and is not postulated by the analyst before the study;

- to assess the quality of the neural network, the binomial-logistic regression functional is used, which allows replacing the training of the artificial neural network with its optimization by the maximum likelihood method and performing a posteriori statistical testing;

- in addition to the statistical sum, the stabilizing (regularizing) term is added to the evaluation functional, which improves the generalizing properties of an artificial neural network by limiting the growth trend of tuning coefficients of interneuronal connections that arose due to network redundancy;

- the work of all neurons of an artificial neural network except the last is centered; an offset is introduced into the work of the last neuron, the magnitude of which is selected so that, at zero settings of the neural network connection coefficients, the functional of its evaluation reaches the value characteristic of the null hypothesis that there are no effects of the studied factors.