RU2627673C2

RU2627673C2 - Method for noninvasive prenatal diagnostics of fetal aneuploidy

Info

Publication number: RU2627673C2
Application number: RU2015155052A
Authority: RU
Inventors: Катерина Сергеевна Пантюх; Егор Борисович Прохорчук; Артем Владимирович Артемов
Original assignee: Закрытое акционерное общество "Геноаналитика"
Priority date: 2015-12-22
Filing date: 2015-12-22
Publication date: 2017-08-09
Also published as: RU2015155052A

Abstract

FIELD: medicine.

SUBSTANCE: a method for non-invasive prenatal diagnosis of fetal aneuploidy is proposed. It includes extracting extracellular (cell-free) DNA (cfDNA) from a blood sample of a pregnant woman, selecting genome regions for amplification, preparing genomic libraries, mapping the resulting sequences to a reference genome or part of the human genome and determining their coordinates, and determining the coverage value for each genome region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, after which a conclusion is made about the presence of fetal aneuploidy. A method is also proposed for obtaining the genome regions used in this non-invasive prenatal diagnosis of fetal aneuploidy.

EFFECT: simple and economical way of prenatal diagnosis of fetal aneuploidy in the early stages of pregnancy.

7 cl, 3 dwg, 3 tbl, 5 ex

Description

FIELD OF THE INVENTION

The invention relates to medicine, namely to non-invasive prenatal diagnosis of fetal aneuploidies from cell-free DNA in the mother's blood, and can be used to detect fetal genetic abnormalities (aneuploidies, including monosomies and trisomies) in the first trimester of pregnancy by non-invasive methods that are safe for both the child and the mother.

Aneuploidy results from a change in karyotype in which the number of chromosomes in the fetal cells is not a multiple of the haploid set (in contrast to the normal karyotype, euploidy, in which the number of chromosomes equals two haploid sets). Examples of aneuploidy that can be detected with the claimed method are monosomy and trisomy, as well as partial trisomy or partial monosomy (respectively, the gain of additional copies or the deletion of large chromosome segments, usually one of the chromosome arms). Particular examples are trisomy 21 (Down syndrome), trisomy 13 (Patau syndrome), trisomy 18 (Edwards syndrome), monosomy X (Shereshevsky-Turner syndrome), or the presence of more than two sex chromosomes, for example Klinefelter syndrome (XXY), etc. The list of aneuploidy-associated diseases that can be diagnosed by the claimed method is not specifically limited.

State of the art

A method for diagnosing fetal genomic abnormalities is known from the prior art, in particular the diagnosis of the most common aneuploidies using standard invasive methods (for example, karyotyping of chorionic fluid or a placenta sample) and non-invasive methods (blood biochemistry, ultrasound).

However, standard non-invasive technologies are insufficiently accurate and only allow a risk group of pregnant women to be identified, while invasive methods in a small percentage of cases (0.5 to 2% according to various sources, depending on the physicians' experience) can lead to miscarriage or infection of the fetus.

Also known from the prior art is a method for non-invasive prenatal diagnosis of fetal aneuploidies from fetal cell-free DNA in the mother's blood by whole-genome sequencing of all cfDNA sequences of the maternal blood (Dan S, Wang W, Ren J, Clinical application of massively parallel sequencing-based prenatal noninvasive fetal trisomy test for trisomies 21 and 18 in 11 105 pregnancies with mixed risk factors // Prenat Diagn, 2012; Yuan Yuan, Fuman Jiang, Sang Hua, Feasibility Study of Semiconductor Sequencing for Noninvasive Prenatal Detection of Fetal Aneuploidy // Clinical Chemistry 59:5, 2013). This approach is based on sequencing the entire cfDNA fraction of maternal blood plasma and counting the reads mapped to the genome. Massively parallel short-read sequencing yields, in a single instrument run, millions of reads of maternal plasma cfDNA, which is the sum of maternal cfDNA and fetal cfDNA. Since the complete genome sequence is known, each read can be mapped to the reference genome to determine which chromosome it belongs to. If a chromosome is aneuploid, the number of reads belonging to that chromosome will be increased to a statistically significant degree. With this approach the increase in the number of reads is small. For example, with trisomy 21 and a fetal DNA fraction of 20%, the relative number of chromosome 21 reads is (0.8×2)+(0.2×3)=2.2, compared with (0.8×2)+(0.2×2)=2 in the normal case, that is, a relative increase of 10%.
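The arithmetic in this example generalizes to any fetal fraction. The following is a minimal Python sketch, assuming a diploid maternal background and read counts proportional to chromosome copy number; the function name and its parameters are illustrative and are not taken from the cited works.

```python
def relative_read_increase(fetal_fraction: float, fetal_copies: int = 3) -> float:
    """Expected relative change in reads from one chromosome.

    fetal_fraction: share of fetal cfDNA in maternal plasma (0..1).
    fetal_copies:   copies of the chromosome in the fetus (3 = trisomy, 1 = monosomy).
    Assumes the mother carries two copies and reads scale with copy number.
    """
    if not 0.0 <= fetal_fraction <= 1.0:
        raise ValueError("fetal_fraction must be between 0 and 1")
    maternal_fraction = 1.0 - fetal_fraction
    normal = maternal_fraction * 2 + fetal_fraction * 2          # always 2
    affected = maternal_fraction * 2 + fetal_fraction * fetal_copies
    return affected / normal - 1.0


if __name__ == "__main__":
    # The case from the text: trisomy 21 with a 20% fetal fraction.
    print(f"{relative_read_increase(0.20):.1%}")
```

At a 10% fetal fraction the same calculation gives a 5% increase, which illustrates why the signal becomes harder to detect as the fetal fraction falls.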

Precisely because such a small increase in reads has to be detected, a large number of cfDNA sequences (10-12 million) must be sequenced to determine trisomy reliably. Obtaining this much data requires expensive parallel genomic sequencing on next-generation sequencers, which prevents the technology from being introduced into routine practice. The development of new approaches that reduce the cost of testing while preserving the reliability of the result is therefore critically needed.

A method for diagnosing fetal aneuploidies by whole-genome sequencing is known from the prior art (Patent US 8318430). This method determines trisomy by sequencing predefined sequences from the whole genome. It accounts for the sequencing non-uniformity associated with the GC content of the DNA being read; this dependence is usually non-linear and varies not only between sequencing technologies but also between instruments of the same series and between versions of the reagents used. In addition, instead of a single cumulative metric for a whole chromosome, the genome is divided into many short segments (windows) and the reads falling into each window are counted, so that aneuploidy is determined by comparing two samples: windows from the chromosome under study and windows from all other chromosomes.

However, this method also relies on obtaining a large number of reads, which increases the test time.

The closest to the claimed method is a method for diagnosing fetal aneuploidies from fetal cfDNA in the mother's blood using differential methylation of maternal and fetal DNA (Patent Application RU 2012119187). This technology shortens the analysis time by selectively sequencing only those genome fragments that are differentially methylated in the fetus and in the mother. To this end, specially selected differentially methylated regions (DMRs) are amplified, after which the resulting DNA fragments undergo bisulfite conversion and the converted fragments are sequenced. Bisulfite conversion makes it possible to separate fetal reads from maternal reads accurately and to determine the presence of trisomy reliably with a much smaller data set than the whole-genome method requires.

However, the methylation profile has individual features in each person, which can reduce test accuracy and increase the minimum amount of data required, and hence the cost of the test. An important task is therefore to find a new selective approach based on differences between the DNA sequences of the mother and the fetus.

The present invention proposes a new approach to determining fetal aneuploidies by sequencing target regions of the genome (as in the last approach mentioned), but based on the difference in chromatin openness between maternal blood cells and the fetal placenta. It additionally uses a new step in which degenerate tags are added before the genomic libraries are prepared; these tags are later used to remove PCR duplicates, which bias the distribution of region coverage.

Disclosure of invention

The objective of the invention is to create a new method for prenatal diagnosis of aneuploidies from fetal cfDNA in the mother's blood.

Given the severity of the diseases associated with aneuploidy, the corresponding diagnosis may be grounds for an abortion. The speed of the diagnosis, the accuracy of the result, and the ability to perform the test at earlier stages of pregnancy by non-invasive methods that are safe for both the child and the mother are therefore of great importance.

The technical result is a simpler and more economical method for prenatal diagnosis of fetal aneuploidies that gives a reliable result while maintaining high accuracy, comparable to that of the approaches described above, in determining aneuploidies in the early stages of pregnancy.

The problem is solved in that the method for non-invasive prenatal diagnosis of fetal aneuploidies includes the following steps:

a. extracting cell-free DNA (cfDNA) from a blood sample obtained from a pregnant woman;

b. attaching to the cfDNA fragments molecular tags containing a degenerate nucleotide sequence, a universal sequence, and a sequence complementary to cfDNA genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells;

c. amplifying the tagged cfDNA fragments obtained in step b) using primers that anneal to the universal sequence of the molecular tags on one side and specific primers that anneal to the cfDNA genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells;

d. preparing genomic libraries from the amplicons obtained in step c);

e. determining the nucleotide sequence (sequencing) of the resulting genomic libraries;

f. mapping the obtained sequences (reads) to the reference genome or to parts of the human genome and determining their coordinates;

g. removing PCR duplicates from the mapped sequences (reads);

h. determining the coverage value for each genome region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells;

i. correcting the coverage value obtained in step h for each genome region by the total genome coverage, then comparing the corrected coverage value with the coverage values, or their distributions, obtained for a training set of blood samples from pregnant women with fetal euploidy and fetal aneuploidy, and assigning the test sample to one of these groups, on which basis the conclusion about the presence of fetal aneuploidy is made.
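Steps h and i can be illustrated with a short Python sketch. It assumes that reads have already been mapped and deduplicated, that each read is represented by its chromosome and start position, and that the correction for total coverage is a simple scaling to reads per million; the region names and coordinates are hypothetical, and the patent text does not prescribe this particular normalization.

```python
# Hypothetical target regions: name -> (chromosome, start, end), end exclusive.
REGIONS = {
    "r1": ("chr21", 100, 200),
    "r2": ("chr21", 500, 650),
    "r3": ("chr1", 1000, 1100),
}


def region_coverage(read_positions, regions):
    """Step h: count reads whose start position falls inside each region."""
    counts = {name: 0 for name in regions}
    for chrom, pos in read_positions:
        for name, (r_chrom, start, end) in regions.items():
            if chrom == r_chrom and start <= pos < end:
                counts[name] += 1
    return counts


def normalize(counts, total_reads):
    """Step i, first part: scale region counts to reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {name: c * 1_000_000 / total_reads for name, c in counts.items()}


if __name__ == "__main__":
    reads = [("chr21", 150), ("chr21", 160), ("chr21", 600),
             ("chr1", 1050), ("chr2", 5)]
    counts = region_coverage(reads, REGIONS)
    print(counts)
    print(normalize(counts, total_reads=len(reads)))
```

The normalized values are what would then be compared with the coverage distributions of the training samples.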

The sequence complementary to cfDNA genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells is determined (that is, the genome regions for targeted amplification are selected) from a database of coverage values of candidate genome regions for blood samples from pregnant women with euploidy and aneuploidy. For each candidate region, the significance of the coverage difference between euploid and aneuploid samples is calculated, expressed as a p-value and corrected for the total coverage of the sample, and those candidate regions with a p-value of at most 0.1 are selected.

The sample is assigned to the group with fetal euploidy or fetal aneuploidy as follows:

a. for each region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, two p-values are calculated from the database of coverage values of candidate genome regions for blood samples from pregnant women with euploidy and aneuploidy: the p-value giving the probability of observing the obtained coverage value, or a more extreme one, given that the value follows the coverage distribution for a pregnancy without aneuploidy, and the p-value that the coverage was drawn from the coverage distribution for a pregnancy with aneuploidy of the given chromosome;

b. the per-region p-values are multiplied over all regions to obtain the p-value that the coverage values of the regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells were drawn from the coverage distribution for a pregnancy without aneuploidy, and the p-value that the coverage values of the set of regions were drawn from the coverage distribution for a pregnancy with aneuploidy of the given chromosome;

c. from the resulting p-value products and the prior probabilities of fetal aneuploidy (the population risk), the probabilities of aneuploidy or euploidy in the test sample are calculated using Bayes' theorem.
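Steps b and c can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the per-region p-values are treated as independent and their products are used in place of likelihoods, as the text describes, and euploidy and aneuploidy of the given chromosome are treated as the only two hypotheses. The function name and the example numbers are hypothetical.

```python
import math


def posterior_aneuploidy(p_euploid, p_aneuploid, prior_aneuploidy):
    """Posterior probability of aneuploidy from per-region p-values.

    p_euploid / p_aneuploid: per-region p-values under each hypothesis.
    prior_aneuploidy: population risk of the aneuploidy (0..1).
    """
    if not 0.0 < prior_aneuploidy < 1.0:
        raise ValueError("prior_aneuploidy must be strictly between 0 and 1")
    # Step b: product of the per-region p-values for each hypothesis.
    product_eu = math.prod(p_euploid)
    product_an = math.prod(p_aneuploid)
    # Step c: Bayes' theorem with the two products weighted by the priors.
    numerator = product_an * prior_aneuploidy
    denominator = numerator + product_eu * (1.0 - prior_aneuploidy)
    if denominator == 0.0:
        raise ValueError("both products are zero; posterior is undefined")
    return numerator / denominator


if __name__ == "__main__":
    # Two regions whose coverage is unlikely under euploidy, with a 1% prior.
    print(posterior_aneuploidy([0.01, 0.02], [0.6, 0.7], prior_aneuploidy=0.01))
```

With many regions the plain products can underflow to zero, so a practical version would probably sum logarithms instead.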

A conclusion about the presence or absence of fetal aneuploidy is made if the probability for one of the diagnosis options does not exceed a significance threshold chosen from the interval 0.01-0.1 while the probability for the other option exceeds it; the diagnosis is assigned according to the higher probability. If both p-values are above or both are below the significance threshold, no diagnosis is made.
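The decision rule above can be written as a small Python function. This is one reading of the rule, offered as a sketch; the function name, the returned labels, and the default threshold of 0.05 (picked from the stated 0.01-0.1 interval) are assumptions.

```python
def call_diagnosis(p_euploid: float, p_aneuploid: float, threshold: float = 0.05) -> str:
    """Return 'euploid', 'aneuploid' or 'no call' for one test sample.

    A call is made only when exactly one of the two values exceeds the
    threshold; that hypothesis, which has the higher value, is reported.
    """
    if not 0.01 <= threshold <= 0.1:
        raise ValueError("threshold must lie in the interval 0.01-0.1")
    euploid_passes = p_euploid > threshold
    aneuploid_passes = p_aneuploid > threshold
    if euploid_passes == aneuploid_passes:
        # Both above or both below the threshold: no diagnosis is made.
        return "no call"
    return "euploid" if euploid_passes else "aneuploid"


if __name__ == "__main__":
    print(call_diagnosis(0.90, 0.001))
    print(call_diagnosis(0.001, 0.90))
    print(call_diagnosis(0.50, 0.50))
```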

Since the calculation of aneuploidy or euploidy is based on determining the coverage of selected genome regions, it is important to eliminate, as far as possible, any bias introduced into the representation of cfDNA fragments during sample preparation before sequencing. The main step that biases the initial distribution of cfDNA fragments is their amplification. Uneven amplification of different sequences can lead to over-representation of some individual fragments and loss of others. This problem is solved by introducing a degenerate molecular tag into the cfDNA.
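How a degenerate tag allows PCR duplicates to be removed can be sketched in Python. The sketch assumes that reads sharing the same mapping coordinates and the same degenerate tag are copies of one original cfDNA molecule, which is a common convention rather than a rule stated in the text; the Read type and its fields are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    chrom: str   # chromosome the read maps to
    pos: int     # mapped start position
    tag: str     # degenerate molecular tag read from the fragment


def remove_pcr_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read for every (chromosome, position, tag) combination."""
    seen = set()
    unique = []
    for read in reads:
        key = (read.chrom, read.pos, read.tag)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


if __name__ == "__main__":
    reads = [
        Read("chr21", 100, "ACGT"),
        Read("chr21", 100, "ACGT"),  # PCR copy of the first read
        Read("chr21", 100, "TTGA"),  # same position, different molecule
        Read("chr1", 50, "ACGT"),
    ]
    print(len(remove_pcr_duplicates(reads)))
```

Two fragments that map to the same position but carry different tags are kept as distinct molecules, which position-only deduplication would merge.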

The problem is also solved in that the genome regions found for determining fetal aneuploidies by sequencing, whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, can be provided on a machine-readable storage medium. The number of such regions is at least 10, with the genomic coordinates given for each region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells. For the genome regions in the set, coverage differs significantly (p-value<0.1) between samples with a fetus without aneuploidy and samples with fetal aneuploidy of a particular chromosome, corrected for the total coverage of the sample.

The problem is also solved in that the method for obtaining genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, for non-invasive prenatal diagnosis of fetal aneuploidies from maternal blood cfDNA by sequencing, includes the following steps:

a. obtaining sequencing data (whole-genome or targeted) for maternal blood cfDNA such that all candidate genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells are read, for blood samples from several pregnant women without fetal aneuploidy (at least 5 samples) and several pregnant women with fetal aneuploidy of a particular chromosome (at least 5 samples per aneuploidy);

b. mapping the obtained reads to the human reference genome to determine their coordinates (chromosome number and position on it);

c. determining the coverage of each candidate region in each sample;

d. calculating, for each region, the significance (expressed as a p-value) of the coverage difference between samples with a fetus without aneuploidy and samples with fetal aneuploidy of a particular chromosome, corrected for the total coverage of the sample;

e. selecting from the candidate genome regions those with a p-value of at most 0.1, which make up the set of genome regions for determining fetal aneuploidies.

Step d is performed under the assumption that region coverage in a sample follows a negative binomial distribution, for example using DESeq, a software package for determining differential RNA expression.
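The selection in steps c-e can be sketched as follows. This is only an illustration under stated assumptions: an exact permutation test on coverage normalized to the total coverage of each sample stands in for the negative binomial test of DESeq, and the names `permutation_p_value`, `select_regions`, and the dictionary layout are hypothetical, not part of the claimed method.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    NOTE: a stand-in for the negative binomial test named in the text.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(pooled[:n_a])))
    hits = 0
    count = 0
    # Enumerate every way to relabel the samples into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        count += 1
        if abs(mean_diff(sum(pooled[i] for i in idx))) >= observed - 1e-12:
            hits += 1
    return hits / count


def select_regions(coverage, totals, euploid_ids, aneuploid_ids, alpha=0.1):
    """Return regions whose normalized coverage differs between the groups.

    coverage: {region: {sample_id: read_count}} (hypothetical layout)
    totals:   {sample_id: total mapped reads in the sample}
    """
    selected = []
    for region, counts in coverage.items():
        # Adjust for total sample coverage, as required in step d.
        eu = [counts[s] / totals[s] for s in euploid_ids]
        an = [counts[s] / totals[s] for s in aneuploid_ids]
        if permutation_p_value(eu, an) <= alpha:
            selected.append(region)
    return selected
```

With 5 samples per group there are 252 possible relabelings, so the smallest p-value this stand-in test can return is 2/252 (about 0.008), which leaves the 0.1 threshold attainable.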

For the regions found in step d, read coverage may additionally be calculated, analogously to steps b-e, in blood samples from at least 5 men. For each region, the significance (expressed as a p-value) of the difference between DNA coverage in samples from men and in samples from pregnant women carrying a fetus without aneuploidy is determined, and regions with a p-value of no more than 0.1 are selected.

The claimed method is simpler than the method described in patent application RU 2012119187 because bisulfite conversion of libraries is removed from sample preparation. Bisulfite conversion of genomic libraries is needed to determine the methylation status of the genomic DNA regions selected for analysis. Maternal and fetal reads are then separated on the basis of differences in methylation status between maternal and fetal cfDNA sequences. The ability to separate maternal and fetal reads makes it possible to obtain a reliable result by sequencing only a small part of the genome. However, additional manipulation of the starting cfDNA material can introduce error into methylation-based aneuploidy detection and requires additional time and reagents. This motivates the search for a way to separate maternal and fetal reads without additional manipulation of the starting material.

The proposed method makes it possible to conclude that trisomy is present on the basis of differential chromatin accessibility rather than the methylation status of individual genome regions. Chromatin accessibility affects the efficiency of DNases, enzymes that cleave DNA. The degree of chromatin accessibility depends on many factors; in particular, accessibility is higher in genome regions that contain active promoters of expressed genes and in regions free of nucleosomes. Between maternal blood cells and the fetus, the coordinates of regions with increased chromatin accessibility differ because, for example, adult blood cells and placental cells actively express different sets of genes (as shown previously, the placenta is the main source of fetal cfDNA in maternal blood). When the nucleotide sequence is determined for genome regions with high chromatin accessibility for the mother (high accessibility means that coverage of the region is at least 20% above average) and low accessibility for the fetus (low accessibility means that coverage of the region is at least 20% below average), and the reads falling in these regions are counted, all of these reads can be expected to come from fetal cfDNA. Accordingly, in the presence of aneuploidy, the representation of reads in such a region changes. Since these reads come strictly from fetal cfDNA, the percentage change in the read count is greater than the percentage increase in the total read count (mother plus fetus) seen when trisomy is detected by the whole-genome method.
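The size of this effect can be illustrated with a short calculation. It is a sketch under idealized assumptions: the 10% fetal fraction is an illustrative number that does not come from the source, the reads in the selected regions are treated as purely fetal, and the function name `relative_read_increase` is hypothetical.

```python
def relative_read_increase(fetal_share, fetal_copies=3, normal_copies=2):
    """Relative change in read count for a region on the affected chromosome.

    fetal_share: fraction of reads in the region that come from fetal cfDNA.
    Only the fetal part of the signal scales with the fetal copy number.
    """
    return fetal_share * (fetal_copies / normal_copies - 1)


# Whole-genome approach: maternal and fetal reads are mixed.
whole_genome = relative_read_increase(0.10)  # assumed 10% fetal fraction

# Idealized selected regions: every read comes from fetal cfDNA.
fetal_only = relative_read_increase(1.0)

print(f"whole-genome increase: {whole_genome:.1%}")
print(f"fetal-only increase:   {fetal_only:.1%}")
```

Under these assumptions a trisomy shifts the mixed signal by about 5% but the purely fetal signal by 50%, which is why the regions are chosen to separate fetal from maternal reads.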

Thus, the claimed method makes it possible to separate maternal and fetal reads without additional manipulation of the starting cfDNA material, which increases the reliability of the data obtained.

Brief Description of the Drawings

The invention is illustrated by the drawings.

FIG. 1 shows a brief scheme of the trisomy test based on differential chromatin accessibility.

FIG. 2 is a graph showing the averaged coverage of the genome in the vicinity of DNase hypersensitivity sites by reads obtained by sequencing a sample of DNA circulating freely in the blood. Coverage drops noticeably in the vicinity of DNase hypersensitivity sites. The x-axis shows genome positions relative to the hypersensitivity site: the middle part of the graph is the site itself, and the left and right parts are its surroundings, where each point corresponds to a 10-nucleotide segment. The y-axis shows the total coverage of the segment over all analyzed samples, averaged over all DNase hypersensitivity sites.

FIG. 3 shows an example of the procedure for adding a degenerate tag and performing targeted amplification of genome regions.

Implementation of the Invention

The method for prenatal diagnosis of aneuploidy using fetal cfDNA in maternal blood involves analysis of the mother's blood serum. For the analysis, blood is drawn into a vacuum tube and centrifuged to separate the plasma from the cells. cfDNA is isolated from the plasma on columns, after which degenerate molecular tags are added to the cfDNA fragments and genomic libraries are prepared. The nucleotide sequence of the library fragments is then determined, which amounts to digital analysis of extracellular DNA by sequencing. The method is based on massively parallel whole-genome sequencing, which yields up to a billion short reads through random fragmentation and subsequent amplification of genomic DNA. The resulting short reads are subjected to statistical analysis (which can be implemented in software), including a step that removes PCR duplicates.

Each step of the claimed method is described in more detail below.

Blood sampling

The test material is venous blood from the pregnant woman, which eliminates the risk of fetal infection or miscarriage associated with standard invasive procedures such as chorionic villus sampling or amniocentesis. Maternal peripheral blood is collected, for example, into two 9 ml tubes containing EDTA to prevent coagulation. After collection, the contents of the tubes are mixed by inverting each tube 10 times. The tubes are then transported immediately to the laboratory for plasma preparation. The tubes must be transported at +4°C to prevent the breakdown of maternal blood cells and the resulting increase in the fraction of maternal genomic DNA in the plasma cfDNA. Plasma must be prepared no later than 4 hours after blood collection, to prevent enrichment of the cfDNA fraction with maternal genomic DNA from degrading maternal blood cells.

Plasma preparation

Plasma can be prepared by a known method. In particular, the 9 ml tubes are first centrifuged at 1,600 g for 10 minutes at +4°C to separate the cell-rich plasma fraction. After centrifugation, the upper phase is transferred to several 2 ml tubes chilled on ice, without disturbing the interphase, which may contain maternal blood cells. The tubes are labeled to match the original sample. The 2 ml tubes are then centrifuged a second time at 16,000 g for 10 minutes at +4°C to remove cell fragments remaining in the plasma. The supernatant is transferred to chilled 2 ml LoBind tubes (DNA LoBind Tube 2.0 ml (Eppendorf AG, Cat. no.: 022431048)). The supernatant must be collected carefully, without disturbing the small cell pellet. The tubes are labeled to match the original sample.

Isolation of freely circulating DNA from blood

cfDNA is isolated from plasma according to the standard protocol of the QIAamp Circulating Nucleic Acid Kit (Catalog no. 55114).

Addition of a degenerate molecular tag

The procedure has 2 steps. In the first step, primers of the "Univ-N-spec" group are annealed to cfDNA fragments isolated from the blood plasma of pregnant women. Primers of the "Univ-N-spec" group consist of 3 parts: "Univ", "N", and "spec" (from the 3' to the 5' end). The "Univ" part is a universal nucleotide sequence that is identical across the whole "Univ-N-spec" group; in the next step, primers of the "S" group anneal to this sequence. The "N" part consists of several randomly chosen nucleotides and differs for every primer in the set. After PCR, amplicons with the same "N" sequence are treated as PCR duplicates, and a set of PCR duplicates is counted as 1 read when region coverage is calculated. The "spec" part is a specific sequence complementary to the forward-primer binding site in genome regions in which chromatin openness differs by at least 20% between the placenta and maternal blood cells. In the second step, these genome regions are amplified by targeted amplification using primers of the "S" and "R" groups. Primers of the "S" group are complementary to the universal "Univ" sequence introduced into the cfDNA fragments in the first step. Primers of the "R" group are specific sequences complementary to the reverse-primer binding site in the selected genome regions.

Because the "Univ-N-spec" primer contains a sequence complementary to cfDNA genome regions in which chromatin openness differs by at least 20% between the placenta and maternal blood cells, the regions from which genomic libraries will be prepared are selected at this stage. This makes it possible to sequence only a small part of the genome, namely the pre-selected regions, which substantially shortens the time needed for the test (from 1-2 weeks to 3 days).

The additional tagging step attaches its own degenerate tag to each unique cfDNA molecule present in the sample before genomic libraries are prepared. After sequencing, the tag makes it possible to remove PCR duplicates and calculate the true coverage of each region. Specific PCR amplification of the selected genome regions yields a set of amplicons, that is, DNA molecules from the target regions. In the next step, genomic libraries are prepared from these amplicons rather than from total cfDNA. With the standard library preparation procedure, all genome regions are represented equally in the library. After specific PCR amplification of the selected regions, more than 95% of the DNA fragments in the library come from a small number of genome regions that occupy, for example, about 1.5-2% of the genome. In other words, with preliminary amplification of the target regions, the representation of the regions of interest among all DNA fragments in the library increases on average 50- to 60-fold.
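The enrichment figure follows from the two percentages given above, as the short calculation below shows. The function name `fold_enrichment` is hypothetical, and the calculation assumes that without enrichment the share of library fragments from the target regions equals their share of the genome.

```python
def fold_enrichment(library_fraction, genome_fraction):
    """Ratio of the targets' share of the library to their share of the genome."""
    return library_fraction / genome_fraction


# 95% of library fragments come from targets covering 1.5-2% of the genome.
low = fold_enrichment(0.95, 0.02)    # targets occupy 2% of the genome
high = fold_enrichment(0.95, 0.015)  # targets occupy 1.5% of the genome

print(f"enrichment between {low:.1f}x and {high:.1f}x")
```

The resulting range of roughly 48-fold to 63-fold is consistent with the average 50- to 60-fold increase stated in the text.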

Preparation of the genomic library

A genome region is a part of a DNA sequence or a fragment of a DNA molecule that belongs to a specific location in the genome. The location is given by genomic coordinates; for example, chr21 32925263 32925495 means that the part of the DNA molecule lies on chromosome 21, starting at nucleotide 32925263 and ending at nucleotide 32925495.
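A minimal sketch of how such a coordinate string might be represented in software is given below. The class `GenomeRegion` and the function `parse_region` are hypothetical names, and the sketch assumes that both the start and the end nucleotide belong to the region, as the wording of the definition suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRegion:
    chrom: str
    start: int  # first nucleotide of the region
    end: int    # last nucleotide of the region (assumed inclusive)

    @property
    def length(self) -> int:
        # Both endpoints are counted under the inclusive assumption.
        return self.end - self.start + 1


def parse_region(text: str) -> GenomeRegion:
    """Parse a whitespace-separated 'chromosome start end' string."""
    chrom, start, end = text.split()
    region = GenomeRegion(chrom, int(start), int(end))
    if region.start > region.end:
        raise ValueError(f"start exceeds end in region: {text!r}")
    return region


region = parse_region("chr21 32925263 32925495")
print(region.chrom, region.length)
```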

A genomic library is a specially prepared DNA sample that can be read on a sequencer. The standard procedure for preparing genomic libraries includes the following operations on DNA molecules: fragmentation, end repair, adapter ligation, size selection, and PCR amplification.

According to the present invention, this procedure is modified: the fragmentation step is omitted, since extracellular DNA consists of short DNA molecules and needs no further fragmentation.

Sequencing

The resulting genomic libraries are then sequenced. Sequencing is performed according to the standard protocol on next-generation sequencers, which determine the nucleotide sequence of a large number of reads (from hundreds to hundreds of millions) in a single instrument run. Specific examples of technologies (instruments) that can be used are: sequencing by synthesis on molecular colonies (Genome Analyzer, HiSeq, MiSeq (Illumina)), ligation sequencing using emulsion PCR (SOLiD4, 5500-series (Life Technologies)), semiconductor sequencing (Ion Torrent, Ion Proton (Life Technologies)), pyrosequencing (454 (Roche)), etc. The claimed method is not limited to the listed sequencing technologies (instruments). Sequencing the genomic libraries yields the nucleotide sequence of all fragments that make up the sequenced library.

The nucleotide sequence of each library fragment, as determined by sequencing, is called a read.

The genomic coordinates of all reads are then determined. This process is called mapping and is performed with standard software (for example, BWA or Bowtie). A read with determined genomic coordinates is called a mapped read.

Removal of PCR duplicates

After the coordinates of each read are determined, all data are filtered according to the following rule: if reads have

a) the same start and end coordinates and

b) the same degenerate tag,

then all copies of such reads except one are removed from the data.

This operation can be performed with the standard software tool NuDup (Nugen).
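The filtering rule can be sketched as follows. This is not the NuDup implementation; it is a minimal illustration in which each read is assumed to be a dictionary with the hypothetical keys `chrom`, `start`, `end`, and `tag`, and the chromosome is treated as part of the coordinates.

```python
def remove_pcr_duplicates(reads):
    """Keep one read per (chromosome, start, end, degenerate tag) combination."""
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["end"], read["tag"])
        if key not in seen:
            # First read with this combination: keep it.
            seen.add(key)
            unique.append(read)
        # Later reads with the same key are PCR duplicates and are dropped.
    return unique


reads = [
    {"chrom": "chr21", "start": 100, "end": 250, "tag": "ACGT"},
    {"chrom": "chr21", "start": 100, "end": 250, "tag": "ACGT"},  # duplicate
    {"chrom": "chr21", "start": 100, "end": 250, "tag": "TTGA"},  # other molecule
]
print(len(remove_pcr_duplicates(reads)))
```

Reads that share coordinates but carry different tags are kept, because they originate from different cfDNA molecules.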

Determination of coverage of genome regions

From the mapped reads, those that intersect the genome regions under study are selected, that is, reads whose reference-genome coordinates overlap the coordinates of the regions. For each position within a region (each position being one nucleotide), its coverage is calculated as the number of reads covering that position. The mean coverage is then calculated over all positions of each region.

The mean coverage is normalized to the total number of mapped reads in the sample.
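The coverage calculation can be sketched as follows. The tuple layout `(chromosome, start, end)` with inclusive coordinates and the function names are assumptions made for illustration. Summing the overlap lengths and dividing by the region length gives the same result as averaging the per-position coverage.

```python
def mean_region_coverage(region, reads):
    """Mean per-position coverage of a region by mapped reads."""
    chrom, start, end = region
    covered_positions = 0
    for read_chrom, read_start, read_end in reads:
        if read_chrom != chrom:
            continue
        # Number of region positions covered by this read (inclusive ends).
        overlap = min(end, read_end) - max(start, read_start) + 1
        if overlap > 0:
            covered_positions += overlap
    return covered_positions / (end - start + 1)


def normalized_coverage(region, reads):
    """Mean coverage divided by the total number of mapped reads in the sample."""
    if not reads:
        raise ValueError("the sample contains no mapped reads")
    return mean_region_coverage(region, reads) / len(reads)


region = ("chr21", 100, 109)
reads = [("chr21", 95, 104), ("chr21", 100, 109), ("chr21", 200, 210)]
print(mean_region_coverage(region, reads), normalized_coverage(region, reads))
```

In this sketch `reads` is assumed to hold every mapped read of the sample after duplicate removal, so its length serves as the normalizing total.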

Selection of genome regions

Before the genomic library enrichment step, the genome regions to be used for detecting aneuploidy by the described method must be selected. First, genome regions meeting the criteria described below are selected; then at least 10 regions chosen at random from the resulting list form the set of genome regions used for subsequent analysis.

When genome regions are selected, the distributions of their coverage values are calculated in training-set samples from normal pregnancies and from pregnancies with fetal aneuploidy. The resulting coverage values form a database of coverage of candidate genome regions for blood samples from pregnant women with euploidy and aneuploidy.

После формирования базы данных покрытий кандидатных регионов генома для образцов крови беременных женщин с эуплоидией и анеуплоидией, выбирают регионы генома, которые имеют значимо разную открытость хроматина между плацентой и другими тканями. В качестве меры открытости хроматина рассматривают данные о покрытии регионов генома прочтениями после обработки ДНКазой (фермент, который разрезает преимущественно открытую ДНК, не связанную с нуклеосомами), опубликованные в проекте ENCODE (https://genome.ucsc.edu/ENCODE/downloads.html). В проекте ENCODE также опубликованы пики, которые детектируются как пики в покрытии ридами после обработки ДНКазами в образцах плаценты и не детектируются в образцах клеток крови.After forming a database of coatings of candidate regions of the genome for blood samples of pregnant women with euploidy and aneuploidy, regions of the genome that have significantly different chromatin openness between the placenta and other tissues are selected. As a measure of chromatin openness, we consider data on coverage of genome regions with readings after treatment with DNase (an enzyme that cuts mostly open DNA that is not associated with nucleosomes), published in the ENCODE project (https://genome.ucsc.edu/ENCODE/downloads.html ) The ENCODE project also published peaks that are detected as peaks in the readings after treatment with DNases in placenta samples and are not detected in blood cell samples.

После нахождения регионов генома на хромосоме 21, являющихся участками гиперчувствительности в крови, но не являющихся участками гиперчувствительности в плаценте, формируют набор регионов генома, в который может входить разное подмножество регионов генома (не менее 10 регионов генома). Для поиска наилучшего подмножества участков используют полногеномный сиквенс свободно циркулирующей ДНК для образцов трисомии плода по хромосоме 21 и с нормальной беременностью. Для каждого кандидатного региона вычисляют его покрытие прочтениями в каждом образце. При помощи пакета DESeq, обычно используемого для анализа дифференциальной экспрессии генов, для каждого кандидатного региона определяют p-value того, что существует значимое отличие в покрытии отрезка между образцами с трисомией плода и эуплоидией плода. Выбирают участки с наибольшим отличием между образцами с трисомией плода и с нормальной беременностью, такие, что покрытие при трисомии превышает покрытие при нормальной беременности.After finding regions of the genome on chromosome 21, which are areas of hypersensitivity in the blood, but not areas of hypersensitivity in the placenta, a set of regions of the genome is formed, which may include a different subset of the regions of the genome (at least 10 regions of the genome). To find the best subset of sites, a full-genome sequence of freely circulating DNA is used for fetal trisomy samples on chromosome 21 and with normal pregnancy. For each candidate region, its coverage by readings in each sample is calculated. Using the DESeq package, usually used to analyze differential gene expression, for each candidate region, the p-value of the fact that there is a significant difference in the coverage of the segment between samples with fetal trisomy and fetal euploidy is determined. Select the areas with the greatest difference between samples with fetal trisomy and normal pregnancy, such that the coverage with trisomy exceeds the coverage with normal pregnancy.

For each region in each sample, the read coverage is stored, normalized to the total coverage of the sample, for samples from normal pregnancies and samples with fetal trisomy. These genome regions are then used to build the prenatal test, and the stored coverage values are used for statistical analysis. Genome regions were selected whose chromatin openness differs significantly between the placenta and other tissues. Genome-wide ENCODE data on the hypersensitivity of DNA loci to DNase, which preferentially cuts open DNA not bound to nucleosomes, were examined. Chromatin openness was measured by the read coverage of a genome region after DNase treatment, as published by the ENCODE project (https://genome.ucsc.edu/ENCODE/downloads.html).

When the genomic sequences (reads) from a pregnant woman's blood sample are analyzed, the previously constructed distributions are used to determine how likely aneuploidy or euploidy is in the sample. For each genome region, its own coverage distributions in the training set are used to calculate p-values for two null hypotheses: "the coverage in this region corresponds to aneuploidy" and "the coverage in this region corresponds to euploidy". The p-value is calculated in the standard way, as the probability of observing a more extreme value of relative coverage (one further from the mean) under the distribution used. The p-values obtained for each region of the chromosome tested for aneuploidy are multiplied together, separately for each null hypothesis. This yields the conditional probabilities of observing the obtained coverage values given fetal aneuploidy for the chromosome and given euploidy: P(X|aneuploidy) and P(X|euploidy), where X denotes the coverage values observed in the sample.
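The per-region p-values and their products can be illustrated with a minimal sketch. It assumes, purely for illustration, that the training distribution of normalized coverage in each region is normal with a known mean and standard deviation; the description does not state which distribution is used. The function names, region names and numbers are hypothetical.

```python
import math

def two_sided_p(x, mean, sd):
    """Probability of a value at least as far from the mean as x,
    under an ASSUMED normal distribution Normal(mean, sd)."""
    z = abs(x - mean) / sd
    return math.erfc(z / math.sqrt(2.0))

def product_of_p_values(coverages, params):
    """Multiply the per-region p-values for one null hypothesis.

    coverages: {region: observed normalized coverage}
    params:    {region: (mean, sd)} taken from the training set
    """
    product = 1.0
    for region, x in coverages.items():
        mean, sd = params[region]
        product *= two_sided_p(x, mean, sd)
    return product

# Hypothetical training parameters and observations for two regions.
observed = {"region_1": 1.05, "region_2": 1.04}
euploid_params = {"region_1": (1.00, 0.02), "region_2": (1.00, 0.02)}
trisomy_params = {"region_1": (1.05, 0.02), "region_2": (1.05, 0.02)}

p_x_given_euploidy = product_of_p_values(observed, euploid_params)
p_x_given_aneuploidy = product_of_p_values(observed, trisomy_params)
```

With many regions the product can become very small, so summing logarithms of the p-values may be numerically safer than multiplying them directly.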

The probabilities of aneuploidy and euploidy given the observations can be calculated by Bayes' theorem as follows:

P(aneuploidy), the prior probability of fetal trisomy, is estimated as the population probability of trisomy for the chromosome under study, taking the mother's age into account (for example, 2*10^-3). A somewhat inflated value of the aneuploidy probability is used to minimize the risk of a false negative diagnosis (that is, failing to detect trisomy in a pregnancy with trisomy).
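A minimal sketch of the Bayesian step follows, assuming the standard two-hypothesis form of Bayes' theorem with P(euploidy) = 1 - P(aneuploidy). This normalization is an assumption and may differ from the formula the authors used, since the two values reported in Example No. 3 do not sum to one. The function name and default prior are illustrative.

```python
def posterior_aneuploidy(p_x_aneu, p_x_eup, prior_aneu=2e-3):
    """P(aneuploidy | X) under the ASSUMED standard two-hypothesis Bayes form.

    p_x_aneu: P(X | aneuploidy), product of per-region p-values
    p_x_eup:  P(X | euploidy), product of per-region p-values
    prior_aneu: prior probability of aneuploidy (example value from the text)
    """
    prior_eup = 1.0 - prior_aneu
    numerator = p_x_aneu * prior_aneu
    denominator = numerator + p_x_eup * prior_eup
    if denominator == 0.0:
        raise ValueError("both likelihoods are zero; posterior is undefined")
    return numerator / denominator
```

Under this form, equal likelihoods leave the prior unchanged, and a zero likelihood for euploidy gives a posterior of one for aneuploidy.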

Unlike the statistical metrics used in other methods of trisomy detection, the resulting probabilities P(aneuploidy|X) and P(euploidy|X) allow a diagnosis to be made without a preliminary search for optimal thresholds on any metric. To make a diagnosis, it is sufficient that one of the probabilities be below the significance threshold (for example, 0.05) and the other above it; the alternative with the low probability is then rejected and the alternative with the high probability is accepted. If neither alternative is rejected and both probabilities are above the threshold, the method reports that a diagnosis cannot be made (no call).
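The decision rule can be sketched as follows. The function name and return labels are hypothetical; the default threshold of 0.05 is the example value from the text, and the treatment of the case where both probabilities fall below the threshold follows claim 4.

```python
def call_diagnosis(p_aneuploidy, p_euploidy, threshold=0.05):
    """Return "aneuploidy", "euploidy" or "no call".

    An alternative is rejected when its probability does not exceed
    the threshold. A diagnosis is made only when exactly one
    alternative survives.
    """
    aneuploidy_kept = p_aneuploidy > threshold
    euploidy_kept = p_euploidy > threshold
    if aneuploidy_kept and not euploidy_kept:
        return "aneuploidy"
    if euploidy_kept and not aneuploidy_kept:
        return "euploidy"
    return "no call"  # both kept or both rejected

# Values reported in Example No. 3 of this description.
result = call_diagnosis(0.12, 7e-5)
```

Applied to the values from Example No. 3, the rule rejects euploidy and keeps aneuploidy, which matches the conclusion stated there.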

This feature is also an advantage of the method over comparable methods, because it allows a diagnosis to be withheld when it cannot be made reliably, rather than issuing a diagnosis of low reliability.

Examples of carrying out the invention

Example No. 1. Material collection

Blood was collected in 9 ml EDTA tubes from a woman undergoing prenatal genetic diagnosis at the 11th week of pregnancy. The blood was stored for no more than three hours at +4°C. No later than three hours after phlebotomy, the blood tubes were centrifuged for 10 min at 2000 g at +4°C to obtain platelet-rich plasma. The plasma was then centrifuged again for 15 min at 16000 g at +4°C to obtain plasma free of whole blood cells. Extracellular DNA (cfDNA) was isolated from the purified plasma with the QIAamp Circulating Nucleic Acid Kit (Qiagen), following the kit instructions. The concentration of the resulting cfDNA was measured with a Qubit 2.0 fluorometer (Life Technologies).

Example No. 2. Laboratory procedure for sample preparation and sequencing of genomic libraries

At the step of introducing the degenerate label, 20 ng of cfDNA isolated from blood plasma was used in the reaction. The cfDNA was first denatured in polymerase buffer (NEBuffer 2) with 1 µl of Univ-N-specl primer at 95°C for 5 min. The reaction mixture was then placed on ice for 5 min to anneal the primer. Nucleotides were then added and the mixture was incubated at 37°C for 30 min to extend the complementary strand. After strand extension, the mixture was purified with AMPure beads at 1.8 volumes following the standard protocol (Agencourt AMPure XP Bead), after which PCR amplification was carried out. For amplification, 10 µl of PCR master mix (NEBNext® High-Fidelity 2X PCR Master Mix), 1 µl of primer "S1", 1 µl of primer "R1" and 8 µl of purified sample were used. The amplification program was: initial denaturation at 98°C for 30 s; 20 cycles of denaturation at 98°C for 10 s, primer annealing at 65°C for 30 s and strand extension at 72°C for 30 s; final extension at 72°C for 5 min; then storage at 4°C.

Primer examples

Group "Univ-N-spec":

Group "S":

Group "R":

After PCR, the amplicons were purified with AMPure beads at 1.8 volumes following the standard protocol (Agencourt AMPure XP Bead), and the entire volume obtained after purification (10 µl) was used for library preparation. The library was prepared with reagent kits compatible with the Illumina platform, NEBNext DNA library prep reagent set for Illumina and NEBNext multiplex oligos for Illumina (New England Biolabs), following the kit instructions. Library preparation included end repair and blunting of the cfDNA, adapter ligation (for 10 hours) and PCR amplification (15 cycles). The concentration of the resulting library, measured with a Qubit 2.0 fluorometer (Life Technologies), was 13.5 ng/µl. Library size and quality were assessed with a Bioanalyzer 2100 instrument (Agilent); the length was 290±30 bp.

The resulting library was sequenced on a HiSeq 1500 instrument (Illumina) using a HiSeq Rapid SR flow cell (Illumina). The reads of the sequenced library were obtained in *.fasta format.

Example No. 3. Determination of the presence of trisomy

Whole-genome sequencing yielded 953,406 reads (a file in *.fasta format). The reads were mapped with the bowtie2 program to the human reference genome hg19 to determine their genomic coordinates. Reads whose genomic coordinates could not be determined were discarded. The remaining reads were filtered to remove PCR duplicates. Of the 762,565 successfully mapped reads, 490,364 remained for further analysis after duplicate removal. For each region, the mean read coverage was calculated. For each coverage value, the probability of observing it or a more extreme value was determined under a normal karyotype and under an aneuploid karyotype for chromosome 21. After multiplying these values and calculating the probabilities of trisomy and of a normal karyotype (see the formula in the description), the following values were obtained: P(aneuploidy|X)=0.12 and P(euploidy|X)=7*10^-5. It was concluded that the fetus has trisomy of chromosome 21.
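The duplicate-removal step can be sketched as follows. The example does not specify how duplicates are identified; this sketch assumes that two mapped reads are PCR duplicates when they share the same chromosome, start coordinate, strand and degenerate label sequence. The record layout and field names are hypothetical.

```python
def remove_pcr_duplicates(reads):
    """Keep the first read seen for each (chrom, start, strand, tag) key.

    ASSUMPTION: reads with identical mapping position and identical
    degenerate label are copies of one original cfDNA fragment.
    """
    seen = set()
    unique = []
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["tag"])
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique

# Hypothetical mapped reads: the first two are duplicates of each other.
mapped_reads = [
    {"chrom": "chr21", "start": 1000, "strand": "+", "tag": "ACGT"},
    {"chrom": "chr21", "start": 1000, "strand": "+", "tag": "ACGT"},
    {"chrom": "chr21", "start": 1000, "strand": "+", "tag": "TTGA"},
    {"chrom": "chr21", "start": 2500, "strand": "-", "tag": "ACGT"},
]
deduplicated = remove_pcr_duplicates(mapped_reads)
```

Including the degenerate label in the key lets distinct fragments that happen to map to the same position be counted separately.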

Example No. 4. Search for candidate genome regions

Genome regions were selected whose chromatin openness differs significantly between the placenta and other tissues. Genome-wide ENCODE data on the hypersensitivity of DNA loci to DNase, which preferentially cuts open DNA not bound to nucleosomes, were examined. Chromatin openness was measured by the read coverage of a genome region after DNase treatment, as published by the ENCODE project (https://genome.ucsc.edu/ENCODE/downloads.html). The ENCODE project has also published peaks that are detected in read coverage after DNase treatment in placenta samples but are not detected in blood cell samples.

1,500 sites were found on chromosome 21 (out of 124,000 sites on all chromosomes) that are hypersensitivity sites in blood but not in the placenta. This set, or a subset of it, can be used for further analysis.
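The selection of sites that are hypersensitive in blood but not in the placenta can be sketched as an interval comparison. The sketch assumes peaks are given as half-open intervals (chromosome, start, end) and that a blood peak is excluded if it overlaps any placenta peak by at least one base; the example does not state the overlap criterion. The function name and coordinates are hypothetical.

```python
def blood_only_peaks(blood_peaks, placenta_peaks):
    """Return blood peaks that overlap no placenta peak.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    ASSUMPTION: any overlap of one base or more disqualifies a peak.
    """
    placenta_by_chrom = {}
    for chrom, start, end in placenta_peaks:
        placenta_by_chrom.setdefault(chrom, []).append((start, end))

    selected = []
    for chrom, start, end in blood_peaks:
        others = placenta_by_chrom.get(chrom, [])
        overlaps = any(s < end and start < e for s, e in others)
        if not overlaps:
            selected.append((chrom, start, end))
    return selected

# Hypothetical peak lists.
blood = [("chr21", 100, 200), ("chr21", 500, 600), ("chr1", 100, 200)]
placenta = [("chr21", 150, 250), ("chr1", 300, 400)]
candidates = blood_only_peaks(blood, placenta)
```

The pairwise comparison is quadratic in the number of peaks; for genome-wide peak sets a sorted sweep or an interval tree would be more efficient.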

Example No. 5. Selection, among regions on chromosome 21 with differential chromatin accessibility, of regions whose coverage differs significantly between trisomy and normal samples

Whole-genome sequencing data of freely circulating DNA were used for samples with fetal trisomy of chromosome 21 and for samples from normal pregnancies. For each candidate region, its read coverage was calculated in each sample. Using the DESeq package, which is commonly used for differential gene expression analysis, a p-value was determined for each candidate region for the existence of a significant difference in coverage between samples with fetal trisomy and samples with fetal euploidy. The 100 regions with the greatest difference between trisomy and normal-pregnancy samples were selected, such that coverage in trisomy exceeded coverage in normal pregnancy. These genome regions were used to build the prenatal test.
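The ranking of candidate regions can be illustrated with a simplified sketch. It is not DESeq and does not fit a negative binomial model or compute p-values; it only normalizes each region's read count to the total reads of the sample and ranks regions by the ratio of mean normalized coverage in trisomy samples to that in euploid samples, keeping regions where the trisomy coverage is higher. All names and numbers are hypothetical.

```python
def normalize(counts, total_reads):
    """Divide each region's read count by the sample's total read count."""
    return {region: count / total_reads for region, count in counts.items()}

def mean_by_region(samples):
    """Average normalized coverage per region across samples."""
    regions = samples[0].keys()
    return {r: sum(s[r] for s in samples) / len(samples) for r in regions}

def top_regions(trisomy_samples, euploid_samples, n=100):
    """Rank regions by the trisomy/euploid coverage ratio (simplified stand-in).

    Each sample is a (counts_by_region, total_reads) pair.
    Only regions with higher mean coverage in trisomy are kept.
    """
    tri = mean_by_region([normalize(c, t) for c, t in trisomy_samples])
    eup = mean_by_region([normalize(c, t) for c, t in euploid_samples])
    ratios = {r: tri[r] / eup[r] for r in tri if eup[r] > 0 and tri[r] > eup[r]}
    return sorted(ratios, key=ratios.get, reverse=True)[:n]

# Hypothetical counts for three regions in one sample per group.
trisomy = [({"a": 15, "b": 10, "c": 9}, 1000)]
euploid = [({"a": 10, "b": 10, "c": 10}, 1000)]
selected = top_regions(trisomy, euploid, n=2)
```

A ratio-based ranking ignores sample-to-sample variance, which is why a model-based test such as the one named in the example is preferable with real data.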

Claims

1. The method of non-invasive prenatal diagnosis of fetal aneuploidy, including:

a. the extraction of extracellular DNA (cfDNA) from a blood sample obtained from a pregnant woman;

b. selection of genome regions for amplification, before preparing genomic libraries;

c. introducing into the cfDNA fragments molecular labels containing a degenerate nucleotide sequence, a universal sequence and a sequence complementary to genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells;

d. amplifying the labeled cfDNA fragments obtained in step c) using primers that anneal to the universal sequence of the molecular labels on one side and specific primers that anneal to genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells;

e. preparation of genomic libraries from amplicons obtained in step d);

f. determination of the nucleotide sequence of the resulting genomic libraries;

g. mapping of the obtained sequences to the reference genome or parts of the human genome with the determination of their coordinates;

h. removal of PCR duplicates from mapped sequences;

i. determining the coverage value for each genome region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, and obtaining the genome regions with the indicated chromatin openness;

j. adjusting the coverage value obtained for each genome region in step i) to the total coverage of the genome, then comparing the adjusted coverage value with the coverage values, or their distributions, obtained for a training set of blood samples from pregnant women with fetal euploidy and aneuploidy, and determining to which of these groups the test sample belongs, on the basis of which a conclusion is made about the presence of fetal aneuploidy.

2. The method according to claim 1, characterized in that the selection of genome regions in step b) includes determining a sequence complementary to genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, from the database of coverages of candidate genome regions for blood samples of pregnant women with euploidy and aneuploidy, wherein the significance of the difference in coverage between samples with eu- and aneuploidy, expressed as a p-value and adjusted for the total coverage of the sample, is calculated for each candidate region, and those candidate regions with a p-value of not more than 0.1 are selected.

3. The method according to claim 2, characterized in that whether the sample belongs to the group with fetal euploidy or aneuploidy is determined as follows:

a. for each region whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, calculating a p-value giving the probability of observing the obtained coverage value or a more extreme value, provided that this value follows the distribution of coverages for pregnancy without aneuploidy, and a p-value for its coverage being drawn from the distribution of coverages for pregnancy with aneuploidy for the given chromosome, according to the database of coverages of candidate genome regions for blood samples of pregnant women with euploidy and aneuploidy;

b. calculating the product of the obtained p-values over all regions, to obtain the p-value for the coverage values of the regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells being drawn from the distribution of coverages for pregnancy without aneuploidy, and the p-value for the coverage values of the regions being drawn from the distribution of coverages for pregnancy with aneuploidy for the given chromosome;

c. from the obtained p-value products and the prior probabilities of fetal aneuploidy, calculating by Bayes' theorem the probabilities of aneuploidy or euploidy in the test sample.

4. The method according to claim 3, characterized in that a conclusion about the presence or absence of fetal aneuploidy is made if the probability of one diagnosis does not exceed a significance threshold from the interval 0.01-0.1 while the probability of the other exceeds it, the diagnosis being made according to the higher probability; if both values are above or both are below the significance threshold, no diagnosis is made.

5. A method of obtaining genome regions whose chromatin openness differs by at least 20% between the placenta and the mother's blood cells, for non-invasive prenatal diagnosis of fetal aneuploidy according to claim 1 from maternal blood cfDNA by sequencing, including:

a. obtaining maternal blood cfDNA sequencing data such that all candidate genome regions whose chromatin openness differs by at least 20% between the placenta and maternal blood cells are read, for blood samples of several pregnant women without fetal aneuploidy and several pregnant women with fetal aneuploidy for a specific chromosome;

b. mapping the reads to the human reference genome to determine their coordinates;

c. determining the coverage of each candidate region in each sample;

d. calculating, for each region, the significance, expressed as a p-value, of the difference in coverage between samples with a fetus without aneuploidy and samples with fetal aneuploidy for a specific chromosome, adjusted for the total coverage of the sample;

e. selecting from the candidate genome regions those with a p-value of not more than 0.1.

6. The method according to claim 5, characterized in that step d) is carried out under the assumption of a negative binomial distribution of region coverage in the sample, for example using the DESeq software for differential RNA expression analysis.

7. The method according to claim 5, characterized in that, for the regions found in step d), read coverage is calculated, analogously to steps b)-e), in blood samples of at least 5 men, and for each region the significance, expressed as a p-value, of the difference in DNA coverage between samples from men and from pregnant women with a fetus without aneuploidy is calculated, and regions with a p-value of not more than 0.1 are selected.