TW202242145A - Transcription factor binding site analysis of nucleosome depleted circulating cell free chromatin fragments - Google Patents


Publication number
TW202242145A
Authority
TW
Taiwan
Prior art keywords
dna
subject
binding agent
sample
disease
Prior art date
Application number
TW110149003A
Other languages
Chinese (zh)
Inventor
雅各 文森 米考夫
馬克 愛德華 愛寇斯頓
Original Assignee
比利時商比利時意志有限公司
Priority date
Filing date
Publication date
Application filed by 比利時商比利時意志有限公司
Publication of TW202242145A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/5308 Immunoassay; Biospecific binding assay; Materials therefor for analytes not provided for elsewhere, e.g. nucleic acids, uric acid, worms, mites
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57484 Immunoassay; Biospecific binding assay; Materials therefor for cancer involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites
    • G01N33/57488 Immunoassay; Biospecific binding assay; Materials therefor for cancer involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites involving compounds identifiable in body fluids
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6875 Nucleoproteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material

Abstract

The invention relates to methods for detecting disease in a subject by means of a minimally invasive body fluid test for non-nucleosomal cell free DNA fragments. The invention also relates to the measurement or detection of circulating cell free DNA fragments that include a transcription factor binding site sequence as an indicator of the presence of disease in a subject.

Description

Transcription factor binding site analysis of nucleosome-depleted circulating cell-free chromatin fragments

The present invention relates to a method for detecting disease in a subject by means of a minimally invasive blood test that detects transcription factor occupancy of cell-free DNA fragments.

Cancer is a common disease with a high mortality rate. The biology of the disease is understood to involve progression from a precancerous state through stage I, II and III cancer to, eventually, stage IV cancer. For most cancers, mortality varies widely depending on whether the disease is detected at an early, localized stage, when effective treatment options are available, or at a late stage, when it may have spread within or beyond the affected organ and is harder to treat. Late-stage cancer produces a wide variety of symptoms, including visible blood in the stool, blood in the urine, coughing up blood, vaginal bleeding, unexplained weight loss, persistent unexplained lumps (for example, in the breast), indigestion, difficulty swallowing, changes to warts or moles, and many other possible symptoms, depending on the type of cancer. However, most cancers diagnosed on the basis of such symptoms are already advanced and difficult to treat. In their early stages most cancers are asymptomatic or present only non-specific symptoms that do not aid diagnosis. Ideally, therefore, cancer should be detected early by means of cancer testing.

To meet the need for a simple routine blood test for cancer, many blood-borne proteins have been investigated as potential cancer biomarkers, including carcinoembryonic antigen (CEA) for colorectal cancer (CRC), alpha-fetoprotein (AFP) for liver cancer, CA125 for ovarian cancer, CA19-9 for pancreatic cancer, CA15-3 for breast cancer, and PSA for prostate cancer. However, their clinical accuracy is too low for routine diagnostic use, and they are considered better suited to patient monitoring.

More recently, workers in the field have investigated circulating tumor DNA (ctDNA) as a blood biomarker for cancer detection. Cell-free DNA (cfDNA) circulates in the blood as chromatin fragments, which are thought to originate from the death (mainly by apoptosis) of large numbers of cells each day. During apoptosis, chromatin is fragmented into mononucleosomes and oligonucleosomes, some of which are released from the cell and circulate as cell-free nucleosomes. Each circulating cell-free nucleosome is associated with a small DNA fragment less than 200 base pairs (bp) in length. Similarly, the presence in the circulation of cell-free chromatin fragments consisting of DNA bound to transcription factors or other non-histone chromatin proteins has been inferred from fragmentomics analysis. In healthy subjects, circulating chromatin fragments are thought to be of hematopoietic origin and are present at low levels. Elevated levels of circulating nucleosomes, and hence of cfDNA fragments, are found in subjects with a variety of conditions, including many cancers, autoimmune diseases, inflammatory diseases, stroke, and myocardial infarction (Holdenrieder & Stieber, 2009).

At least some of the cfDNA in the blood of cancer patients is thought to originate from dying or dead cancer cells that release nucleosomes and other chromatin fragments into the circulation (i.e., the cfDNA includes some ctDNA). Studies of matched blood and tissue samples from cancer patients show that cancer-associated mutations present in a patient's tumor (but not in his or her healthy cells) are also present in cfDNA in blood samples taken from the same patient (Newman et al., 2014). Similarly, DNA sequences that are differentially methylated in cancer cells (epigenetically altered through methylation of cytosine residues) can also be detected as methylated sequences in circulating cfDNA. Furthermore, the proportion of circulating cfDNA that consists of ctDNA is related to tumor burden, so disease progression can be monitored quantitatively by the proportion of ctDNA present and qualitatively by its genetic and/or epigenetic composition. Analysis of ctDNA can yield highly useful and clinically accurate data relating to DNA originating from all or many of the different clones within a tumor, and thus integrates the tumor clones spatially. In addition, repeated blood sampling over time is a more practical and economical option than, for example, repeated tissue biopsy. ctDNA analysis has the potential to revolutionize the detection and monitoring of tumors, as well as the early detection of relapse and acquired drug resistance, by allowing tumor DNA to be studied for treatment selection without invasive tissue biopsy procedures. Such ctDNA testing can be used to investigate all types of cancer-associated DNA abnormalities (for example, point mutations, nucleotide modification status, translocations, gene copy number, microsatellite abnormalities, and DNA strand integrity) and is applicable to routine cancer screening, to regular and more frequent monitoring, and to periodic review of optimal treatment regimens (Zhou et al., 2017).

Plasma is commonly used as the substrate for ctDNA testing. cfDNA fragments (including any ctDNA) are extracted from the plasma, and are thereby removed from any association with nucleosomes, transcription factors or other proteins, and their nucleotide base sequence is analyzed. Any DNA analysis method may be used, but analysis is typically performed by deep sequencing on a next-generation sequencing instrument.

Because DNA abnormalities are a feature of all cancers, and ctDNA has been observed in every cancer investigated, ctDNA testing is applicable to all cancers. Cancers studied include, but are not limited to, bladder cancer, breast cancer, colorectal cancer, melanoma, ovarian cancer, prostate cancer, lung and liver cancer, endometrial cancer, lymphoma, oral cancer, leukemia, head and neck cancer, and osteosarcoma (Crowley et al., 2013; Zhou et al., 2017; Jung et al., 2010).

One exemplary method of cfDNA analysis involves identifying the tissue or cell of origin of a subject's cfDNA fragments. The method rests on the premise that every cfDNA fragment present in the circulation escaped nuclease digestion, during cell death or in the circulation, because protein binding within a nucleosome protected it from nuclease action. The method involves determining the nucleosome fragmentation pattern of cfDNA in a blood sample taken from the subject and mapping the genomic locations of the cfDNA fragments in a reference genome. Fragmentation patterns differ between cell types and can be used to identify the cells of origin of the subject's cfDNA.

This method involves extracting cfDNA (including any ctDNA) from a plasma sample and performing whole-genome sequencing of the DNA to detect the pattern of nucleosome-bound DNA represented by the cfDNA fragments. The endpoint sequences of the cfDNA fragments are analyzed in silico, using bioinformatics, to map their genomic locations in one or more reference genomes. The genomic locations of the cfDNA endpoints within the reference genome provide a map of the nucleosome-protected cfDNA coverage of the genome.
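The endpoint-mapping step described above can be illustrated with a minimal sketch. It assumes that fragments have already been aligned and are available as half-open `(start, end)` intervals on a single chromosome; the function name `coverage_and_endpoints` and the toy coordinates are hypothetical and are not taken from the cited work.

```python
from collections import Counter


def coverage_and_endpoints(fragments, region_start, region_end):
    """Build a per-base coverage track and an endpoint tally for one region.

    fragments: iterable of (start, end) half-open intervals (assumed to be
    pre-aligned fragment coordinates on one chromosome).
    """
    coverage = [0] * (region_end - region_start)
    endpoints = Counter()
    for start, end in fragments:
        # Tally both fragment ends (first and last base) inside the region.
        for pos in (start, end - 1):
            if region_start <= pos < region_end:
                endpoints[pos] += 1
        # Add one unit of coverage to every base the fragment spans.
        for pos in range(max(start, region_start), min(end, region_end)):
            coverage[pos - region_start] += 1
    return coverage, endpoints


# Toy input: two overlapping fragments in a 10-base window.
cov, ends = coverage_and_endpoints([(0, 5), (3, 8)], 0, 10)
print(cov)
print(sorted(ends.items()))
```

Regions with high coverage and few endpoints correspond to protected DNA; clusters of endpoints mark the boundaries of protection.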

The proportional contributions of different cell types or tissues to a subject's cfDNA can also be determined by bioinformatic computer analysis, as described in WO2017012592, in which the subject's nucleosome fragmentation pattern is compared with calibration samples containing cfDNA from different cellular sources in known relative abundances.
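As a simplified illustration of estimating a proportional contribution, the sketch below fits a single mixing fraction between two reference profiles by least squares. This is an assumption made for illustration only and is not the procedure of WO2017012592; the function name `estimate_fraction` and the profile values are hypothetical.

```python
def estimate_fraction(mixed, source_a, source_b):
    """Least-squares estimate of f in: mixed ~ f*source_a + (1 - f)*source_b.

    All three arguments are equal-length lists of per-region signal values.
    The result is clipped to the physically meaningful range [0, 1].
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(mixed, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("reference profiles are identical; f is undefined")
    return min(1.0, max(0.0, num / den))


# Toy profiles: the mixed sample is 25% source A and 75% source B.
profile_a = [1.0, 0.0, 2.0, 4.0]
profile_b = [0.0, 2.0, 2.0, 0.0]
mixed = [0.25 * a + 0.75 * b for a, b in zip(profile_a, profile_b)]
print(estimate_fraction(mixed, profile_a, profile_b))
```

With more than two sources, the same idea generalizes to a constrained regression over several reference profiles.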

cfDNA fragments associated with nucleosome-containing chromatin fragments are typically 120-200 bp in length. However, protein binding and protection of cfDNA is not limited to histone binding in nucleosomes. Other cfDNA fragments, including active gene promoter sequences, are bound to transcription factors, cofactors or other non-histone chromatin proteins, either in addition to nucleosomes or in the absence of any nucleosome. In the absence of nucleosomes, these proteins typically bind and protect shorter cfDNA fragments in the range of 35-80 bp. However, these shorter cfDNA fragments are observed experimentally only if the DNA fragment library preparation method used is suitable for the isolation, amplification and sequencing of short DNA fragments less than 100 bp in length (Snyder et al., 2016).

The protein binding involved may be of different types. For example, some cfDNA sequences (including some inactive DNA sequences) are bound to histones in a nucleosomal conformation. cfDNA fragments associated with nucleosome-containing chromatin fragments are typically about 120-200 bp in length. Other cfDNA fragments (including active gene promoter sequences) are bound to transcription factors, cofactors or other chromatin proteins, which typically bind and protect shorter cfDNA fragments in the range of 35-80 bp. However, these shorter cfDNA fragments are observed experimentally only if the DNA fragment library preparation method used is suitable for the isolation, amplification and sequencing of short fragments.
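The two size ranges given above can be turned into a simple length-based tally. This is a minimal sketch: the cut-offs are the ranges stated in the text, while the class labels and function names are hypothetical, and fragment length alone is only a rough proxy for the type of protein protection.

```python
def classify_fragment(length_bp):
    """Assign a cfDNA fragment to a size class using the ranges in the text."""
    if 35 <= length_bp <= 80:
        return "non_nucleosomal"   # typical of transcription factor protection
    if 120 <= length_bp <= 200:
        return "nucleosomal"       # typical of nucleosome protection
    return "unclassified"


def summarize(lengths):
    """Count fragments per size class."""
    counts = {"non_nucleosomal": 0, "nucleosomal": 0, "unclassified": 0}
    for length_bp in lengths:
        counts[classify_fragment(length_bp)] += 1
    return counts


# Toy fragment lengths in base pairs.
print(summarize([40, 167, 100, 80, 200, 201]))
```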

Because different DNA sequences (including different promoter and gene sequences) are active in different cells, the genome-wide pattern of protein binding to DNA in living cells varies with cell type. The protein binding pattern of DNA in any cell type can be determined by nuclease-accessible site mapping, in which chromatin extracted from the cells is digested with a nuclease and the undigested DNA in the resulting protein-protected chromatin fragments is sequenced. Accordingly, if the cfDNA fragments in blood are regarded as the products of nuclease digestion in vivo, the cfDNA sequences found should correspond to the protein-bound DNA sequences in the cells from which the cfDNA originated. In principle, therefore, the pattern of cfDNA fragment sequences in blood should resemble the pattern of chromatin fragment sequences produced by nuclease-accessible site mapping of the cells of origin. Bioinformatic methods can thus be used to compare the fragmentation pattern of cfDNA sequences determined from a blood sample with known DNA fragmentation patterns produced by nuclease-accessible site analysis of cells of known tissue or cancer type, in order to determine the tissue of origin of the cfDNA. Results in samples taken from healthy subjects indicate that the cells of origin of the cfDNA are hematopoietic cells. Results of this approach in samples taken from cancer patients indicate that the cfDNA and ctDNA originate from a mixture of cells that includes hematopoietic and other cells. In many cases, the non-hematopoietic cell types indicated were associated with the tissue of the patient's cancer (Snyder et al., 2016).
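The comparison of a sample's fragmentation pattern with reference patterns can be sketched as a correlation ranking. The example below is a minimal illustration, assuming each pattern has been reduced to a list of per-region values; Pearson correlation is used here as a stand-in similarity measure, and the reference names and values are hypothetical rather than drawn from the cited study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no pattern to compare
    return cov / (sx * sy)


def rank_references(sample, references):
    """Return (name, correlation) pairs, best match first."""
    scores = {name: pearson(sample, profile) for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy profiles for two hypothetical reference cell types.
references = {"tissue_a": [2, 4, 6, 8], "tissue_b": [4, 3, 2, 1]}
print(rank_references([1, 2, 3, 4], references))
```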

Other workers have used a similar analysis of cfDNA fragment endpoints, but have focused the bioinformatic computer analysis on transcription factor binding site (TFBS) sequences. The aim of this approach is to determine TFBS accessibility and to identify TFBS DNA sequences with altered accessibility in plasma samples taken from cancer patients (Ulz et al., 2019). In this approach, a plasma sample is taken from the subject, and cfDNA is extracted and amplified using a DNA library preparation method suitable for small DNA fragments less than 100 bp in length. The DNA library is sequenced using next-generation sequencing methods. The sequencing data are used to identify, by bioinformatic methods, cfDNA fragmentation patterns in genomic regions near TFBSs. The analysis involves determining the nucleosome positioning profile of cfDNA fragments over the TFBS and its flanking sequences in gene promoter sequences, in order to determine whether the TFBS is bound to a transcription factor in the cfDNA-containing chromatin fragments. The method is complex, but can be summarized as follows:

If the cfDNA fragmentation pattern observed in the DNA sequence spanning the TFBS and its genomic flanking sequences shows a periodicity of approximately 200 bp, this reflects an alternation between stronger protein-binding protection of the DNA against degradation (at the center of each nucleosome binding position) and weaker protection (between nucleosomes, where the DNA is unbound and unprotected). In this case, the TFBS and flanking sequences are inferred to be covered by nucleosomes in the chromatin fragments that contain the cfDNA in the plasma sample.

If the cfDNA fragmentation pattern present shows protein-binding protection of the TFBS and its flanking sequences, but no (or attenuated) nucleosome-associated periodicity, this reflects binding of transcriptional regulatory proteins to the TFBS and its flanking sequences. In this case, the TFBS is inferred to be bound to one or more transcription factors and/or other regulatory proteins in the chromatin fragments that contain the cfDNA in the plasma sample.
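The distinction between the two cases above turns on whether a roughly 200 bp periodicity is present in the coverage profile around a TFBS. A minimal sketch of one way to test for such periodicity is a lagged correlation scan, shown below on a synthetic signal. This is an illustrative assumption and not the analysis pipeline of Ulz et al.; the function names, the lag window and the synthetic 147 bp protected / 53 bp linker pattern are all hypothetical.

```python
from math import sqrt


def lagged_correlation(signal, lag):
    """Pearson correlation between the signal and a copy of itself shifted by `lag`."""
    x, y = signal[:-lag], signal[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def dominant_period(signal, min_lag, max_lag):
    """Return (lag, correlation) for the best-scoring lag in the window."""
    scores = {lag: lagged_correlation(signal, lag) for lag in range(min_lag, max_lag + 1)}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic nucleosome-like track: 147 protected bases, then 53 unprotected, repeated.
periodic = [1 if (i % 200) < 147 else 0 for i in range(2000)]
# Synthetic flat track: uniform protection with no periodic component.
flat = [1] * 2000

print(dominant_period(periodic, 120, 280))
print(dominant_period(flat, 120, 280))
```

A strong peak near 200 bp would be read as nucleosome coverage, whereas protection without such a peak would point to binding by other proteins, as described above. Real coverage tracks are noisy, so any decision threshold would need to be calibrated empirically.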

In healthy subjects, the cfDNA fragmentation patterns found generally correlate with the patterns obtained from nuclease-accessible site experiments on hematopoietic cells. Thus, TFBS sequences that are bound to transcription factors in cfDNA correspond to transcription factors expressed in hematopoietic cells, and TFBS sequences that are covered by nucleosomes correspond to transcription factors not expressed in those cells. In cancer patients, the pattern reflects a mixture of cell types, in which a TFBS may be bound to a transcription factor in the cancer cell type and to a nucleosome in hematopoietic cell types. However, fragmentomics bioinformatic methods have been developed to separate the small transcription factor-protected TFBS fragment signal present in ctDNA from the larger, superimposed nucleosome periodicity signal present in the cfDNA component of hematopoietic origin. Fragmentomics analysis indicates that the mixed pattern contains cfDNA TFBS sequences bound to transcription factors that are not expressed in hematopoietic cells but are expressed by cancer tissue.

We have previously described immunoassay tests for circulating cell-free nucleosomes containing particular epigenetic signals, including particular post-translational modifications, histone isoforms, modified nucleotides and non-histone chromatin proteins, for the detection of cancer and other diseases (as described in WO2005019826, WO2013030577, WO2013030579 and WO2013084002). We have also described immunoassay tests for chromatin fragments that include transcription factor-bound DNA for the detection of cancer (as described in WO2017162755).

We now report improved methods for the analysis of circulating cell-free TFBS DNA sequences in cfDNA from which the background periodic nucleosome signal has been removed. These methods are suitable for use with body fluid samples as non-invasive or minimally invasive tests for diseases including cancer, autoimmune diseases and inflammatory diseases.

According to a first aspect of the invention, there is provided a method of detecting a cell-free DNA chromatin fragment in a body fluid sample obtained from a human or animal subject, wherein the cell-free DNA chromatin fragment comprises all or part of a transcription factor binding site sequence, optionally together with flanking sequences, the method comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing DNA from the body fluid sample that is not bound to the binding agent in step (i).

According to a second aspect of the invention, there is provided a method of detecting a cell-free DNA chromatin fragmentation pattern in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing DNA from the body fluid sample that is not bound to the binding agent in step (i).

According to a further aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds to nucleosomes;
(ii) isolating DNA that is not bound to the binding agent in step (i);
(iii) optionally amplifying the isolated DNA;
(iv) determining the sequence of the DNA; and
(v) using a transcription factor binding site DNA sequence, and optionally flanking DNA sequences, present in the DNA as a biomarker to determine the presence and/or nature of the disease in the subject.

According to a further aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds to nucleosomes;
(ii) isolating DNA that is not bound to the binding agent in step (i);
(iii) optionally amplifying the isolated DNA;
(iv) detecting the DNA; and
(v) using the DNA level and/or DNA sequence and/or DNA fragmentation pattern detected in step (iv) as an indicator of the presence and/or nature of the disease in the subject.

According to a further aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds to nucleosomes;
(ii) isolating DNA that is not bound to the binding agent in step (i);
(iii) detecting the isolated DNA by a hybridization method; and
(iv) using the presence or amount of hybridized DNA as an indicator of the presence and/or nature of the disease in the subject.

According to a further aspect of the invention, there is provided a method of detecting or diagnosing a disease in an animal or human subject, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and
(iii) using the DNA level and/or DNA sequence and/or DNA fragmentation pattern detected in step (ii) to identify the disease state of the subject.

According to a further aspect of the invention, there is provided a method of assessing the suitability of an animal or human subject for a medical treatment, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and
(iii) using the DNA level and/or DNA sequence and/or DNA fragmentation pattern detected in step (ii) as a parameter for selecting a suitable treatment for the subject.

According to a further aspect of the invention, there is provided a method of monitoring the treatment of an animal or human subject, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample;
(iii) on one or more further occasions, repeating the removal of nucleosomes from a body fluid sample obtained from the subject and the detection, analysis or measurement of DNA associated with cell-free chromatin fragments in the remaining sample; and
(iv) using any change in the DNA level and/or DNA sequence and/or DNA fragmentation pattern detected in step (iii), compared with step (ii), as a parameter for any change in the condition of the subject.

According to a further aspect of the invention, there is provided a kit for detecting cfDNA fragment sequences, comprising a nucleosome binding agent and reagents for the amplification and/or sequencing and/or fragmentation pattern analysis of DNA associated with said cfDNA sequences, optionally together with instructions for use of the kit in the methods described herein.

According to a further aspect of the invention, there is provided a method of treating a disease in a subject in need thereof, wherein the method comprises the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds to nucleosomes;
(ii) detecting or measuring DNA fragments that are not bound to the binding agent in step (i);
(iii) using the presence, sequence, amount or fragmentation pattern of the DNA fragments as an indicator of the presence of the disease in the subject; and
(iv) administering a treatment if the subject is determined in step (iii) to have the disease.

According to a further aspect of the invention, there is provided a method of detecting a fetal disease state in a body fluid sample taken from a pregnant human or animal subject, comprising the steps of:
(i) contacting the maternal body fluid sample with a binding agent that binds to nucleosomes;
(ii) analyzing DNA that is not bound to the binding agent in step (i); and
(iii) using the presence, amount, sequence and/or fragmentation pattern of the DNA as an indicator of the fetal disease state in the subject.

Transcription factors are implicated in cancer and account for about 20% of all known oncogenes (Lambert et al, 2018). We have previously described the use of chromatin fragments containing tissue-specific transcription factors as biomarkers in serum to detect or diagnose cancer in a subject. The tissue specificity of a transcription factor can be used to indicate the tissue in which a cancer originated. For example, the transcription factor TTF-1 is reported to be expressed in thyroid and lung tissue but not in other tissues. The presence of circulating chromatin fragments containing TTF-1 therefore indicates that the tissue of origin is lung or thyroid. We have also described immunoassay methods for measuring circulating cell-free chromatin fragments that contain transcription factors. Such immunoassays use a double antibody (or other binding agent) approach, in which one binding agent binds to the transcription factor and the other binds either to DNA associated with the transcription factor or to a nucleosome component contained in the chromatin fragment. In one embodiment described, a binding agent directed to the transcription factor is immobilized on a solid phase to isolate chromatin fragments containing the transcription factor (i.e. the chromatin fragments are immunoprecipitated). The isolated chromatin fragments are then detected using a second binding agent that binds to DNA. This immunoassay approach is simple, low cost and non-invasive. We now report the use of an improved cfDNA analysis method to detect disease. The principle of the method is to remove nucleosome-containing chromatin fragments from a body fluid sample before analyzing the cfDNA fragments associated with the remaining chromatin fragments. In this way, the nucleosomal component of the cfDNA fragmentation pattern is removed from the sample, leaving small cfDNA fragments that do not include nucleosomes. The presence of a TFBS sequence in cfDNA after nucleosome removal indicates that the sequence was protected by binding to the transcription factor in question and/or to other regulatory proteins (and was not bound to a nucleosome). This approach to TFBS profile analysis does not require identification of cfDNA fragment end-points and/or their genomic locations, nor complex bioinformatic methods for disentangling the mixed fragmentomic signals of nucleosome and transcription factor binding, and it facilitates ctDNA testing methods that were not previously possible.

The total cfDNA fragmentation pattern of a sample is formed by all of the chromatin fragments present in the sample, both those that contain nucleosomes and those that do not. The chromatin fragments of primary interest in the present invention are those that do not contain nucleosomes. The invention is therefore primarily concerned with non-nucleosomal cfDNA fragments.

The basic principle of the invention is the detection, after removal of nucleosomes, of cfDNA regulatory sequences that are bound to regulatory proteins in the sample, for example TFBS sequences bound to transcription factors. A TFBS may be bound by a transcription factor that is expressed at elevated levels in the cells of a diseased tissue, but not be bound by that transcription factor in hematopoietic tissue, where it is instead bound by a nucleosome. Chromatin fragments containing such a transcription-factor-bound TFBS sequence are therefore likely to originate from cells of the diseased tissue, in which the associated gene is active. Conversely, the same TFBS sequence will be bound to a nucleosome in chromatin fragments of hematopoietic origin, in which the gene is inactive. Removing nucleosome-bound cfDNA fragments from the sample therefore leaves the transcription-factor-occupied TFBS cfDNA fragments in place. The presence or amount of the TFBS sequence (optionally with flanking sequence) in the remaining cfDNA is sufficient to establish that the TFBS is transcription factor bound in the sample, without any need to identify fragment end-point sequences or their genomic locations, or to perform complex determination and interpretation of the periodicity of nucleosome binding strength. In addition, removing most of the total chromatin fragments lowers the background, so that the TFBS sequence (optionally with flanking sequence) can be detected more easily in the remaining cfDNA.
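The idea that the presence or amount of a TFBS sequence in the remaining cfDNA is itself the readout can be sketched as a simple motif count over fragment sequences. This is a minimal illustration under stated assumptions, not the method of the specification: the fragment sequences and the motif `GGGAC` are made-up placeholders, and the function names are hypothetical. The motif is written in IUPAC nucleotide codes and searched on both strands.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of an upper-case IUPAC sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_regex(motif):
    """Compile an IUPAC motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def count_motif_fragments(fragments, motif):
    """Count fragments that contain the motif on either strand."""
    forward = motif_regex(motif)
    reverse = motif_regex(reverse_complement(motif.upper()))
    return sum(
        1 for frag in fragments
        if forward.search(frag.upper()) or reverse.search(frag.upper())
    )


# Placeholder sequences and motif, for illustration only.
fragments = ["AAGGGACAA", "AAGTCCCAA", "AAAAAAAAA"]
print(count_motif_fragments(fragments, "GGGAC"))
```

A real assay would more likely use position weight matrices and alignment to a reference genome; exact motif matching is used here only to keep the sketch short.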

Before DNA analysis, the method removes nucleosomes of healthy hematopoietic cell origin at all positions genome-wide, and with them the periodic cfDNA fragmentation pattern that those nucleosomes produce. After nucleosome removal, the remaining cfDNA fragments will be those containing sequences that are bound by non-histone proteins in diseased cells, for example TFBS sequences bound by one or more transcription factors. This is useful because many transcription factors that are expressed in cancer cells and other diseased cells are not expressed in hematopoietic cells, so the presence of their binding sequences in cfDNA from which nucleosomes have previously been removed indicates the tissue or cell of disease origin of the cfDNA. For example, if a transcription factor that is expressed in cancer cells but not in hematopoietic cells is selected, together with its corresponding transcription factor binding site, then any cfDNA fragment detected in a patient sample that contains all or part of the TFBS sequence (optionally with flanking sequence) is an indicator of the presence of cancer in the patient, because fragments derived from healthy hematopoietic cells that contain all or part of the same TFBS and flanking sequence were covered by nucleosomes and have been removed.

The method has the following advantages: (i) higher analytical sensitivity for the detection of transcription-factor-bound cfDNA fragments; (ii) higher analytical sensitivity for disease-derived cfDNA fragmentation patterns; (iii) avoidance of complex bioinformatic analysis of mixed signals from mixed cellular origins; (iv) removal of most of the sequencing requirement (that for the removed nucleosomes), which makes the method more suitable for routine clinical use, for example by amplifying TFBS sequences with PCR primers rather than using next-generation whole-genome sequencing; (v) reduced sequencing cost; and, importantly, (vi) improved clinical accuracy and utility of the method.

The methods of the invention involve isolating or removing nucleosome-bound cfDNA fragments before identifying TFBS sequences in the remaining cfDNA. This is achieved by immunoprecipitating all or most of the nucleosomes in the body fluid sample prior to extraction and/or amplification and/or sequencing of cfDNA. The immunoprecipitation can be performed with any nucleosome binding agent, including anti-nucleosome antibodies or other nucleosome binding agents such as those described in WO2021038010.

We have developed immunoprecipitation methods that remove all or most nucleosomes (of both healthy and diseased cell origin) from a body fluid sample before the cfDNA remaining in the sample is extracted and/or amplified and/or sequenced. The immunoprecipitation can be achieved using anti-nucleosome antibodies that bind to nucleosomes per se, or to all or most nucleosomes. We have developed such a separation method, which uses magnetic beads linked to anti-nucleosome antibodies, and have shown quantitative removal of nucleosomes from plasma samples.

Thus, according to a first aspect of the invention, there is provided a method of detecting, in a body fluid sample obtained from a human or animal subject, a cell-free DNA chromatin fragment comprising all or part of a transcription factor binding site (TFBS) (or other non-histone binding site) sequence, optionally including flanking sequence, the method comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing DNA in the body fluid sample that is not bound to the binding agent in step (i).

In a second aspect of the invention, the cfDNA fragmentation pattern in a body fluid sample taken from a subject can be analyzed, for example to detect disease and to identify the affected cells or tissue. Prior removal of nucleosomes from the sample facilitates analysis of the cfDNA fragmentation pattern around active transcription factor binding sites by eliminating interference from the nucleosome fragmentation pattern. Thus, according to the second aspect of the invention, there is provided a method of detecting a cell-free DNA chromatin fragmentation pattern in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing DNA from the body fluid sample that is not bound to the binding agent in step (i).

In one embodiment, the detected chromatin fragmentation pattern can be compared, for example by bioinformatic methods, with known DNA/chromatin fragmentation patterns (i.e. reference fragmentation patterns). Known reference fragmentation patterns can be generated by Nuclease Accessible Site analysis of cells of a known tissue or cancer type. The comparison can be used to determine the tissue of origin of the cfDNA.
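A comparison of a detected pattern against reference patterns might be sketched as follows. The sketch makes a strong simplifying assumption: a "fragmentation pattern" is reduced to a normalized histogram of fragment lengths, whereas a real pattern would also involve genomic positions and fragment end-points. Cosine similarity is a choice made here for illustration, and the reference names and length values are hypothetical.

```python
from collections import Counter
from math import sqrt


def length_profile(lengths, min_len=20, max_len=200):
    """Normalized histogram of fragment lengths between min_len and max_len."""
    counts = Counter(n for n in lengths if min_len <= n <= max_len)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * (max_len - min_len + 1)
    return [counts.get(n, 0) / total for n in range(min_len, max_len + 1)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_reference(sample_lengths, references):
    """Return the best-matching reference name and all similarity scores."""
    sample = length_profile(sample_lengths)
    scores = {
        name: cosine_similarity(sample, length_profile(ref_lengths))
        for name, ref_lengths in references.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical reference data, for illustration only.
references = {
    "reference_a": [40, 45, 50, 50, 60],
    "reference_b": [150, 160, 167, 167, 170],
}
best, scores = closest_reference([45, 50, 50, 55], references)
print(best, scores)
```

Any similarity measure or classifier could be substituted for the cosine score; the point is only that the sample profile is scored against each reference and the closest match is reported.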

In another embodiment, the detected chromatin fragmentation pattern can be compared (for example by bioinformatic methods) with known DNA/chromatin fragmentation patterns generated in previous studies of subjects with a known disease status (for example healthy subjects or patients with a known cancer). The comparison can be used to determine the disease status of the subject.

Thus, in another aspect of the invention, there is provided a cfDNA fragment in a body fluid, which is not bound to a nucleosome and which has a TFBS sequence, optionally including flanking sequence, as a biomarker for a disease.

In one embodiment, there is provided a plurality of cfDNA fragments in a body fluid that are not bound to nucleosomes and that comprise a combination or pattern of TFBS sequences, optionally including flanking sequences, for use together as a biomarker for a disease.

It will be clear to those skilled in the art that removal of nucleosomes derived from healthy and/or hematopoietic cells or tissues may be sufficient for the purposes of the invention. It is known in the art that cell-free nucleosomes from diseased or fetal cells or tissues are associated with DNA fragments of about 147 bp in length. These nucleosomes do not contain linker DNA. In contrast, cell-free nucleosomes from healthy and/or hematopoietic cells or tissues are associated with longer DNA fragments of about 167 bp, which do contain linker DNA. Surprisingly, cell-free nucleosomes associated with the longer DNA fragment size (containing linker DNA) can be isolated. We have previously demonstrated this using nucleosome binding agents that bind to nucleosomes containing linker DNA (associated cfDNA fragment size of about 167 bp) but do not bind to cell-free nucleosomes lacking linker DNA (associated cfDNA fragment size of about 147 bp). These binding agents can be used to immunoprecipitate nucleosomes of healthy cell origin containing 167 bp cfDNA fragments, while leaving in solution the nucleosomes of diseased or fetal origin, which are associated with smaller DNA fragments of about 147 bp and contain no linker DNA (as described in WO2021038010).

Thus, in one embodiment, the binding agent binds to nucleosomes containing linker DNA.

In one embodiment, there is provided a method of detecting, in a body fluid sample obtained from a human or animal subject, a cell-free DNA fragment comprising all or part of a TFBS (or other non-histone binding site) sequence, optionally including flanking sequence, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes containing linker DNA; and
(ii) analyzing DNA in the body fluid sample that is not bound to the binding agent in step (i).

In another embodiment, there is provided a method of detecting a cell-free DNA fragmentation pattern in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes containing linker DNA; and
(ii) analyzing DNA in the body fluid sample that is not bound to the binding agent in step (i).

In a preferred embodiment, the binding agent that binds to nucleosomes containing linker DNA is all or part of a histone H1 moiety or of a chromatin binding protein, including but not limited to: a Chromodomain Helicase DNA Binding (CHD) protein, a DNA (cytosine-5)-methyltransferase (DNMT) protein, a High Mobility Group or High Mobility Group Box (HMG or HMGB) protein, a poly [ADP-ribose] polymerase (PARP) protein, or a protein containing a methyl-CpG-binding domain (MBD), such as MECP2. In one embodiment, the binding agent binds to histone H1 or a component thereof. In another preferred embodiment, the binding agent is attached to a solid support, or is precipitated, so that the bound nucleosomes can be removed from the sample (i.e. the portion of the sample not bound to the binding agent is collected and the associated DNA is analyzed as described herein).

As described above, the invention facilitates identification of regulatory DNA sequences that are bound by regulatory proteins in a sample, based on the presence of those sequences in cfDNA after removal of nucleosomes. Thus, according to one embodiment of the invention, there is provided a method of detecting a regulatory DNA sequence (optionally including flanking sequence) bound to a regulatory protein in cell-free DNA in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing the DNA not bound to the binding agent in step (i) to detect the regulatory sequence (optionally including flanking sequence).

DNA analysis methods may involve DNA isolation and amplification. Thus, in one embodiment, there is provided a method of detecting a cell-free DNA chromatin fragmentation pattern in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes;
(ii) extracting DNA from the body fluid sample that is not bound to the binding agent in step (i); and
(iii) analyzing the extracted DNA to detect the chromatin fragmentation pattern.

In one embodiment, analysis of the associated DNA involves identifying the presence of cfDNA fragments comprising a transcription factor binding site (TFBS) sequence and/or flanking sequence. In another preferred embodiment, the binding agent is attached to a solid support, or is precipitated, so that it and the nucleosomes attached to it can be removed from the sample.

The DNA sequences in the nucleosome-depleted cfDNA sample can be analyzed by any method known in the art. In a preferred embodiment, a cfDNA library generated by ligating adapter oligonucleotides to the DNA fragments is amplified by PCR. The adapter oligonucleotides may include primer sequences to facilitate amplification of the library by PCR.

Thus, in one embodiment of the invention, there is provided a method of detecting, in a body fluid sample obtained from a human or animal subject, a cell-free DNA chromatin fragment comprising all or part of a TFBS (or other non-histone binding site) sequence, optionally including flanking sequence, the method comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes;
(ii) isolating DNA fragments not bound to the binding agent in step (i);
(iii) ligating adapter oligonucleotides to the DNA fragments isolated in step (ii);
(iv) amplifying the DNA fragments; and
(v) detecting all or part of a TFBS (or other non-histone binding site) sequence, optionally including flanking sequence, in the amplified DNA.

In another embodiment of the invention, there is provided a method of detecting a cell-free DNA fragmentation pattern in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes;
(ii) isolating DNA fragments not bound to the binding agent in step (i);
(iii) ligating adapter oligonucleotides to the DNA fragments isolated in step (ii);
(iv) amplifying the DNA fragments;
(v) sequencing the DNA fragments; and
(vi) detecting the cell-free DNA fragmentation pattern.

In other embodiments, PCR primers are used for DNA amplification. Degenerate primers can be designed to amplify all of the DNA sequences isolated in step (ii), or specific primers can be designed, using software known in the art, to amplify particular DNA sequences associated with the TFBS of a transcription factor, optionally including flanking regions. Using sequence-specific primers means that cfDNA can be analyzed for any particular TFBS sequence, optionally including flanking sequence, without sequencing the entire cfDNA library.

Thus, in one embodiment of the invention, there is provided a method of detecting, in a body fluid sample obtained from a human or animal subject, an in vivo cell-free DNA fragment comprising all or part of a TFBS (or other non-histone binding site) sequence, optionally including flanking sequence, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes;
(ii) isolating DNA not bound to the binding agent in step (i);
(iii) amplifying the isolated DNA by PCR using sequence-specific primers;
(iv) detecting the amplified DNA; and
(v) using the presence or amount of the amplified DNA as an indicator of the presence in the sample of a cfDNA fragment comprising all or part of the TFBS (or other non-histone binding site) sequence, optionally including flanking sequence.

A common way of identifying DNA fragments of a selected sequence is by hybridization of the DNA to a complementary DNA sequence. Thus, in another aspect of the invention, there is provided a method of detecting, in a body fluid sample obtained from a human or animal subject, a cell-free DNA fragment comprising all or part of a TFBS (or other non-histone binding site) sequence, optionally including flanking sequence, the method comprising the steps of:
(i) contacting a body fluid sample obtained from a human or animal subject with a binding agent that binds to nucleosomes;
(ii) isolating DNA not bound to the binding agent in step (i);
(iii) optionally amplifying the DNA isolated in step (ii);
(iv) detecting the DNA by hybridization; and
(v) using the presence or amount of hybridized DNA as an indicator of the presence in the sample of a cfDNA fragment comprising all or part of the TFBS (or other non-histone binding site) sequence, optionally including flanking sequence.

The invention also provides a method of enriching or purifying transcription-factor-protected TFBS sequences in the cfDNA of a body fluid sample by removing nucleosomal cfDNA prior to cfDNA analysis.

In one embodiment of the invention, there is provided a method of detecting cfDNA sequences protected by a transcription factor (or other non-histone protein), and/or flanking sequences, in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing the cfDNA fragments not bound to the binding agent in step (i) for the presence of DNA sequences found in a TFBS (or other non-histone binding sequence) and/or flanking sequences.

It will be understood that any non-histone protein that binds to DNA in chromatin may be suitable for use in the methods of the invention. This includes transcription factors as well as other non-histone chromatin proteins, including chromatin modifying proteins, genetic and epigenetic reader, writer and eraser proteins, proteins involved in RNA transcription (e.g. RNA polymerase molecules) and architectural or structural chromatin proteins (e.g. DNA bending proteins).

In one embodiment of the invention, there is provided a method of detecting DNA sequences protected by non-histone proteins in a body fluid sample obtained from a human or animal subject, comprising the steps of:
(i) contacting the body fluid sample with a binding agent that binds to nucleosomes; and
(ii) analyzing, measuring or sequencing the cfDNA fragments not bound to the binding agent in step (i).

In a preferred embodiment, the binding agent is an antibody that binds to nucleosomes or a component thereof, or is a chromatin protein that binds to nucleosomes.

In a preferred embodiment, the binding agent is attached, directly or indirectly (for example through a streptavidin/biotin linker system), to a solid phase such as plastic, magnetic plastic, dextran, agarose or another solid support known in the art. In other embodiments, the binding agent is added in liquid form, and the bound nucleosomes are precipitated by cross-linking and the use of polyethylene glycol (PEG) and can then be separated as a solid-phase precipitate (for example by centrifugation or filtration). Many immunoprecipitation methods are known in the art, and any such method can be used in the methods of the invention.

Compared with prior methods described in the literature, the method of the invention improves the analytical sensitivity for transcription-factor-occupied TFBS sequences by reducing the competing background signal against which cfDNA fragmentation patterns at or near TFBS sequences and flanking sequences are detected. This is because disease-derived cfDNA fragmentation patterns near TFBS sequences can be difficult to detect when they are masked by nucleosome fragmentation patterns derived from healthy hematopoietic cells. The improvement in analytical sensitivity matters because some circulating cfDNA fragments containing TFBS sequences may be present at low levels, close to or below the detection limit of fragment end-point analysis and other methods known in the art.

The method of the invention also provides improved specificity for the tissue of origin of cfDNA over prior methods described in the literature, through improved detection of transcription factor occupancy in cfDNA at or near TFBS sequences and flanking sequences. It does so in two ways: (i) by facilitating simultaneous analysis of multiple TFBSs, and (ii) because a single transcription factor can regulate different genes in different cells by binding to different DNA sequences in different gene promoters in the genome. The presence in cfDNA of a TFBS together with its flanking sequences therefore indicates the cell type of origin. For example, the transcription factor TTF-1, together with different cofactors and other transcription factors, binds to different promoter sequences of different genes in different tissues, as shown in Figure 1.

Gene expression is regulated by the specific binding of transcription factors to short TFBS DNA sequences, also known as response elements or binding motifs. The binding site is usually, but not necessarily, located in the gene promoter region near the transcription start site of the regulated gene. Transcription factors bind to DNA in a sequence-specific manner through a DNA Binding Domain (DBD). Typically, a TFBS sequence within the promoter of its target gene is 5-15 bp in length, and a transcription factor protein can usually bind to a set of similar DNA sequences with varying degrees of binding affinity. The length of the DNA fragment associated with a circulating chromatin fragment that contains a transcription factor will vary according to whether the fragment also includes other protected DNA sequences bound by further transcription factors, cofactors, nucleosomes or other chromatin proteins. Many such chromatin fragments are reported to contain cfDNA fragments in the range 35-80 bp (Snyder et al, 2016). We also note that this size range is similar to that of the chromatin fragments produced by nuclease digestion of chromatin extracted from the cells of cancer patients (Corces et al, 2018). We conclude that these 35-80 bp cfDNA fragments are longer than typical DNA response elements and therefore include flanking DNA sequence. However, the DNA fragments associated with nucleosomes typically exceed 100 bp. We therefore conclude that cfDNA fragments shorter than 100 bp do not include intact nucleosomal DNA fragments. It is this pool of chromatin fragments, made up of transcription factors and other DNA-binding chromatin proteins, containing no nucleosomes and associated with cfDNA fragments in the 35-80 bp size range, that the invention primarily addresses, with all or most cell-free nucleosomes being removed from the sample regardless of their linker DNA composition or tissue of origin.
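The size ranges discussed above (35-80 bp for the fragments of interest, more than 100 bp for nucleosome-associated DNA) can be turned into a minimal length-binning sketch. The bin labels, function names and example lengths are hypothetical and serve only as an illustration.

```python
from collections import Counter

TF_RANGE = (35, 80)    # bp, size range reported for the fragments of interest
NUCLEOSOME_MIN = 100   # bp, nucleosome-associated DNA typically exceeds this


def classify_fragment(length):
    """Assign a fragment length (bp) to one of three illustrative bins."""
    if TF_RANGE[0] <= length <= TF_RANGE[1]:
        return "short_35_80"
    if length > NUCLEOSOME_MIN:
        return "over_100"
    return "other"


def summarize(lengths):
    """Count how many fragment lengths fall into each bin."""
    return Counter(classify_fragment(n) for n in lengths)


# Hypothetical fragment lengths, for illustration only.
print(summarize([40, 75, 80, 90, 147, 167, 20]))
```

Length alone does not establish that a fragment is non-nucleosomal, which is why the method removes nucleosomes rather than relying on size selection.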

It has been reported that most, or at least a large proportion, of short cfDNA fragments less than 100 bp in length do not originate from chromatin fragments containing regulatory proteins, but from nucleosome-associated DNA in which one or both strands are nicked or broken. In this case, short cfDNA fragments may represent, for example, a 150 bp nucleosome-associated DNA fragment that has been cut at one or more positions to yield two or more smaller cfDNA fragments (for example, two 75 bp fragments) rather than a single 150 bp cfDNA fragment (Sanchez et al, 2018). The method of the invention therefore has the additional advantage of removing short cfDNA fragments of less than 100 bp that originate from nicked nucleosome-associated DNA. This further reduces the background of nucleosome-associated cfDNA signal in the sample and so increases the sensitivity of the method for cfDNA fragments associated with sequences bound by transcription factors (or other non-histone proteins).

The method of the invention removes nucleosomal DNA whether it is intact or nicked. It is therefore superior to existing methods in the art that separate (isolated) DNA fragments by size: those methods are expensive and impractical for high-throughput applications, and they cannot remove short cfDNA fragments derived from the nicked DNA of cell-free nucleosomes.

Embodiments of the invention that remove all or most nucleosomes address cfDNA fragments of disease origin regardless of whether the size of the associated DNA fragment is typical of nucleosome-associated DNA.

Embodiments of the invention that remove nucleosomes containing linker DNA primarily address cfDNA fragments shorter than 147 bp.

The response element of a transcription factor may recur at many locations in the genome, and for some transcription factors at thousands of locations. The same transcription factor may therefore be bound at many locations within the chromatin of a cell. In principle, this means that the death of a single cell may release a large number of circulating chromatin fragments containing the same transcription factor.

In addition, transcription factors often do not act alone but cooperate with other transcription factors, cofactors or other moieties required to regulate a particular gene. A transcription factor can therefore bind response elements in the promoters of a large number of different genes, in each case working together with different transcription factors. As a result, for one and the same transcription factor, the DNA sequences flanking the same or similar TFBS sequence or response element differ between the promoters of different genes, because they contain the binding motifs of different combinations of transcription factors. This applies to all or most transcription factors.

In addition, the binding sequence of the response element itself can be degenerate, so a transcription factor can bind a variety of different motif sequences. For example, the transcription factor TTF-1 is expressed in a tissue-specific manner in healthy lung and healthy thyroid tissue. In the lung, two TTF-1 proteins bind to the promoter region of the lung-specific Surfactant Protein B (SPB) gene. The DNA binding sequence, or binding motif, of TTF-1 in the SPB promoter is GCNCTNNAG (SEQ ID NO: 1), where A, C, G and T denote the DNA bases adenine, cytosine, guanine and thymine, respectively, and N denotes any of these bases. The broader consensus promoter DNA sequence surrounding the TTF-1 binding sites is (-118)GATCAAGCACCTGGAGGGCTCTTCAGAGCAAAGACAAACACTGAGGTCGCTGCCA(-64) (SEQ ID NO: 2), where the numbers in parentheses give the distance in base pairs from the SPB transcription start site. In the SPB promoter in lung tissue, TTF-1 binds together with the transcription factor Hepatocyte Nuclear Factor 3 (HNF3), as shown in Figure 1 (Matys et al, 2006 and Bohinski et al, 1994).
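Matching a degenerate motif against a promoter sequence can be sketched in a few lines of Python. The example below is an illustration under stated assumptions rather than the method of the invention: the function name is hypothetical, only the forward strand is searched, and only the symbols A, C, G, T and N are supported. The motif and the promoter sequence are copied from SEQ ID NO: 1 and SEQ ID NO: 2 above.

```python
import re

# Only the symbols used in SEQ ID NO: 1 are handled (assumption: no other IUPAC codes).
SYMBOL_TO_REGEX = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

TTF1_MOTIF = "GCNCTNNAG"  # SEQ ID NO: 1
SPB_PROMOTER = "GATCAAGCACCTGGAGGGCTCTTCAGAGCAAAGACAAACACTGAGGTCGCTGCCA"  # SEQ ID NO: 2


def find_motif(sequence: str, motif: str):
    """Return (0-based start, matched text) for every forward-strand motif match.

    A lookahead is used so that overlapping matches are also reported.
    """
    body = "".join(SYMBOL_TO_REGEX[symbol] for symbol in motif.upper())
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


print(find_motif(SPB_PROMOTER, TTF1_MOTIF))  # prints [(17, 'GCTCTTCAG')]
```

A fuller search would also scan the reverse complement of the sequence, since a binding site may lie on either strand.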

In the thyroid, TTF-1 regulates a number of genes, including those for thyroglobulin, thyroid stimulating hormone receptor and thyroperoxidase. The consensus binding sequence for TTF-1 in the promoter region of the thyroglobulin gene differs from that in the lung and is reported to be TGGCCACACGAGTGCCCTCA (SEQ ID NO: 3). In the thyroglobulin promoter, TTF-1 binds cooperatively with the transcription factors TTF-2, PAX8 and Runx2, and the broader sequence that includes 50 bp of flanking sequence at the 5' and 3' ends is CCCACCCCGTTCTGTTCCCCCACAGTTTAGACAAGATCCTCATGCTCCACTGGCCACACGAGTGCCCTCAGGAGGAGTAGACACAGGTGGAGGGAGGCTCCTTTTGACCAGCAGAGAAAAC (SEQ ID NO: 4). Similarly, TTF-1 also binds the promoter regions of the thyroid stimulating hormone receptor and thyroperoxidase genes, in each case cooperating with different partner transcription factors. Thus the DNA sequence surrounding the TTF-1 binding site in the promoters of regulated genes differs between thyroid and lung tissue, and so do the cofactors associated with TTF-1. For the same reason, the surrounding DNA sequence also differs between different genes within the same tissue, as shown in Figure 1 (Matys et al, 2006 and Maenhaut et al, 2015). This demonstrates that detecting TFBS sequences together with flanking DNA sequences in a subject's cfDNA by the method of the invention is sufficient to identify the origin of the chromatin fragments as lung or thyroid.
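The idea of using flanking sequence to assign a tissue of origin can be illustrated with a toy lookup. This is a deliberately simplified sketch, not the disclosed procedure: it uses exact substring matching against only the two promoter contexts quoted above (SEQ ID NO: 2 and SEQ ID NO: 4), whereas a practical analysis would presumably align reads to a reference genome. The function names, the tissue labels and the 20 bp minimum read length are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Promoter contexts quoted in the description; the dictionary keys are hypothetical labels.
REFERENCE_CONTEXTS = {
    "lung": "GATCAAGCACCTGGAGGGCTCTTCAGAGCAAAGACAAACACTGAGGTCGCTGCCA",  # SEQ ID NO: 2
    "thyroid": (
        "CCCACCCCGTTCTGTTCCCCCACAGTTTAGACAAGATCCTCATGCTCCAC"
        "TGGCCACACGAGTGCCCTCA"
        "GGAGGAGTAGACACAGGTGGAGGGAGGCTCCTTTTGACCAGCAGAGAAAAC"
    ),  # SEQ ID NO: 4
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def assign_origin(read: str, min_length: int = 20):
    """List the tissue labels whose reference context contains the read on either strand.

    Reads shorter than min_length are not assigned, because very short
    sequences are unlikely to be specific to one promoter context.
    """
    read = read.upper()
    if len(read) < min_length:
        return []
    rc = reverse_complement(read)
    return [
        tissue
        for tissue, reference in REFERENCE_CONTEXTS.items()
        if read in reference or rc in reference
    ]
```

For instance, a 50 bp read taken from the middle of the thyroglobulin context is assigned to the hypothetical "thyroid" label, while a read that matches neither context returns an empty list.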

There are approximately 1000-3000 human transcription factors, each of which binds specific locations in the genome, producing dynamic transcriptional changes that drive a multitude of cellular processes. We have used TTF-1 as an example to illustrate the principle of the invention, but in principle any transcription factor can be used in the methods of the invention. Even transcription factors that are ubiquitously expressed in many cell types and bind discrete DNA sequences (for example, the Hox protein transcription factors) bind cooperatively with cofactors, so that they uniquely bind different sequences to regulate different genes in different tissues (Merabet and Mann, 2016, Mann et al, 2009). This means that all or most transcription factors can be used in the methods of the invention. For example, the transcription factor estrogen receptor-α (ERα) binds more than a thousand binding sites, or estrogen response elements (EREs), in the human genome, in combination with at least 60 other transcription factors at different genomic locations (Lin et al, 2007). Similarly, the androgen receptor (AR) binds androgen response elements (AREs) associated with thousands of genes, together with other cooperating transcription factors at thousands of different sequence sites. Therefore, even though these transcription factors are expressed in multiple tissues, the method of the invention can identify the tissue of origin of chromatin fragments containing ERα or AR from the sequence of the associated DNA. The same holds for many other transcription factors, including CTCF.

In addition, the DNA loci bound by transcription factors in cancer cells often differ from those bound in healthy cells. Identifying cfDNA fragments in the circulation that contain TFBS sequences (optionally including flanking sequences) by the method of the invention can therefore identify subjects who have cancer and identify the type of cancer, for example prostate cancer or lung cancer (Pomerantz et al, 2015). This is because chromatin is remodeled during tumorigenesis, and this remodeling involves the upregulation of tumor-associated proteins through altered patterns of transcription factor binding in cancer cells. For this reason, the expression of many transcription factors is upregulated in cancer cells. This is a widespread phenomenon that can be illustrated by some non-limiting examples. The well-known cancer-associated transcription factors c-Myc and p53 are upregulated in most cancers. The binding site sequences bound by AR are substantially altered in prostate cancer (Pomerantz et al, 2015). Similarly, the epithelial-mesenchymal transition (EMT), which is associated with metastasis and treatment resistance in cancer cells, involves upregulation of the Jun/Fos family of transcription factors, including Fosl1, Fosb, Fos and Junb. The ETS (E26 transformation-specific) family and the Runx1, Tead and Nfkb transcription factors have also been found to be highly enriched in the open chromatin of tumor cells. In addition, p63, Klf, Grhl and Cepba are reported to be upregulated in tumor cells, with their binding sites enriched in open chromatin regions. The Klf5 and p63 transcription factors are associated with cancer and act as drivers in lung cancer and head and neck cancer. Other transcription factors associated with EMT include bHLH, Runx, Nfat, Tbx1, Tcf7l1 and Smad2 (Latil et al, 2017).

The regulation of eukaryotic gene transcription involves multiple regulatory proteins bound, within a transcription complex, to multiple regulatory DNA sequences located both near the transcription start site (TSS) and distal to the TSS in the genome, for example as shown in Figure 2. Distal regulatory sequences may lie a few hundred to a million bases from the TSS, or possibly further. Transcription complexes usually involve DNA loops, which may involve DNA bending proteins, in which the more distal regulatory sequences and the regulatory proteins bound to them come into contact with proteins bound to regulatory sequences closer to the TSS, for example as shown in Figure 2. The TATA box is so named because it contains a repeating thymine/adenine nucleotide sequence that binds the general transcription factors required for transcription. Expression of a particular gene also requires other, gene-specific transcription factors (for example, those required for expression of the surfactant protein B, thyroglobulin, thyroperoxidase and TSH receptor genes, as shown in Figure 1). A variety of other proteins are also required, including, without limitation, cofactors, mediators, activators, coactivators, repressors, co-repressors, chromatin remodeling proteins, DNA bending proteins, insulators and the like. Such a complex may also include lengths of nucleosome-protected DNA. Transcription complexes can be stable, which facilitates high levels of transcription. Circulating chromatin fragments of healthy and/or disease origin may therefore include large protein/DNA complexes containing multiple proteins, and these may be resistant to nuclease activity. As shown in Figure 2, some large transcription complexes involving proximal and distal regulatory sequences are called super-enhancers. Super-enhancers are large clusters with a high level of transcription factor binding and are central to driving the expression of genes that control cell identity. Super-enhancers are also central to stimulating oncogene transcription in cancer. Cancer cells acquire super-enhancers, and the cancerous phenotype depends on the aberrant transcription they drive. Detecting, by the methods described herein, chromatin fragments that comprise all or part of a super-enhancer complex, and/or combinations of cfDNA fragment sequences that correspond to the proximal and distal regulatory sequences of a super-enhancer, therefore provides a means of identifying the cellular origin of the chromatin fragments, including a cancer cell origin.

The DNA loops in such chromatin fragments can in principle be intact, or they can be digested at one or more positions to yield either (i) two circulating chromatin fragments corresponding to the proximal and distal regulatory sequences, or (ii) one large chromatin fragment containing two DNA fragments. cfDNA may therefore include small DNA fragments corresponding to both the proximal and the distal regulatory sequences of a gene.

In one embodiment, the disease is selected from cancer, an autoimmune disease or an inflammatory disease. In another embodiment, the disease is cancer. In another embodiment, the autoimmune disease is selected from systemic lupus erythematosus (SLE) and rheumatoid arthritis. In another embodiment, the inflammatory disease is selected from Crohn's disease, colitis, endometriosis and chronic obstructive pulmonary disease (COPD).

In a preferred embodiment, the disease is cancer. In another embodiment, the cancer is selected from breast cancer, bladder cancer, colorectal cancer, skin cancer, melanoma, ovarian cancer, prostate cancer, lung cancer, pancreatic cancer, bowel cancer, liver cancer, endometrial cancer, lymphoma, oral cancer, pharyngeal cancer, head and neck cancer, leukemia and osteosarcoma.

In another embodiment, the tissue affected by the disease is the organ of origin, for example the organ of origin of a cancer.

In another aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) amplifying the isolated DNA, for example by a PCR method;
(iv) determining the sequence of the amplified DNA; and
(v) using the transcription factor binding site DNA sequences, and optionally the flanking DNA sequences, present in the amplified DNA as biomarkers to determine the presence and/or nature of the disease in the subject.

It will also be clear to those skilled in the art that multiple TFBS sequences can be obtained, corresponding to the various loci bound by one or more transcription factors and each with the flanking sequences of its particular gene promoter or other locus, and that the data on the various sequences can be integrated to determine the nature of the disease and/or the tissue affected by the disease.

DNA can be detected and analyzed using methods known in the art. Thus, in one embodiment, the DNA is analyzed by PCR. For example, the DNA can be detected using a PCR method that employs adapters, degenerate primers or sequence-specific primers. Alternatively, the DNA can be detected using a hybridization method, for example by capturing target sequences through hybridization to complementary sequences.

In another aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) amplifying the isolated DNA, for example by a PCR method;
(iv) detecting the amplified DNA; and
(v) using the presence or amount of amplified DNA as an indicator of the presence and/or nature of the disease in the subject.

The DNA sequences isolated in step (ii) can be amplified by any method known in the art. In a preferred embodiment, the isolated DNA is amplified using a PCR method that employs adapters ligated to the DNA fragments. In other embodiments, PCR primers are used for DNA amplification. Degenerate primers can be designed to amplify all of the DNA sequences isolated in step (ii), or specific primers can be designed, using software known in the art, to amplify particular DNA sequences related to the response element sequence of a transcription factor, optionally including the flanking regions.

Thus, in another aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) amplifying the isolated DNA by a PCR method using sequence-specific primers;
(iv) detecting the amplified DNA; and
(v) using the presence or amount of amplified DNA as an indicator of the presence and/or nature of the disease in the subject.

The presence or amount of DNA can be detected by hybridization methods. Thus, in one embodiment of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) detecting the DNA by a hybridization method; and
(iv) using the presence or amount of hybridized DNA as an indicator of the presence and/or nature of the disease in the subject.

In a preferred embodiment, the isolated DNA is amplified before hybridization. In a preferred embodiment, the hybridization is a multiplex method in which multiple DNA sequences are immobilized on a solid phase so as to bind multiple TFBS sequences, optionally including flanking sequences, simultaneously. This allows multiple TFBS sequences and multiple disease conditions to be tested in a single multiplex format. In a preferred embodiment, the multiplex hybridization method is a DNA microarray or DNA chip method. Any multiplex method suitable for studying multiple gene sequences can be used in the methods of the invention. Many such methods are known in the art, including the Luminex bead method (Dunbar, 2006).

A further method for detecting the presence of cfDNA fragments containing a TFBS sequence in a cfDNA sample involves contacting the cfDNA sample with the transcription factor protein itself. The transcription factor will then bind any DNA sequence that contains one or more of its TFBS sequences. The DNA bound to the transcription factor can be detected by any method known in the art, including, without limitation, the use of a DNA binding agent (for example, an anti-DNA antibody or a DNA chelator) or PCR or hybridization methods. In one embodiment, a general DNA binding agent is used, for example an anti-DNA antibody, a DNA chelator, an intercalator such as ethidium bromide, or a cyanine dye such as SYBR Green or SYBR Gold.

For example, the presence of a prostate-specific NKX3.1 TFBS sequence in a DNA fragment library prepared from a subject's sample after nucleosome removal indicates that the subject is positive for prostate cancer. The DNA fragment library can therefore be contacted with the transcription factor NKX3.1 immobilized on a solid phase, so as to bind DNA fragments containing NKX3.1 TFBS sequences. Binding of DNA from the library to NKX3.1 is an indicator of prostate cancer.

Thus, in another aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) optionally, amplifying the DNA isolated in step (ii);
(iv) contacting the DNA obtained in step (ii) or (iii) with a transcription factor protein; and
(v) using the presence, amount or sequence of the DNA bound to the transcription factor as an indicator of the presence and/or nature of the disease in the subject.

In one embodiment, the DNA isolated in step (ii), which is not bound to the nucleosome binding agent, is contacted with multiple (i.e. more than one) transcription factor proteins, so that multiple sets of TFBS are captured and can be analyzed in a multiplex test. In this way a single patient sample can be tested for multiple transcription factors and multiple diseases. For example, testing for DNA fragments that bind multiple transcription factors, each specific for one or more cancers, optionally in addition to transcription factors expressed in many cancers, makes it possible to detect many different cancers in a single blood test as well as to identify the tissue affected. Methods for multiplex testing are well known in the art; for example, and without limitation, the multiplex bead system of Luminex Corporation can be used to perform a large number of separate assays on a single sample (Dunbar, 2006).

Thus, in another aspect of the invention, there is provided a method of detecting a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the human or animal subject with a binding agent that binds nucleosomes;
(ii) isolating the DNA that is not bound to the binding agent in step (i);
(iii) optionally, amplifying the DNA isolated in step (ii);
(iv) contacting the DNA obtained in step (ii) or (iii) with multiple transcription factors; and
(v) using the presence, amount or sequence of the DNA bound to the different transcription factors as an indicator of the presence, nature or location of the disease and/or the tissue affected in the subject.
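Step (v) of a multiplex test amounts to combining the readouts for several transcription factors. The Python sketch below shows one possible way to do this and is not taken from the disclosure: the function name, the panel mapping, the signal values and the threshold are all hypothetical, and no scoring rule is specified in the text.

```python
def call_conditions(bound_signal, panel, threshold):
    """Group transcription factors with above-threshold bound DNA by panel label.

    bound_signal: measured amount of DNA bound to each transcription factor
    panel:        hypothetical mapping from transcription factor to tissue or disease labels
    threshold:    hypothetical cut-off for calling a transcription factor positive
    """
    calls = {}
    for factor, signal in bound_signal.items():
        if signal < threshold or factor not in panel:
            continue
        for label in panel[factor]:
            calls.setdefault(label, []).append(factor)
    return calls


# Illustrative inputs only: these numbers are invented and are not measurements.
example_panel = {"NKX3.1": ["prostate"], "AR": ["prostate"], "TTF-1": ["lung", "thyroid"]}
example_signal = {"NKX3.1": 12.0, "AR": 7.5, "TTF-1": 0.4}
example_calls = call_conditions(example_signal, example_panel, threshold=5.0)
```

With these invented inputs, `example_calls` groups NKX3.1 and AR under the "prostate" label, and TTF-1 contributes nothing because its signal is below the cut-off.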

In one embodiment, the methods described herein are used to identify the tissue of origin of a tumor of unknown origin. This can be done in a body fluid test as described above, or on a library of chromatin fragments produced by fragmenting chromatin material from tumor tissue obtained at biopsy or surgery. Methods for fragmenting chromatin are well known in the art and include, without limitation, digestion with a nuclease and sonication. In the particular case of testing tissue, it may not be necessary to remove nucleosomes before exposure to the transcription factor (provided that the sample is not contaminated with chromatin from healthy cells).

Thus, in another aspect of the invention, there is provided a method of detecting a disease in a tissue sample obtained from a human or animal subject, comprising the steps of:
(i) isolating chromatin from a tissue biopsy sample;
(ii) fragmenting the chromatin isolated in step (i);
(iii) extracting DNA from the chromatin fragments obtained in step (ii);
(iv) contacting the DNA extracted in step (iii) with one or more transcription factors; and
(v) using the presence, amount or sequence of the DNA bound to the transcription factors as an indicator of the presence and/or tissue of origin of the disease in the subject.

The cfDNA fragmentation pattern of a body fluid sample obtained from a subject can be analyzed to detect a disease and identify the affected cells or tissue. Removing nucleosomes facilitates analysis of the cfDNA fragmentation pattern around active transcription factor binding sites by eliminating interference from the nucleosomal fragmentation pattern. Thus, according to another aspect of the invention, there is provided a method of detecting the presence and/or tissue of origin of a disease in a human or animal subject, comprising the steps of:
(i) contacting a body fluid sample obtained from the subject with a binding agent that binds nucleosomes;
(ii) analyzing the DNA that is not bound to the binding agent in step (i) to determine the cfDNA fragmentation pattern; and
(iii) using all or part of the cfDNA fragmentation pattern as an indicator of the presence and/or tissue of origin of the disease in the subject.
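One common way to summarize a fragmentation pattern around a binding site is a per-base coverage profile. The following Python sketch is an assumption about how such a profile could be computed and is not prescribed by the text: the function name is hypothetical, and fragments are represented as half-open (start, end) coordinate pairs on a single reference sequence.

```python
def coverage_profile(fragments, center, window):
    """Count how many fragments cover each position in [center - window, center + window].

    fragments: iterable of (start, end) pairs, half-open, on one reference sequence
    center:    coordinate of the transcription factor binding site
    window:    number of positions to report on each side of the center
    """
    first = center - window
    last_exclusive = center + window + 1
    profile = [0] * (2 * window + 1)
    for frag_start, frag_end in fragments:
        # Clip each fragment to the reporting window before counting.
        lo = max(frag_start, first)
        hi = min(frag_end, last_exclusive)
        for position in range(lo, hi):
            profile[position - first] += 1
    return profile


# Two fragments overlap the site at position 100; the third lies outside the window.
print(coverage_profile([(90, 110), (95, 105), (200, 250)], center=100, window=5))
# prints [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1]
```

Profiles of this kind can be compared between samples or between binding sites; how they would be interpreted for a given disease is outside the scope of this sketch.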

In a preferred embodiment, the disease is cancer. In another embodiment, the nature of the disease is the tissue affected by the cancer.

It is well known in the art that cfDNA of fetal origin (for example, cfDNA containing Y-chromosome DNA sequences from a male (XY) fetus) circulates in the blood of pregnant animals and of human (XX) mothers. Similarly, this cfDNA is reported to include cfDNA fragments of the length expected for nucleosome-protected DNA (about 160 bp) as well as shorter cfDNA fragments in the range of 50 bp and above (Hu et al, 2019). The methods of the invention are therefore applicable not only to the disease state of the subject from whom the sample is taken, but also to maternal/fetal studies, including prenatal testing of fetal condition in maternal blood samples.

Thus, in one embodiment of the invention, there is provided a method of detecting a fetal disease state in a body fluid sample obtained from a pregnant human or animal subject, comprising the steps of:
(i) contacting the maternal body fluid sample with a binding agent that binds nucleosomes;
(ii) analyzing the DNA that is not bound to the binding agent in step (i); and
(iii) using the presence, amount, sequence or fragmentation pattern of the DNA as an indicator of the disease state of the subject's fetus.

Nucleosome binding agent

Any moiety that binds to nucleosomes can be used in the methods of the invention. In a preferred embodiment of the present invention, all or most nucleosomes are removed prior to analysis of the cfDNA, and the nucleosome binding agent is an antibody that specifically binds to nucleosomes. The antibody may bind to any nucleosome epitope or to any component of a nucleosome. In preferred embodiments, the selected antibody binds to a component present in all or most circulating cell-free nucleosomes, so that all or most nucleosomes are removed from the body fluid sample prior to cfDNA analysis by the methods described herein.

In preferred embodiments, the nucleosome binding agent binds to a nucleosome core epitope. The core histones H2A, H2B, H3 and H4 each have a core domain and a histone tail approximately 20-30 amino acids in length. The histone tails of circulating cell-free nucleosomes may be removed in whole or in part to produce "clipped" histones. This is generally thought to be caused by the action of the endopeptidase cathepsin L, which is involved in the initiation of protein degradation. For example, cathepsin L cleaves the histone H3 tail at amino acid position 21. Consequently, an antibody that binds to histone H3 at an epitope between amino acids 1-21 may fail to remove nucleosomes containing histone H3 whose tail has been clipped. In our own experiments, we observed that antibodies binding to a histone H3 epitope in the histone tail (at amino acid positions 4-8) bound fewer nucleosomes than antibodies binding to epitopes located beyond amino acid position 21. Similar limitations apply to the other core histones (i.e. H2A, H2B and H4). In our own method development, we used antibodies that bind to core histone H2A and H2B epitopes and to a histone H3 epitope located at amino acids 30-33.
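The reasoning about clipped histone tails can be illustrated with a minimal sketch. It assumes, purely for illustration, that clipping removes every residue up to and including the cleavage position, and that an epitope is lost if any of its residues is removed. The function name and the antibody labels are hypothetical and are not taken from the specification.

```python
def epitope_survives_clipping(epitope_start: int, epitope_end: int,
                              clip_position: int = 21) -> bool:
    """Return True if the whole epitope lies beyond the clipped tail.

    Assumption (illustrative only): residues 1..clip_position are removed,
    so an epitope survives only if its first residue is past clip_position.
    """
    if epitope_start > epitope_end:
        raise ValueError("epitope_start must not exceed epitope_end")
    return epitope_start > clip_position


# Epitope ranges mentioned in the text; the labels are hypothetical.
antibodies = {
    "anti-H3 tail (aa 4-8)": (4, 8),
    "anti-H3 (aa 30-33)": (30, 33),
}

for label, (start, end) in antibodies.items():
    status = "retained" if epitope_survives_clipping(start, end) else "lost"
    print(f"{label}: epitope {status} after clipping at position 21")
```

Under these assumptions the tail epitope at positions 4-8 is lost on clipped histone H3 while the epitope at positions 30-33 is retained, which is consistent with the preference for core epitopes stated above.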

In one embodiment, the method further comprises using the presence, amount or sequence of the DNA as an indicator of the disease state of the subject. Accordingly, in a preferred embodiment of the present invention, there is provided a method of detecting a disease state in a body fluid sample obtained from a human or animal subject, comprising the steps of: (i) contacting the body fluid sample with a binding agent that binds to a nucleosome core epitope; (ii) analyzing the DNA not bound to the binding agent in step (i); and (iii) using the presence, amount, sequence or fragmentation pattern of the DNA as an indicator of the disease state of the subject.

In one embodiment of the present invention, nucleosomes containing linker DNA are removed prior to cfDNA analysis, and the binding agent that binds to nucleosomes containing linker DNA is a chromatin protein or chromatin-binding protein comprising, in whole or in part, a histone H1 moiety, including but not limited to: a chromodomain helicase DNA-binding protein (CHD), a DNA (cytosine-5)-methyltransferase (DNMT), a high mobility group or high mobility group box protein (HMG or HMGB), a poly[ADP-ribose] polymerase (PARP), or a protein containing a methyl-CpG binding domain (MBD), such as MECP2. The binding agent may also be an antibody or other binding agent that binds to histone H1.

The binding agent used in the methods of the invention may be coated on a solid support, such as agarose gel, dextran gel, plastic or magnetic beads. In one embodiment, the solid support comprises a porous material. In another embodiment, the binding agent is derivatized to include a tag or linker that can be used to attach the binding agent to a suitable support that has itself been derivatized to bind the tag. Many such tags and supports are known in the art (e.g., Sortag, Click Chemistry, biotin/streptavidin, his-tag/nickel or cobalt, GST-tag/GSH, antibody/epitope tags and so on). Separation of the binding agent can then be performed before, at the same time as, or after the reaction of the binding agent with the nucleosomes. For ease of use, the coated support may be contained in a device, such as a microfluidic device.

In other embodiments, the binding agent is added in solution and the bound nucleosomes are separated by cross-linking and precipitation with a precipitating agent such as polyethylene glycol (PEG). The precipitate can then be separated as a distinct phase, for example by centrifugation or filtration. Many immunoprecipitation methods are known in the art, and any such method may be used in the methods of the invention.

DNA sequencing

Many methods for analyzing or identifying DNA sequences are known in the art, and any DNA analysis method may be used in the methods of the present invention, including but not limited to: next-generation sequencing methods, isothermal DNA amplification, COLD-PCR (co-amplification at lower denaturation temperature PCR), MAP (MIDI-activated pyrophosphorolysis), PARE (personalized analysis of rearranged ends) and DNA hybridization methods (including gene chip methods and in situ hybridization methods). In addition, DNA sequences can be analyzed for epigenetic alterations by epigenetic DNA sequencing analysis (for example, for sequences containing 5-methylcytosine, using bisulfite to convert unmodified cytosine to uracil). Thus, in one embodiment, the cfDNA is analyzed using a DNA analysis method selected, for example, from: next-generation sequencing (targeted or whole genome), methylated DNA sequencing analysis, BEAMing, PCR including digital PCR and COLD-PCR, isothermal amplification, hybridization, MIDI-activated pyrophosphorolysis (MAP) or personalized analysis of rearranged ends (PARE).
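The bisulfite principle mentioned above (unmodified cytosine is converted to uracil, whereas 5-methylcytosine is left unchanged) can be sketched in silico as follows. This is an illustrative model only, not a description of any laboratory protocol or kit. The function name and the representation of methylation as a set of 0-based positions are assumptions made for the example.

```python
from typing import Set


def bisulfite_convert(sequence: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite conversion of a single DNA strand.

    Unmethylated cytosines become uracil ("U"); cytosines whose 0-based index
    is listed in methylated_positions (5-methylcytosine) are left unchanged.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical 8-base example: only the cytosine at index 1 is methylated.
example = bisulfite_convert("ACGTCCGA", {1})
print(example)  # ACGTUUGA
```

Comparing the converted read with the reference sequence then identifies the protected (methylated) cytosines as the positions that still read as C.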

DNA library preparation

The cfDNA present in the sample after removal of nucleosomes can be amplified using PCR methods to facilitate detection and sequencing. Methods for preparing cfDNA fragment libraries are well known in the art and generally involve ligation of adapter oligonucleotides to the cfDNA fragments. The library of adapter-ligated DNA fragments is then typically amplified by PCR. Sets of degenerate PCR primer oligonucleotides can also be used to amplify cfDNA.

In principle, any library preparation method may be adapted for use in the methods of the invention. Library preparation methods may involve amplification of single-stranded or double-stranded adapter-ligated cfDNA fragments. Preferred library preparation methods involve single-stranded cfDNA adapter ligation. Preferred library preparation methods are highly efficient at amplifying and isolating small DNA fragments of less than 100 bp in length. Many such library preparation methods are known in the art, including, for example: (i) use of the TruSeq DNA Sample preparation Kit (Illumina) according to the manufacturer's protocol, with 20-25 PCR cycles for 5-10 ng of input DNA (Ulz et al., 2019); (ii) use of the MagMAX cfDNA Isolation Kit (Applied Biosystems), followed by library preparation using the NEBNext Ultra II DNA Library Prep Kit (New England Biolabs) (Ulz et al., 2019); and (iii) use of the blood and body fluid protocol of the Qiagen QIAamp DSP DNA Blood Mini Kit, with PCR amplification using the Life technologies Ion Plus Fragment Library Kit (Hu et al., 2019). Other methods include those of Sanchez et al., 2018, Skene and Henikoff, 2017, Snyder et al., 2016 and Liu et al., 2019. In preferred embodiments, adapter oligonucleotides are ligated to the DNA fragments and used to amplify all of the adapter-ligated DNA fragments in the library. These methods are well known in the art.

The PCR primers used for DNA amplification may also be of random sequence, in order to amplify all sequences present in the library, or may be designed using software known in the art to amplify specific DNA sequences associated with the sequence of a transcription factor response element, optionally also including flanking regions.

Alternatively, specific primer oligonucleotides designed by methods known in the art may be used to amplify specific cfDNA sequences, for example sequences associated with a transcription factor response element, optionally also including flanking regions. In this embodiment, cfDNA fragments comprising a TFBS sequence, optionally including flanking sequences, can be detected without sequencing as such (e.g. without next-generation sequencing).

Sample preparation

The sample may be any body fluid in which chromatin fragments can be detected. Chromatin fragments are known to be present in blood, feces, urine and cerebrospinal fluid. We have also detected chromatin fragments in sputum. In preferred embodiments, the body fluid sample is a blood, serum or plasma sample. In highly preferred embodiments, the sample is a plasma sample, including a plasma sample collected in an EDTA blood collection tube or in a tube recommended for cfDNA analysis. Such tubes include, but are not limited to, cell-free DNA blood collection tubes produced by Roche, PAXgene, Norgene, LBgard and others. These samples can be used to measure and analyze circulating cfDNA fragments. For example, plasma samples (e.g. EDTA plasma samples) may be used in the methods of the invention. Plasma may be used fresh or frozen until analysis. In our own method development, we used plasma samples collected in standard EDTA blood collection tubes and centrifuged within 2 hours. Our experimental results indicate that cell-free DNA blood collection tubes are also suitable.

Transcription factors and their DNA binding sites

The regulation of gene transcription in eukaryotes can be highly complex, involving bending and looping of DNA to bring multiple regulatory DNA sequences, bound by multiple regulatory proteins, together into one regulatory transcription complex, as shown in Figure 2. Accordingly, the term "transcription factor" as used herein refers to a regulatory protein that binds directly or indirectly to gene regulatory sequences in the genome to regulate gene transcription, including but not limited to: general transcription factors, specific transcription factors associated with the regulation of particular genes, and enhancers, co-enhancers, repressors, co-repressors, mediators, DNA bending proteins, chromatin remodeling proteins, DNA damage repair proteins, RNA polymerase proteins or other transcriptional regulatory proteins. Similarly, the term "transcription factor binding site" (TFBS) as used herein refers to the DNA binding site of a regulatory protein associated with the transcriptional regulation of a gene, including but not limited to distal or proximal enhancer and repressor sequences, as shown in Figure 2.

TFBS sequences are typically less than 10 bp in length, so a cfDNA fragment of 35-80 bp will also cover sequence flanking the TFBS. The term "flanking sequence" as used herein refers to DNA sequence that is present in the genome in the vicinity of the TFBS, for example DNA sequence within 20, 50, 100 or 200 bp upstream or downstream of the TFBS. It will be clear to those skilled in the art that the flanking sequences of a particular TFBS in the genome, for example one located within a gene promoter sequence, may include binding sites for other regulatory proteins.
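The definition of a flanking sequence can be made concrete with a short coordinate sketch. It assumes 0-based, half-open genomic intervals and uses hypothetical coordinates; the type and function names are illustrative and do not come from the specification.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def flanking_window(tfbs: Interval, flank_bp: int) -> Interval:
    """Extend a TFBS interval by flank_bp upstream and downstream."""
    if flank_bp < 0:
        raise ValueError("flank_bp must be non-negative")
    # Clamp at 0 so the window never runs off the start of the chromosome.
    return Interval(tfbs.chrom, max(0, tfbs.start - flank_bp), tfbs.end + flank_bp)


def fragment_covers_tfbs(fragment: Interval, tfbs: Interval) -> bool:
    """Return True if the fragment spans the entire TFBS."""
    return (fragment.chrom == tfbs.chrom
            and fragment.start <= tfbs.start
            and fragment.end >= tfbs.end)


# Hypothetical 8 bp TFBS and a 60 bp cfDNA fragment spanning it.
tfbs = Interval("chr1", 1000, 1008)
fragment = Interval("chr1", 980, 1040)

for flank in (20, 50, 100, 200):  # flank sizes named in the text
    print(flank, flanking_window(tfbs, flank))
print(fragment_covers_tfbs(fragment, tfbs))
```

In this example the 60 bp fragment contains the 8 bp site plus 20 bp of upstream and 32 bp of downstream flanking sequence, which illustrates why short fragments still carry flanking information.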

Suitable TFBS sequences can be determined experimentally, for example by using classical nuclease-accessible site mapping methods to identify the DNA sequences associated with a transcription factor of interest in a tissue of interest. In a typical experiment, chromatin is extracted from the cells of interest (for example cancer cells, healthy cells of the same tissue, and hematopoietic cells) and digested with a suitable nuclease. The chromatin fragments produced by digestion are exposed to an antibody that specifically binds the transcription factor of interest, and the DNA fragments bound to the antibody are isolated and sequenced to determine the TFBS sequences (optionally including flanking sequences) bound by the transcription factor. Classical nuclease accessibility methods have recently been improved, and the technology now includes methods such as CUT&RUN and others, which are easier to perform and give improved results (Skene and Henikoff, 2017). Any such method is suitable for identifying DNA sequences for use in the present invention.

Various genome, transcription factor and cancer databases can also be used to select suitable transcription factors, TFBS sequences and flanking sequences for use in the methods of the invention. Such databases include: the ENSEMBL database, which provides annotated genome sequences for many species including humans; the Encyclopedia of DNA Elements (ENCODE) database (https://www.encodeproject.org); the transcription factor (TRANSFAC) database (Matys et al., 2006); the Gene Transcription Regulation Database (GTRD) version 18.01 (http://gtrd.biouml.org); the Human Transcription Factors database version 1.01 (http://humantfs.ccbr.utoronto.ca); the NIH Genomic Data Commons (https://gdc.cancer.gov); The Cancer Genome Atlas (TCGA) (https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga); the UCSC Xena Browser (https://atcseq.xenahubs.net); and the Human Protein Atlas database (https://www.proteinatlas.org), which provides information on the transcription factors expressed in healthy tissues and their expression in cancer diseases; as well as other databases.

The use of these databases to characterize transcription factors and their associated TFBS sequences and flanking sequences for use in the methods of the invention can be illustrated by reference to some of them as examples. The TRANSFAC database provides data on thousands of human and other eukaryotic transcription factors. The detailed information provided for each transcription factor includes the number of TFBSs it binds in the genome, a list of the genes whose transcription it regulates, the sequence and genomic location of the TFBS associated with each regulated gene, details of other transcription factors that regulate transcription cooperatively with it, the consensus TFBS DNA sequence, DBD details and cancer associations. For the purpose of illustration, the use of the data in the context of the present invention is exemplified below for the transcription factors CDX2 and c-JUN. The TRANSFAC database lists 48 human CDX2 TFBSs that regulate 26 specific genes. The CDX2 TFBS sequences are provided together with their genomic locations and the gene regulated by each sequence. The flanking sequences of each CDX2 TFBS can be determined by looking up the sequence at each genomic location in the ENSEMBL human genome database. A consensus CDX2 TFBS sequence is also provided. Similarly, the TRANSFAC database lists 265 human c-JUN TFBSs that regulate 166 specific genes. The c-JUN TFBS sequences are provided together with their genomic locations and the gene regulated by each sequence. The flanking sequences of each c-JUN TFBS can be determined by looking up the sequence at each genomic location in the ENSEMBL human genome database. A consensus c-JUN TFBS sequence is also provided.

CTCF (also known as CCCTC-binding factor) is an evolutionarily conserved zinc finger transcription factor that binds to a large number of sites in the genome through combinations of its 11 zinc fingers and plays a key role in genome function. A study of CTCF binding sites in the human genome identified 77,811 distinct binding sites across 19 different cell types (Wang et al., 2012). Of these 77,811 binding sites, 27,662 were found to be occupied in all 19 cell types studied. CTCF binding at the remaining 50,149 binding sites showed tissue specificity. The 19 cell types studied comprised 12 normal cell types and 7 cancer or EBV-immortalized cell lines, representing colorectal cancer (Caco-2), cervical cancer (HeLa-S3), hepatocellular carcinoma (HepG2), neuroblastoma (SK-N-SH_RA), retinoblastoma (WERI-RB-1) and EBV-transformed lymphoblastoid cells (GM06990). CTCF binding at 1,236 binding sites was found to be specific to cancer cell lines. Occupancy of 1,041 of these binding sites occurred in immortalized cancer cell lines but not in normal cells (including epithelial cells, fibroblasts and endothelial cells) (Liu et al., 2017).
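The CTCF figures quoted above can be cross-checked with a few lines of arithmetic. The counts are those stated in the text; the variable names and the expression of the counts as fractions of all identified sites are additions made for illustration.

```python
# Counts quoted in the text (Wang et al., 2012; Liu et al., 2017).
total_sites = 77_811            # distinct CTCF binding sites across 19 cell types
shared_sites = 27_662           # occupied in all 19 cell types
cancer_specific_sites = 1_236   # binding specific to cancer cell lines
immortalized_only_sites = 1_041 # occupied in immortalized cancer lines, not normal cells

# The remainder is the tissue-specific set reported in the text.
tissue_specific_sites = total_sites - shared_sites
print(tissue_specific_sites)  # 50149

# Illustrative fractions of all identified sites.
shared_fraction = shared_sites / total_sites
tissue_specific_fraction = tissue_specific_sites / total_sites
cancer_specific_fraction = cancer_specific_sites / total_sites
print(f"shared: {shared_fraction:.1%}")
print(f"tissue-specific: {tissue_specific_fraction:.1%}")
print(f"cancer cell line specific: {cancer_specific_fraction:.1%}")
```

The subtraction reproduces the 50,149 tissue-specific sites stated above, so roughly two thirds of the identified CTCF sites carry some cell-type information.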

Accordingly, transcription factors and/or TFBSs for use in the methods of the invention can be selected experimentally or from the literature and/or databases (for example the Human Protein Atlas database). A transcription factor may be characterized in terms of (i) its expression in healthy and diseased tissues, (ii) the genes it regulates in those cells or tissues, (iii) the TFBS sequences it binds in those tissues, and (iv) other factors that bind together with it at the TFBS to regulate transcription. By the methods described herein, this characterization can be used to identify the healthy or diseased tissue or cell of origin of chromatin fragments and/or cfDNA fragments in a body fluid sample.

Similarly, these databases can be used to interpret experimental data relating to chromatin fragments and/or cfDNA sequences in body fluid samples, in order to identify TFBS sequences (optionally including flanking sequences) contained in whole or in part within the cfDNA fragments. This information can then be used to identify the tissue or cell of origin of the cfDNA fragments.

In addition, many publications in the literature on transcription factors and cancer list transcription factors that can be used in the methods of the invention. For example, Lambert et al., 2018 list 294 known oncogenic transcription factors and regulators. Gurel et al., 2010 describe the transcription factor NKX3.1 as a marker for prostate cancer. Darnell, 2002 lists many oncogenic transcription factors, including STAT3, 5, STAT-STAT, GR, IRF, TCF/LEF, β-catenin, NF-κB, NOTCH (NICD), GLI, c-JUN, bZip proteins (including the c-JUN, JUNB, JUND, c-FOS, FRA, ATF and CREB-CREM families), the cEBP family, ETS proteins and the MAD-box family. Vaquerizas et al., 2009 describe many tissue-specific transcription factors that can be used in the methods of the invention. Ulz et al., 2019 describe transcription factors such as the epithelial transcription factor GRHL2 (which is present in many cancer types but absent from blood tissue) as well as AR (androgen receptor), NKX3-1 and HOXB13. Corces et al., 2018 describe many cancer-specific and tissue-specific transcription factors, including NR5A1, TP63, GRHL1, FOXA1, GATA3, NFIC, CDX2, RFX2, ASCL1, PAX2, HNF1A, NKX2.A, PHOX2B, DRGX, HOXB13, AR, MITF, HNF4 and POU5F1. Said references are incorporated herein by reference.

Suitable TFBS sequences (optionally including flanking sequences) for use in the present invention can also be determined experimentally. For example, the pattern of small (e.g. 35-80 bp) cfDNA fragments present in samples obtained from patients diagnosed as having or not having a known disease state can be determined experimentally. The data can be used to generate TFBS loci, or patterns of TFBS loci, that are selectively present in samples obtained from diseased patients. This yields a cfDNA TFBS biomarker or biomarker panel signature for the disease.
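As a minimal sketch of how such a pattern might be tabulated, the following example counts small (35-80 bp) fragments that overlap candidate TFBS loci. The fragment coordinates, locus names and function names are hypothetical, intervals are assumed to be 0-based and half-open, and the simple nested loop is chosen for clarity rather than speed.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open


def is_small_fragment(length: int, min_bp: int = 35, max_bp: int = 80) -> bool:
    """Return True if the fragment length lies in the small-fragment range."""
    return min_bp <= length <= max_bp


def count_small_fragments_per_locus(fragments: Iterable[Region],
                                    loci: Dict[str, Region]) -> Counter:
    """Count small fragments overlapping each named TFBS locus."""
    counts = Counter({name: 0 for name in loci})
    for chrom, start, end in fragments:
        if not is_small_fragment(end - start):
            continue  # skip nucleosome-sized and other out-of-range fragments
        for name, (locus_chrom, locus_start, locus_end) in loci.items():
            if chrom == locus_chrom and start < locus_end and end > locus_start:
                counts[name] += 1
    return counts


# Hypothetical loci and fragments for illustration only.
loci = {"locus_A": ("chr1", 1000, 1010), "locus_B": ("chr2", 500, 510)}
fragments = [
    ("chr1", 980, 1040),  # 60 bp, overlaps locus_A
    ("chr1", 900, 1060),  # 160 bp, nucleosome-sized, ignored
    ("chr2", 495, 540),   # 45 bp, overlaps locus_B
    ("chr2", 600, 650),   # 50 bp, overlaps no locus
]
print(count_small_fragments_per_locus(fragments, loci))
```

Comparing such per-locus counts between samples from diseased and non-diseased subjects is one way the selectively present loci described above could be shortlisted. Any statistical test applied to the counts would be a further design choice not specified here.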

It is well known that the expression of transcription factors is altered in disease. The methods of the present invention may therefore involve transcription factors whose expression is upregulated in disease, and/or which are inappropriately expressed in diseased tissue (e.g. cancer tissue) but are not normally highly expressed in the corresponding healthy tissue.

The chromatin fragments present in the circulation of healthy subjects are mainly of hematopoietic origin. The methods of the present invention therefore also relate to the inappropriate presence of circulating chromatin fragments comprising a transcription factor and its associated DNA, where the transcription factor is not expressed, or is expressed at low levels, in healthy hematopoietic tissue but is expressed in diseased tissue or non-hematopoietic tissue. After removal of nucleosome-bound cfDNA by the methods of the invention, the presence in a sample of chromatin fragments comprising a transcription factor and its associated DNA can be inferred from the detection of the cfDNA sequences associated with its TFBS (optionally including flanking DNA sequences).

For example, many cancers originate in epithelial tissue. The epithelial transcription factor GRHL2 is expressed in many epithelial tissues and in many cancers derived from epithelial tissue, but is not expressed in hematopoietic tissue. The presence of GRHL2 in the circulation indicates the presence of a cancer of epithelial origin, such as colorectal, prostate, lung or breast cancer. The methods of the invention can therefore be used to detect the presence of cancer as such, and this can be combined with analysis of other TFBS sequences (optionally with flanking sequences) for lineage-specific transcription factors and/or lineage-specific combinations of transcription factors in the body fluid sample, in order to identify the organ of origin of the cancer. Thus, any transcription factor is useful in the methods of the invention by virtue of the sequence of its binding sites in the genome. Preferred embodiments make use of TFBS sequences (optionally including flanking sequences) that are present in chromatin fragments associated with transcription factors, that occur at elevated levels in the body fluids of diseased subjects (above the levels found in other subjects), that are partly or wholly tissue-specific and/or disease-specific, and that have multiple response elements in the genome.

Thus, in one embodiment, the transcription factor used is disease-specific (i.e. the level of circulating cfDNA fragments comprising its TFBS sequence is elevated in disease). In one embodiment, the transcription factor is tissue-specific. In one embodiment, the transcription factor binds to more than one location in the genome, for example more than 5, more than 10, more than 100 or more than 1000 locations in the genome.
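By way of illustration, the binding-site thresholds named above can be applied to the site counts quoted earlier in this description (48 CDX2 and 265 c-JUN TFBSs listed in TRANSFAC, and 77,811 CTCF sites reported by Wang et al., 2012). The dictionary and function names are hypothetical, and the counts reflect only the sources cited, not the total number of genomic binding sites of each factor.

```python
from typing import Iterable, List

# Site counts quoted in the description; names are illustrative only.
SITE_COUNTS = {"CDX2": 48, "c-JUN": 265, "CTCF": 77_811}

# Thresholds named in the embodiment above.
THRESHOLDS = (5, 10, 100, 1000)


def thresholds_exceeded(site_count: int,
                        thresholds: Iterable[int] = THRESHOLDS) -> List[int]:
    """Return the thresholds that the site count strictly exceeds."""
    return [threshold for threshold in thresholds if site_count > threshold]


for factor, count in SITE_COUNTS.items():
    print(f"{factor}: {count} sites, exceeds {thresholds_exceeded(count)}")
```

On these figures CDX2 exceeds the thresholds of 5 and 10, c-JUN additionally exceeds 100, and CTCF exceeds all four.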

Transcription factors can be classified by binding domain (see, for example, Vaquerizas et al., 2009, which is incorporated herein by reference). In one embodiment, the transcription factor comprises a DNA binding domain selected from: a homeodomain, HLH, bZip, NHR, Forkhead, P53, HMG, ETS, IPT/TIG, POU, MAD, SAND, IRF, TDP, DM, Heat shock, STAT, CP2, RFX, AP2 or zinc finger (e.g. zinc finger C2H2 or zinc finger GATA) binding domain.

The transcription factors currently considered particularly important in cancer fall into three main groups. The first group is the nuclear hormone receptor group, which includes the estrogen receptor, androgen receptor, progesterone receptor, glucocorticoid receptor, thyroid receptor and retinoic acid receptor. The nuclear hormone receptor group of transcription factors are cell surface receptors that may be regarded as inactive or latent transcription factors, which can be activated by ligand binding. For example, the estrogen receptor is activated by binding to estrogen. Ligand binding causes the nuclear hormone receptor to migrate to the nucleus, where it binds to a target DNA sequence (for example, the estrogen receptor binds to an estrogen response element) and upregulates or downregulates the genes associated with the DNA target sequence (for example, estrogen-regulated genes).

The second group of transcription factors known to be important in the initiation and progression of cancer is the signal transducers and activators of transcription (STATs). These are latent cytoplasmic transcription factors that can be activated by a variety of molecular triggers in the cytoplasm and/or at the cell surface. STAT activation typically involves a cascade of biochemical events in the cytoplasm, such as kinase reactions, proteolytic reactions and protein-protein interactions, which lead to the entry of a protein or protein complex into the nucleus, where it regulates the transcription of target genes. The biochemical cascade leading to transcriptional activation is usually triggered by the binding of a ligand to a receptor at the cell surface, including, for example, the binding of a cytokine moiety to a cytokine receptor, the binding of a growth factor (such as epidermal growth factor or platelet-derived growth factor) to a growth factor receptor, or the binding of a peptide or protein to a G protein-coupled receptor.

The third important group of transcription factors in cancer are resident nuclear proteins, whose transcriptional activity is usually activated by a cascade of biochemical events involving serine kinase reactions. There are hundreds of serine kinase moieties, and hundreds of nuclear proteins are targets of serine kinases.

It will be clear to those skilled in the art that cfDNA fragments comprising (i.e., including or containing) a TFBS associated with any transcription factor involved in the initiation, development or maintenance of cancer (for example, the transcription factors in the three groups above) will be useful in the methods of the invention. Transcription factors or transcription factor families that have a known role in cancer, or are known to be elevated in cancer, include for example (without limitation): STATs, in particular STAT3, STAT5 and STAT-STAT dimer moieties, NF-κB, β-catenin, γ-catenin, Notch and the Notch intracellular domain (NICD), GLI, c-JUN, JUNB, JUND, c-FOS, FRA, ATF, CREB-CREM, cEBP, ETS, MYC, N-MYC, MAX, E2F, interferon regulatory factors (IRF), T cell factor (TCF), lymphocyte enhancer factor (LEF), EN2, GATA3, CDX2, PAX8, WT1, NKX3.1, P63 (TP63) or P40, and helix-loop-helix proteins (Darnell, 2002). All of these transcription factors will be useful in the methods of the invention.

Many transcription factors have been found to be lineage specific and associated with particular tissues and/or cancers; for example, a transcription factor that is always or usually expressed in certain tissues or cancers but rarely or never expressed in others. The methods of the invention can be used to detect TFBS sequences (optionally including flanking sequences) that are useful as tissue-specific and/or cancer-specific biomarkers.

Thyroid transcription factor 1 (TTF-1) is selectively expressed in the thyroid, diencephalon and respiratory epithelium during embryonic development. TTF-1 is expressed in tissue samples taken from neuroendocrine and non-neuroendocrine lung cancers, but the frequency of its expression differs markedly between histological subtypes. TFBS sequences found in ctDNA by the methods of the invention can therefore also be used to identify the cancer type.

PAX8 is a transcription factor involved in embryonic development of the thyroid, kidney and Müllerian system. PAX8 is abundantly expressed in tissue samples taken from non-mucinous ovarian, serous, endometrial, clear cell and transitional cell carcinomas. PAX8 is also expressed in endometrioid carcinoma, uterine serous carcinoma, endometrial clear cell carcinoma, and ductal and lobular breast cancer tissue.

CDX2 is a lineage-specific transcription factor that plays a key role in controlling the proliferation and differentiation of intestinal epithelial cells and is expressed in almost all colorectal adenocarcinoma tissue samples.

NKX3.1 is required for normal prostate development and is a known marker expressed in almost all prostate cancers.

GATA3 is transcribed as early as the fourth week of human gestation. GATA3 is highly expressed in tissue samples taken from breast cancers, particularly estrogen receptor-positive breast cancers, as well as from urothelial and transitional cell carcinomas.

WT1 plays an important role in embryonic development. WT1 is a good marker for ovarian cancer tissue and is expressed in only a limited range of healthy adult tissues.

EN2 plays a role in embryonic development and is expressed in a range of cancers, but in few healthy adult tissues. The presence of EN2 in urine has been used as the basis of a urine test for detecting prostate cancer.

Other transcription factor binding sites can be used in the methods of the invention. For example, upstream binding factor (UBF) is a transcription factor that binds ribosomal RNA gene promoters and activates transcription mediated by RNA polymerase I. UBF expression is known to be elevated in the tissue of some cancers. Many other such examples undoubtedly exist and are suitable transcription factors for use in the methods of the invention. In addition, RNA polymerase I and RNA polymerase III are also elevated in cancer. These moieties are responsible for the transcription of tRNA and ribosomal RNA genes, providing the cellular machinery needed for the increased and rapid protein production, growth and cell replication characteristic of cancer cells and tissues. In another embodiment of the invention, there is provided a method for detecting or measuring DNA associated with the binding of UBF, RNA polymerase I or RNA polymerase III in cell-free chromatin fragments in a body fluid sample.

In some embodiments, the presence of a protein transcription factor in body fluid chromatin fragments is not specific to a particular tissue or disease, because the transcription factor may be expressed in many cell and tissue types. The methods of the invention are therefore also able to detect TFBSs associated with ubiquitously expressed transcription factors, i.e., transcription factors expressed in more than 5, more than 10, more than 15, more than 20 or more than 30 tissue types. Detection of TFBS sequences associated with such transcription factors is also useful in the methods of the invention where the TFBS sequences occur at different genomic locations, for example in different gene promoters, in different tissues or in different disease conditions. In that case it is the TFBS sequence and the TFBS flanking sequences that confer tissue and/or disease specificity on the methods of the invention. One advantage of this embodiment is that the number of such locations may be large. For example, 1041 CTCF TFBS positions are specifically occupied in cancer. Similarly, differential occupancy of a large number of positions occurs for other highly expressed transcription factors, including for example (without limitation) c-myc, n-myc, ER, AR, PR and many others.
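The two ideas in the preceding paragraph can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names, the data layout and the example values are hypothetical, and only the "more than 5 tissue types" threshold comes from the text. The first function flags transcription factors expressed in more than a chosen number of tissue types; the second lists the genomic positions of one factor that are occupied in disease samples but not in healthy samples.

```python
def ubiquitous_factors(expression, min_tissues=5):
    """Return factors expressed in MORE than `min_tissues` distinct tissue types.

    `expression` maps a factor name to the tissue types in which it is
    expressed (a hypothetical input format chosen for this sketch).
    """
    return sorted(
        factor
        for factor, tissues in expression.items()
        if len(set(tissues)) > min_tissues
    )


def disease_specific_sites(occupancy, factor):
    """Return positions of `factor` occupied in disease but not in health.

    `occupancy[factor]` is assumed to hold two sets of (chromosome, position)
    tuples under the keys "disease" and "healthy".
    """
    sites = occupancy[factor]
    return sorted(sites["disease"] - sites["healthy"])


if __name__ == "__main__":
    # Illustrative values only; they are not data from the specification.
    expression = {
        "CTCF": ["liver", "lung", "colon", "breast", "kidney", "skin"],
        "NKX3.1": ["prostate"],
    }
    occupancy = {
        "CTCF": {
            "disease": {("chr1", 1000), ("chr2", 500)},
            "healthy": {("chr1", 1000)},
        }
    }
    print(ubiquitous_factors(expression))
    print(disease_specific_sites(occupancy, "CTCF"))
```

In this sketch the specificity comes from where a ubiquitous factor is bound rather than from which factor is present, which mirrors the reasoning of the paragraph above.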

Transcription factors bind their DNA target sequences in a highly cooperative manner together with many other factors, including other transcription factors, cofactors, coactivators, corepressors, RNA polymerase moieties, elongation factors, chromatin remodeling factors, mediators, STAT moieties, UBF and others. This means that circulating chromatin fragments may contain larger gene regulatory complexes that include any or all of: nucleosomes associated with the DNA, nuclear hormone receptors, steroid or other hormone receptors that bind nuclear hormones, other transcription factors, cofactors, coactivators, corepressors, RNA polymerase moieties, elongation factors, chromatin remodeling factors, mediators, STAT moieties or cytokines or cytokine-related factors bound to a STAT moiety, upstream binding factor (UBF), or any other moiety associated with such gene regulatory or transcription complexes.

In addition, any non-histone protein that binds to DNA in chromatin is suitable for use in the methods of the invention, including chromatin remodeling proteins, genetic and epigenetic reader, writer and eraser proteins, proteins involved in RNA transcription (e.g., RNA polymerase proteins), chromatin architectural proteins and chromatin structural proteins (e.g., DNA bending proteins).

The term "binding agent" as used herein refers to a ligand or binder, such as a naturally occurring, recombinant or chemically synthesized compound, that is capable of specifically binding a nucleosome. A ligand or binder according to the invention may comprise a peptide, a protein, an antibody or a fragment thereof, a synthetic ligand (e.g., a plastic antibody), an aptamer or oligonucleotide, or a molecularly imprinted surface or device, capable of specifically binding a nucleosome or other target. The ligand or binder of the invention may be labeled with a detectable label, such as a luminescent, fluorescent, enzymatic or radioactive label; alternatively or additionally, a ligand according to the invention may be labeled with an affinity tag, such as a biotin, avidin, streptavidin or His (e.g., hexa-His) tag. In one embodiment, the binding agent is selected from: an antibody, an antibody fragment or an aptamer. In another embodiment, the binding agent used is an antibody. The terms "antibody", "binding agent" and "binder" are used interchangeably herein.

In one embodiment, the sample is a biological fluid (used interchangeably herein with the term "body fluid"). Any body fluid sample type may be used in the invention, including without limitation: blood, plasma, menstrual blood, endometrial fluid, feces, urine, saliva, mucus, semen and breath (e.g., condensed breath), or an extract, purification or dilution thereof. Biological samples also include specimens taken from a live subject or post-mortem. Samples can be prepared, for example with suitable dilution or concentration, and stored in the usual manner. In a preferred embodiment, the biological fluid sample is selected from: blood, serum or plasma. It will be clear to those skilled in the art that detecting chromatin fragments in a body fluid has the advantage of being a minimally invasive method that does not require a biopsy.

In one embodiment, the subject is a mammalian subject. In another embodiment, the subject is selected from a human or animal (e.g., companion animal or mouse) subject. In yet another embodiment, the subject is a human subject. In one embodiment, the subject is pregnant. In one embodiment, the human subject is a non-embryonic subject (i.e., a human at any stage of development other than an embryo). In another embodiment, the human subject is an adult subject, i.e., older than 16 years of age, such as older than 18, 21 or 25 years of age. In an alternative embodiment, the subject is an animal subject. In another embodiment, the animal subject is selected from a rodent (e.g., mouse, rat, hamster, gerbil or chipmunk), feline (i.e., cat), canine (i.e., dog), equine (i.e., horse), porcine (i.e., pig) or bovine (i.e., cow) subject.

It will be understood that the uses and methods of the invention may be performed in vitro or ex vivo.

According to another aspect of the invention, there is provided a method for detecting or diagnosing a disease in an animal or human subject, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and
(iii) using the DNA level and/or DNA sequence detected in step (ii) to identify the disease state of the subject.

In one embodiment of the invention, the presence of the DNA fragments in the sample is used to determine the optimal treatment regimen for a subject in need thereof.

According to another aspect of the invention, there is provided a method for assessing the suitability of an animal or human subject for a medical treatment, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and
(iii) using the DNA level and/or DNA sequence detected in step (ii) as a parameter for selecting a suitable treatment for the subject.

According to another aspect of the invention, there is provided a method for monitoring treatment of an animal or human subject, comprising the steps of:
(i) removing nucleosomes from a body fluid sample obtained from the subject;
(ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample;
(iii) on one or more further occasions, repeating the removal of nucleosomes from a body fluid sample obtained from the subject and the detection, analysis or measurement of DNA associated with cell-free chromatin fragments in the remaining sample; and
(iv) using any change in the DNA level and/or DNA sequence detected in step (iii), compared with step (ii), as a parameter for any change in the condition of the subject.
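The comparison in step (iv) can be sketched in Python as follows. This is an illustrative assumption rather than the procedure of the specification: the normalization to fragments per million sequenced reads, the pseudocount and all names are hypothetical choices made so that two sampling occasions with different sequencing depths can be compared.

```python
def per_million(counts, total_reads):
    """Convert raw fragment counts per TFBS into counts per million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {site: 1_000_000 * n / total_reads for site, n in counts.items()}


def fold_changes(baseline, follow_up, pseudocount=1.0):
    """Return follow-up / baseline ratio for every TFBS seen at either time.

    A pseudocount (a hypothetical smoothing choice) avoids division by zero
    when a site is absent from one of the two samples.
    """
    sites = sorted(set(baseline) | set(follow_up))
    return {
        site: (follow_up.get(site, 0.0) + pseudocount)
        / (baseline.get(site, 0.0) + pseudocount)
        for site in sites
    }


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the specification.
    before = per_million({"site_A": 30, "site_B": 10}, total_reads=1_000_000)
    after = per_million({"site_A": 14}, total_reads=1_000_000)
    for site, ratio in fold_changes(before, after).items():
        print(f"{site}: fold change {ratio:.2f}")
```

A ratio well below 1 for a disease-associated site would, under these assumptions, be read as a fall in the corresponding cfDNA level between the two occasions; what counts as a meaningful change is not defined here.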

A change in the level and/or sequence of DNA associated with transcription factor-containing cell-free chromatin fragments detected in a test sample, relative to the level or sequence measured in a previous test sample taken earlier from the same test subject, may indicate a beneficial effect of the treatment on the disorder or suspected disorder, for example stabilization or improvement. Furthermore, once treatment has been completed, the method of the invention may be repeated periodically to monitor for recurrence of the disease.

In one embodiment, the treatment is for treating cancer, an autoimmune disease or an inflammatory disease.

The cfDNA sequences associated with a TFBS or other regulatory binding site that are detected by the methods of the invention may be detected or measured as one of a panel of measurements. Thus, in one embodiment, the DNA level and/or DNA sequence is detected or measured as one of a panel of measurements, for example in combination with other DNA markers or any other biomarkers.

According to another aspect of the invention, there is provided a method for detecting or measuring a DNA sequence in DNA fragments associated with non-nucleosomal cell-free chromatin fragments (alone or as part of a panel of measurements), for the purpose of determining or assessing the suitability of an animal or human subject for a medical treatment, or of monitoring the treatment of an animal or human subject, for example a subject with actual or suspected cancer or a benign tumor.

The terms "detecting" and "diagnosing" as used herein encompass the identification, confirmation and/or characterization of a disease state. Methods of detecting, monitoring and diagnosis according to the invention are useful to identify persons at high risk of disease (for example, hemoglobin in stool is associated with an elevated risk of colorectal cancer), to confirm the presence of a disease, to monitor the development of a disease by assessing onset and progression, or to assess amelioration or regression of a disease. Methods of detecting, monitoring and diagnosis are also useful in clinical screening, prognosis, choice of therapy and evaluation of therapeutic benefit, i.e., for drug screening and drug development.

Efficient diagnostic and monitoring methods provide very powerful "patient solutions" with the potential to improve prognosis: by establishing the correct diagnosis, they allow rapid identification of the most appropriate treatment (thereby reducing unnecessary exposure to harmful drug side effects) and reduce relapse rates.

It will be understood that identification and/or quantification may be performed by any method suitable for identifying the presence and/or amount of DNA, or of a specific DNA sequence, in a biological sample from a patient or in a purification, extract or dilution of such a sample. In the methods of the invention, identification and/or quantification may be performed by sequencing or by measuring the concentration or frequency of the TFBS sequence in one or more samples. Biological samples that may be tested by the methods of the invention include those defined above. Samples can be prepared, for example with suitable dilution or concentration, and stored in the usual manner.

The TFBS-specific DNA fragment may be detected directly. Alternatively, it may be detected directly or indirectly through its interaction with one or more ligands capable of specifically binding the TFBS-specific DNA fragment, such as a DNA molecule, a transcription factor, or another ligand or a fragment thereof. Suitable ligands include DNA molecules that bind the complementary cfDNA sequence by hybridization. The ligand or binder may carry a detectable label, such as a luminescent, fluorescent or radioactive label, and/or an affinity tag.
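Detection by hybridization relies on base-pair complementarity between the probe and the target strand. The following Python sketch illustrates that principle only. It makes a strong simplifying assumption that is not in the specification: a probe is treated as binding only if its reverse complement occurs as an exact substring of the fragment, whereas real hybridization tolerates mismatches and depends on temperature and salt conditions. The function names and sequences are hypothetical.

```python
# Translation table for Watson-Crick base pairing (A<->T, C<->G).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_hybridises(probe, fragment):
    """Return True if `probe` is exactly complementary to part of `fragment`.

    Simplification for illustration: exact matching only, no mismatch
    tolerance and no thermodynamic model.
    """
    return reverse_complement(probe) in fragment.upper()


if __name__ == "__main__":
    fragment = "TTGCTTAA"  # hypothetical cfDNA fragment strand
    print(probe_hybridises("AAGC", fragment))
    print(probe_hybridises("CCCC", fragment))
```

A labeled probe designed this way against a TFBS and its flanking sequence would, in principle, pick out fragments carrying that site from a pool of cfDNA.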

For example, detection and/or quantification may be performed by one or more methods selected from the group consisting of: PCR, DNA sequencing, gene chip hybridization analysis, SELDI(-TOF), MALDI(-TOF), 1-D gel-based analysis, 2-D gel-based analysis, mass spectrometry (MS), reverse phase (RP) LC, size permeation (gel filtration), ion exchange, affinity, HPLC, UPLC and other LC or LC-MS-based techniques. Suitable LC-MS techniques include ICAT® (Applied Biosystems, CA, USA) or iTRAQ® (Applied Biosystems, CA, USA). Liquid chromatography (e.g., high pressure liquid chromatography (HPLC) or low pressure liquid chromatography (LPLC)), thin-layer chromatography and NMR (nuclear magnetic resonance) spectroscopy may also be used.

It will be understood that detecting and/or measuring DNA may include, for example, hybridization or sequencing as described herein.

The immunological methods described herein (including immunoprecipitation and removal of nucleosomes) may use any moiety that binds selectively to nucleosomes, including an antibody capable of specifically binding nucleosomes or a fragment thereof, a nucleosome-binding chromatin protein or peptide, or an engineered binder capable of specifically binding nucleosomes.

The binder moiety used as described herein to bind nucleosomes containing linker DNA may be any moiety that binds selectively to nucleosomes containing linker DNA, including naturally derived proteins or peptides, expressed proteins, engineered proteins or re-engineered proteins. Furthermore, it may not be necessary to use a whole protein, and truncated proteins or peptides may be used.

According to another aspect of the invention, there is provided a biomarker identified by the methods described herein.

Provided herein are diagnostic or monitoring kits for carrying out the methods of the invention. Such kits will suitably comprise a nucleosome binding agent, optionally reagents for DNA isolation, DNA library preparation and DNA amplification, optionally reagents for DNA sequencing or analysis, and optionally a ligand for detecting and/or quantifying the target cfDNA or biomarker, optionally together with instructions for use of the kit. Biomarker monitoring methods, biosensors and kits are also vital as patient monitoring tools, enabling the physician to determine whether a relapse is due to worsening of the disease. If pharmacological treatment is assessed to be inadequate, therapy can be reinstated or increased; the therapy can be changed if appropriate. Because the biomarkers are sensitive to the state of the disease, they provide an indication of the impact of drug therapy.

According to another aspect of the invention, there is provided a kit for detecting a cfDNA fragment sequence, comprising a nucleosome binding agent and reagents for amplifying and/or sequencing DNA associated with said cfDNA sequence, optionally together with instructions for use of the kit in accordance with the methods described herein.

Another aspect of the invention is a kit for detecting the presence of a disease state, comprising a biosensor capable of detecting and/or quantifying one or more of the biomarkers as defined herein.

According to another aspect, there is provided the use of a kit as defined herein for diagnosing cancer. According to another aspect, there is provided the use of a kit as defined herein for diagnosing an inflammatory disease. According to another aspect, there is provided the use of a kit as defined herein for diagnosing a prenatal disease.

According to another aspect, there is provided a method of treating a disease in a subject in need thereof, wherein the method comprises the steps of:
(a) contacting a body fluid sample obtained from a human or animal subject with a binding agent that specifically binds nucleosomes;
(b) detecting or measuring DNA fragments that are not bound by the binding agent in step (a);
(c) using the presence, sequence or amount of the DNA fragments as an indicator of the presence of the disease in the subject; and
(d) administering a treatment if the subject is determined in step (c) to have the disease.

In one embodiment, the disease is cancer, an autoimmune disease or an inflammatory disease (e.g., as described above). In another embodiment, the disease is cancer.

In one embodiment, the treatment administered is selected from: surgery, radiotherapy, chemotherapy, immunotherapy, hormone therapy and biological therapy.

According to another aspect of the invention, there is provided a method of treating cancer in a subject in need thereof, wherein the method comprises the steps of:
(a) detecting or diagnosing cancer in the subject according to the methods described herein; and then
(b) administering an anti-cancer therapy, surgery or drug to said subject.

In one embodiment, the subject is a human or animal subject.

The invention will now be illustrated by the following examples.

Example 1

We coated Dynabeads M280 Tosyl-activated magnetic beads with an antibody that binds a histone H3 epitope located at amino acid positions 30-33. This antibody was selected from a number of antibodies tested because it was observed to bind both nucleosomes containing intact histone tails and nucleosomes with clipped histone tails.

We added anti-H3 antibody-coated magnetic beads (1 mg) to solutions (0.5 ml) containing a range of concentrations of recombinant mononucleosomes. The beads were incubated with the nucleosomes for 1 hour at room temperature while the tubes were gently rolled to keep the beads in suspension. The beads were then separated magnetically and washed. Nucleosomes adsorbed to the beads were removed by elution and analyzed by Western blot. The results showed that the beads adsorbed nucleosomes from solution in a dose-dependent manner, as shown in Figure 3.

Example 2

Anti-H3 antibody-coated magnetic beads were prepared and used as described in Example 1. We added anti-H3 antibody-coated beads, as well as uncoated beads, to 8 human EDTA plasma samples and to solutions containing a range of concentrations of recombinant mononucleosomes. The range of recombinant mononucleosome concentrations was chosen to cover the levels typically observed in human clinical samples.

Preferred embodiments of the invention involve removing all or most of the nucleosomes present in a sample before DNA analysis. To test for nucleosomes remaining in solution after incubation with the magnetic beads, we therefore used an ELISA for nucleosomes with an optical density (OD) readout. The results in Figure 4 show that, after adsorption with anti-H3 antibody-coated beads, the level of recombinant mononucleosomes remaining in solution was undetectable (the OD was similar to that of a control solution containing no nucleosomes), whereas solutions incubated with uncoated beads were unaffected and produced a standard ELISA dose-response curve. Similarly, in the 8 human plasma samples tested, the level of nucleosomes remaining in solution after adsorption with anti-H3 antibody-coated beads was low or undetectable, but was unaffected by incubation with uncoated beads. These results show that nucleosomes can be quantitatively removed from human plasma samples using the methods of the invention.

Example 3

Plasma samples are taken from healthy subjects and from subjects with a variety of cancers, including without limitation cancers of the lung, colon, rectum, breast, prostate, liver, kidney, bladder, thyroid, head and neck, mouth, throat, esophagus, stomach, ovary, uterus, endometrium, skin and hematopoietic tissue (lymphoma and leukemia). The samples are depleted of nucleosomes as described in Example 2, and the remaining plasma samples are analyzed. DNA is isolated from the nucleosome-depleted plasma samples, amplified to produce a library and sequenced. The DNA sequencing results are analyzed to identify transcription factor binding site (TFBS) sequences, together with flanking sequences, that are selectively present at elevated levels in samples taken from cancer patients but absent or present at low levels in samples taken from healthy subjects. Some of these DNA sequences are present in samples taken from many types of cancer. Other DNA sequences are present in samples taken from patients with cancer of a particular organ or with a particular type of cancer. The results are used to select transcription factors, TFBS sequences and flanking sequences for use in the methods of the invention, associated either with cancer in general or with a particular cancer type.
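One way to read the selection step in this example is as a filter over per-sample fragment counts. The Python sketch below is a hypothetical illustration, not the analysis used in the specification: the read-count cutoff, the detection-rate thresholds, the data layout and all names are assumptions. A site is kept as a candidate when it is detected in a large fraction of cancer samples and in few or no healthy samples.

```python
def detection_rate(samples, site, min_reads):
    """Fraction of samples in which `site` has at least `min_reads` fragments."""
    detected = sum(1 for counts in samples if counts.get(site, 0) >= min_reads)
    return detected / len(samples)


def select_candidate_sites(cancer, healthy, min_reads=5,
                           min_cancer_rate=0.5, max_healthy_rate=0.1):
    """Return TFBS identifiers enriched in cancer samples.

    `cancer` and `healthy` are lists of dicts mapping a TFBS identifier to a
    fragment count for one sample. All thresholds are hypothetical defaults.
    """
    if not cancer or not healthy:
        raise ValueError("both sample groups must be non-empty")
    # Iterating a dict yields its keys, so this collects every site seen.
    sites = set().union(*cancer, *healthy)
    return sorted(
        site
        for site in sites
        if detection_rate(cancer, site, min_reads) >= min_cancer_rate
        and detection_rate(healthy, site, min_reads) <= max_healthy_rate
    )


if __name__ == "__main__":
    # Illustrative counts only; they are not data from the specification.
    cancer = [{"NKX3.1_site": 12, "shared": 9}, {"NKX3.1_site": 7, "shared": 8}]
    healthy = [{"shared": 10}, {"shared": 6, "NKX3.1_site": 1}]
    print(select_candidate_sites(cancer, healthy))
```

Running the same filter separately for each cancer type would, under these assumptions, distinguish sites shared by many cancers from sites characteristic of one organ or cancer type.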

Example 4

The experiment described in Example 3 is repeated, but the DNA sequencing results are analyzed for chromatin fragmentation patterns that characterize cancer or a particular cancer type.

Example 5

Plasma samples were taken from healthy subjects and from subjects with prostate cancer. The samples were depleted of nucleosomes as described in Example 2. DNA was then isolated from the plasma samples, amplified, and sequenced on a next-generation sequencing instrument. The sequencing results were analyzed for the presence of the TFBS and flanking sequences of the transcription factors NKX3.1 and GRHL2. NKX3.1 and GRHL2 TFBS sequences were detected in plasma samples from prostate cancer patients but were undetected, or detected only at low levels, in samples from healthy subjects.

Example 6

The experiment described in Example 5 was repeated, but the isolated DNA was amplified using multiple sequence-specific primers designed to amplify multiple promoter sequences, including the TFBS and flanking sequences of the transcription factors NKX3.1 and GRHL2. The results showed that the level of DNA containing at least one amplified TFBS sequence was higher in samples from prostate cancer patients and lower in samples from healthy subjects.

Example 7

An experiment similar to that described in Example 6 was performed, but using lung cancer samples and the TFBS and flanking sequences associated with TTF-1 and GRHL2. The results showed that the level of DNA containing at least one amplified TFBS sequence was higher in samples from lung cancer patients and lower in samples from healthy subjects.

Example 8

An experiment similar to that described in Example 6 was performed, but using colorectal cancer samples and the TFBS and flanking sequences associated with CDX-2 and GRHL2. The results showed that the level of DNA containing at least one amplified TFBS sequence was higher in samples from colorectal cancer patients and lower in samples from healthy subjects.

Example 9

An experiment similar to that described in Example 6 was performed, but using breast cancer samples and the TFBS and flanking sequences associated with GATA3 and GRHL2. The results showed that the level of DNA containing at least one amplified TFBS sequence was higher in samples from breast cancer patients and lower in samples from healthy subjects.

Example 10

The experiment described in Example 5 was repeated, but the isolated DNA was contacted with transcription factor NKX3.1 and transcription factor GRHL2, each immobilized on a magnetic solid phase. The amount of DNA bound to the two magnetically immobilized transcription factors was measured by PCR. The results showed that the level of DNA containing at least one amplified TFBS sequence was higher in samples from prostate cancer patients and lower in samples from healthy subjects.

Example 11

Plasma samples were taken from healthy subjects and from subjects with prostate cancer, breast cancer or lung cancer. The samples were depleted of nucleosomes as described in Example 2. DNA was then isolated from the plasma samples, amplified, and contacted with multiple transcription factors immobilized on Luminex beads. The transcription factors NKX3.1, GATA3, TTF-1, CDX-2 and GRHL2 were each immobilized on beads of a different color according to the manufacturer's protocol. A labeled anti-DNA antibody was used to measure the amount of DNA bound to each transcription factor. The results showed that the amount of DNA bound to beads coated with NKX3.1 and GRHL2 was elevated in samples from prostate cancer patients, while binding to the other beads was low; the amount of DNA bound to beads coated with GATA3 and GRHL2 was elevated in samples from breast cancer patients, while binding to the other beads was low; and the amount of DNA bound to beads coated with TTF-1 and GRHL2 was elevated in samples from lung cancer patients, while binding to the other beads was low. In contrast, binding to all beads was low in samples from healthy subjects.
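One way to read such a multiplex readout is to ask which bead sets show elevated DNA binding and to compare that set with the transcription factor pairs named above. The sketch below does this with a single cutoff. The cutoff, the signal values and the function names are hypothetical, and the example does not describe how the assay was actually scored.

```python
# Factor pairs reported as elevated for each sample type in this example.
SIGNATURES = {
    "prostate cancer": {"NKX3.1", "GRHL2"},
    "breast cancer": {"GATA3", "GRHL2"},
    "lung cancer": {"TTF-1", "GRHL2"},
}


def elevated_beads(signal_by_factor, threshold):
    """Return the set of bead-bound factors whose signal exceeds the cutoff."""
    return {tf for tf, signal in signal_by_factor.items() if signal > threshold}


def match_signature(signal_by_factor, threshold):
    """Compare the elevated bead set with the factor pairs listed above."""
    elevated = elevated_beads(signal_by_factor, threshold)
    if not elevated:
        return "no elevated binding"
    for label, signature in SIGNATURES.items():
        if elevated == signature:
            return label
    return "unclassified"


# Hypothetical signal values in arbitrary units; not measured data.
sample = {"NKX3.1": 8.2, "GATA3": 0.4, "TTF-1": 0.3, "CDX-2": 0.5, "GRHL2": 7.1}
result = match_signature(sample, threshold=2.0)
```

An exact set match is deliberately strict: a sample with an unexpected extra elevated bead is reported as unclassified instead of being forced into one of the three patterns.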

Example 12

The experiment described in Example 11 was repeated, except that the DNA bound to immobilized NKX3.1, GATA3, TTF-1, CDX-2 and GRHL2 was measured by PCR. Similar results were obtained.

Example 13

We coated a monoclonal antibody that binds histone H3 onto magnetic beads (MyOne Tosylactivated Dynabeads™) using standard methods. Briefly, the antibody was incubated with the beads (40 μg antibody per mg of beads) in 0.1 M borate buffer, pH 9.5, containing 1 M ammonium sulfate for 18 hours at 37°C in a roller bottle to keep the beads in suspension. The beads were pelleted and the supernatant was decanted. The beads were then resuspended and incubated for 1 hour at 37°C in blocking buffer consisting of phosphate-buffered saline (PBS), pH 7.4, containing 0.1% Tween 20 and 1% bovine serum albumin (BSA). The beads were then pelleted, washed twice with PBS containing 0.1% Tween 20 and 1% BSA, and stored in PBS containing 0.1% Tween 20, 1% BSA and a preservative.

EDTA plasma samples (2.5 mL) collected from patients diagnosed with colorectal cancer (CRC) were incubated with the magnetic beads (0.15 mL, 10 mg/mL) in tubes for 1 hour at room temperature, with rolling to keep the particles in suspension. The magnetic particles were then pelleted and removed, and the remaining nucleosome-depleted samples were retained.
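The quantities in the coating and depletion steps can be checked with simple arithmetic. The sketch below works out the bead mass per sample and a nominal antibody amount; it assumes, purely for illustration, that all of the antibody offered during coating (40 μg per mg of beads) ends up on the beads, which the text does not state, so the antibody figure is an upper bound. The function names are not from the source.

```python
def bead_mass_mg(volume_ml, concentration_mg_per_ml):
    """Mass of beads added to one sample."""
    return volume_ml * concentration_mg_per_ml


def nominal_antibody_ug(bead_mg, loading_ug_per_mg=40.0):
    """Upper bound on antibody present, assuming complete coupling."""
    return bead_mg * loading_ug_per_mg


beads_mg = bead_mass_mg(0.15, 10.0)          # 0.15 mL of a 10 mg/mL suspension, about 1.5 mg
antibody_ug = nominal_antibody_ug(beads_mg)  # about 60 ug at 40 ug/mg
beads_per_ml_plasma = beads_mg / 2.5         # about 0.6 mg of beads per mL of plasma
```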

DNA was then extracted from the nucleosome-depleted samples, as well as from the original untreated plasma samples, using a commercially available DNA extraction kit (Qiagen QIAamp DSP Circulating NA Kit) according to the manufacturer's instructions.

The extracted cfDNA was amplified to generate single-stranded libraries for sequencing using a commercially available kit (Claret Bio SRSLY NGS Library Prep Kit) according to the manufacturer's instructions.

The amplified cfDNA libraries were sequenced by next-generation sequencing on an Illumina NovaSeq.

Sequencing reads, each representing a cfDNA fragment, were aligned to the human reference genome GRCh38/hg38 using the Illumina DRAGEN bioinformatics pipeline (https://emea.illumina.com/products/by-type/informatics-products/dragen-bio-it-platform.html). The resulting aligned BAM files were used to create subsets of different fragment sizes (35-80 bp, 135-155 bp and 156-180 bp) using Sequence Alignment/Map tools (SAMtools; Li et al., 2009). Read coverage (the number of fragments found covering a specific genomic locus) was calculated using a bin size of 1 bp, the highest possible resolution. Read coverage was normalized to the total number of reads mapped to the human genome using RPGC (reads per genomic content, i.e. 1x coverage normalization) with deepTools bamCoverage.
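The size selection, per-base coverage and 1x normalization described above can be illustrated with a small pure-Python sketch. It does not reproduce the DRAGEN, SAMtools or deepTools implementations. The function names, the toy fragment coordinates and the numbers in the usage lines are assumptions made for illustration only.

```python
# Fragment-size subsets named in the text (inclusive bounds, in bp).
SIZE_BINS = {
    "35-80bp": (35, 80),
    "135-155bp": (135, 155),
    "156-180bp": (156, 180),
}


def size_bin(fragment_length):
    """Return the name of the size subset a fragment falls into, or None."""
    for name, (low, high) in SIZE_BINS.items():
        if low <= fragment_length <= high:
            return name
    return None


def per_base_coverage(fragments, region_start, region_end):
    """Count fragments covering each base of [region_start, region_end).

    `fragments` is an iterable of half-open (start, end) intervals.
    A difference array keeps the work linear in fragments plus bases.
    """
    diff = [0] * (region_end - region_start + 1)
    for start, end in fragments:
        s = max(start, region_start)
        e = min(end, region_end)
        if s < e:
            diff[s - region_start] += 1
            diff[e - region_start] -= 1
    coverage, running = [], 0
    for delta in diff[:-1]:
        running += delta
        coverage.append(running)
    return coverage


def rpgc_scale_factor(n_mapped_reads, mean_fragment_length, effective_genome_size):
    """Factor that rescales raw coverage so the genome-wide mean is 1x."""
    return effective_genome_size / (n_mapped_reads * mean_fragment_length)


# Toy usage: three fragments, coverage over a 30 bp region at 1 bp resolution.
toy_fragments = [(100, 150), (120, 200), (300, 340)]
raw = per_base_coverage(toy_fragments, 100, 130)
scale = rpgc_scale_factor(n_mapped_reads=1000, mean_fragment_length=50,
                          effective_genome_size=100_000)
normalized = [value * scale for value in raw]
```

In a real analysis the fragment length would come from the aligned read pairs in the BAM file, and the effective genome size would be the value appropriate to GRCh38.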

CTCF is often used as a model transcription factor because it is well characterized, with 9780 known and published CTCF TFBS sequences (Kelly et al., 2012). Figure 5(a) shows the coverage of the 9780 published CTCF binding site loci by short 35-80 bp cfDNA fragments, a size consistent with that expected for CTCF-associated DNA fragments, compared with the coverage by longer cfDNA fragments of sizes consistent with circulating mononucleosomes (135-155 bp and 156-180 bp). Coverage is shown over a 5000 bp window comprising 2500 bases upstream and 2500 bases downstream of the CTCF binding site position. We observed a strong coverage peak of small 35-80 bp cfDNA fragments at the genomic positions of the CTCF TFBS loci reported by Kelly et al., 2012. Because the sequenced library was generated from cfDNA after nucleosome removal, the library contained few nucleosome-derived fragments and showed a low nucleosome positioning signal. The amplitude of the 35-80 bp cfDNA fragment coverage peak at the CTCF TFBS loci (approximately 5 in Figure 5(a)) was much greater than the amplitude of the periodic nucleosome positioning peaks (approximately 0.25). This low background gives an enhanced 35-80 bp signal.
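A profile of this kind is an average of the normalized coverage over all binding sites, aligned on the binding site position. The following sketch shows one way to compute it. The data layout (one list of per-base coverage per chromosome), the function name and the toy values are assumptions for illustration, and the window here includes the central base, so a flank of 2500 gives 5001 positions.

```python
def aggregate_profile(coverage_by_chrom, sites, flank=2500):
    """Mean coverage at each offset from -flank to +flank around the sites.

    `coverage_by_chrom` maps a chromosome name to a list of per-base
    coverage values; `sites` is a list of (chromosome, position) pairs.
    Sites whose window runs off either end of the track are skipped.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for chrom, center in sites:
        track = coverage_by_chrom.get(chrom)
        if track is None or center - flank < 0 or center + flank >= len(track):
            continue
        window = track[center - flank:center + flank + 1]
        for offset, value in enumerate(window):
            totals[offset] += value
        used += 1
    if used == 0:
        raise ValueError("no site had a complete window")
    return [total / used for total in totals]


# Toy track with two small peaks; the site at position 0 has no full window.
toy_track = {"chr1": [0, 0, 1, 3, 1, 0, 0, 0, 2, 4, 2, 0]}
toy_sites = [("chr1", 3), ("chr1", 9), ("chr1", 0)]
profile = aggregate_profile(toy_track, toy_sites, flank=2)
```

The peak amplitude discussed in the text corresponds to the value of such a profile at the central offset, and the nucleosome positioning signal to the periodic ripple away from the center.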

By contrast, in a cfDNA library obtained from the same sample without treatment to remove nucleosomes, the 35-80 bp cfDNA fragment coverage peak at the CTCF TFBS loci had a smaller amplitude (approximately 0.7 in Figure 5(b)), which was similar to the amplitude of the periodic nucleosome positioning peaks (approximately 0.25). This shows that the method of the invention successfully removes nucleosome-associated background cfDNA signal in liquid biopsy methods, thereby providing improved sensitivity for fragmentomics and other cfDNA analysis methods.
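The comparison between the two libraries can be expressed as a peak-to-background ratio. The short sketch below uses the approximate amplitudes quoted in the text (about 5 and about 0.7 for the TFBS peak, about 0.25 for the nucleosome positioning peaks); because these are approximate readings, the ratios are indicative only. The function name is an assumption.

```python
def peak_to_background(tfbs_peak_amplitude, nucleosome_peak_amplitude):
    """Ratio of the TFBS coverage peak to the nucleosome positioning peak."""
    if nucleosome_peak_amplitude <= 0:
        raise ValueError("background amplitude must be positive")
    return tfbs_peak_amplitude / nucleosome_peak_amplitude


# Approximate amplitudes as quoted in the text for Figures 5(a) and 5(b).
depleted_ratio = peak_to_background(5.0, 0.25)     # nucleosome-depleted library
untreated_ratio = peak_to_background(0.7, 0.25)    # untreated library
improvement = depleted_ratio / untreated_ratio     # roughly sevenfold
```

On these figures the ratio is about 20 after nucleosome depletion and about 2.8 without it.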

We then repeated the analysis for 1041 CTCF TFBS that are known to be selectively occupied in immortalized cancer cells but not in healthy cells (Liu et al., 2017). The results in Figure 6(a) show a clear fragment coverage peak of 35-80 bp cfDNA fragments at the 1041 cancer-specific CTCF TFBS sequences, with a low background nucleosome periodicity signal. This indicates CTCF occupancy of the cancer-specific TFBS loci and hence a tumor cell origin for these cfDNA fragments. Again, the cfDNA library obtained from the same sample without treatment to remove nucleosomes showed a less distinct, lower-amplitude 35-80 bp cfDNA fragment coverage peak at the 1041 CTCF TFBS loci (Figure 6(b)).

The presence in a body fluid of CTCF-associated cfDNA fragments that map to TFBS loci shown by ChIP-Seq to be cancer-specific is an indicator of the presence of cancer in the subject studied, and can be used as a biomarker in this way. We conclude that the method of the invention successfully identifies disease-associated TFBS in plasma as biomarkers of disease.

References
Active Motif, Nat. Methods 3: 658 (2006), doi:10.1038/NMETH907
Bohinski et al. Molecular and Cellular Biology 14(9): 5671 (1994)
Corces et al. Science 362(6413): eaav1898 (2018), doi:10.1126/science.aav1898
Crowley et al. Nat. Rev. Clin. Oncol. 10: 472-484 (2013), doi:10.1038/nrclinonc.2013.110
Darnell, Nat. Rev. Cancer 2: 740-749 (2002), doi:10.1038/nrc906
Deligezer et al. Clinical Chemistry 54:7 1125–1131 (2008)
Dunbar, Clinica Chimica Acta 363(1-2): 71-82 (2006), doi:10.1016/j.cccn.2005.06.023
Gurel et al. Am J Surg Pathol 34(8): 1097-105 (2010), doi:10.1097/PAS.0b013e3181e6cbf3
Heinz et al. Mol. Cell 38(4): 576-89 (2010), doi:10.1016/j.molcel.2010.05.004
Holdenrieder & Stieber, Crit. Rev. Clin. Lab. Sci. 46(1): 1-24 (2009), doi:10.1080/10408360802485875
Hu et al. J. Trans. Med. 17: 124 (2019), doi:10.1186/s12967-019-1871-x
Jung et al. Clin. Chim. Acta 411(21-22): 1611-24 (2010), doi:10.1016/j.cca.2010.07.032
Kelly et al. Genome Res. 22: 2497-2506 (2012), doi:10.1101/gr.143008.112
Klenova et al. Nucleic Acids Res. 25(3): 466–473 (1997), doi:10.1093/nar/25.3.466
Lambert et al. Cell 172(4): 650-665 (2018), doi:10.1016/j.cell.2018.01.029
Latil et al. Cell Stem Cell 20(2): 191-204.e5 (2017), doi:10.1016/j.stem.2016.10.018
Lee et al. J. Mol. Med. (Berl). 85(12): 1393-404 (2007), doi:10.1007/s00109-007-0237-7
Li et al. Bioinformatics 25(16): 2078–2079 (2009), doi:10.1093/bioinformatics/btp352
Lin et al. PLoS Genet. 3(6): e87 (2007), doi:10.1371/journal.pgen.0030087.eor
Liu et al. Oncotarget 8(69): 114183-114194 (2017), doi:10.18632/oncotarget.23172
Liu et al. EBioMedicine 41: 345-356 (2019), doi:10.1016/j.ebiom.2019.02.010
Maenhaut et al. 2015 In: Feingold, Anawalt, Boyce, et al., editors. Endotext. https://www.ncbi.nlm.nih.gov/books/NBK285554/
Mann et al. Curr. Top Dev. Biol. 88: 63-101 (2009), doi:10.1016/S0070-2153(09)88003-4
Mansson et al. Mol. Oncol. 15(11): 2868-2876 (2021), doi:10.1002/1878-0261.13093
Matys et al. Nucleic Acids Res. 34: D108–D110 (2006), doi:10.1093/nar/gkj143
Merabet and Mann, Trends Genet. 32(6): 334-347 (2016), doi:10.1016/j.tig.2016.03.004
Newman et al. Nat. Med. 20(5): 548-54 (2014), doi:10.1038/nm.3519
Park et al. Oncol. Lett. 3(4): 921-926 (2012), doi:10.3892/ol.2012.592
Pomerantz et al. Nat. Genet. 47(11): 1346-51 (2015), doi:10.1038/ng.3419
Poorey et al. Science 342(6156): 369-72 (2013), doi:10.1126/science.1242369
Ramírez et al. Nucleic Acids Res. 44(W1): W160-5 (2016), doi:10.1093/nar/gkw257
Ralston, Do transcription factors actually bind DNA? DNA footprinting and gel shift assays. Nature Education 1(1): 121 (2008)
Sadeh et al. Nat. Biotechnol. 39: 586–598 (2021), doi:10.1038/s41587-020-00775-6
Sanchez et al. NPJ Genom. Med. 3: 31 (2018), doi:10.1038/s41525-018-0069-0
Skene and Henikoff, eLife 6: e21856 (2017), doi:10.7554/eLife.21856.002
Snyder et al. Cell 164(1-2): 57-68 (2016), doi:10.1016/j.cell.2015.11.050
Ulz et al. Nat. Commun. 10(1): 4666 (2019), doi:10.1038/s41467-019-12714-4
Vad-Nielsen et al. Lung Cancer 147: P244-251 (2020), doi:10.1016/j.lungcan.2020.07.023
Vaquerizas et al. Nat. Rev. Genet. 10(4): 252-63 (2009), doi:10.1038/nrg2538
Wang et al. Genome Res. 22(9): 1680-8 (2012), doi:10.1101/gr.136101.111
Zhang et al. Genome Biol. 9(9): R137 (2008), doi:10.1186/gb-2008-9-9-r137
Zhou et al. BMC Genomics 18(1): 724 (2017), doi:10.1186/s12864-017-4115-6


Figure 1: Cartoon illustration of the co-binding of various transcription factors at the promoter sites of the surfactant protein B, thyroglobulin, thyroperoxidase and thyrotropin receptor (TSH receptor) genes. CRE: cyclic adenosine monophosphate response element; GABP: GA-binding protein; HNF-3: hepatocyte nuclear factor 3; NF-1: nuclear factor 1; PAX-8: paired box gene 8; Runx2: Runt-related transcription factor 2; TRα/RXR dimer: thyroid hormone receptor α/retinoid X receptor dimer; TTF-1: thyroid transcription factor 1 (also known as NK2 homeobox 1, NKX2-1); TTF-2: thyroid transcription factor 2.

Figure 2: Cartoon illustration of an example DNA loop structure of a transcription complex, showing the co-binding of some of the various regulatory proteins involved, including but not limited to general transcription factors (GTF), gene-specific transcription factors (TF), cofactors, activators, repressors, mediators, DNA bending proteins and RNA polymerase. The regulatory proteins bind to regulatory DNA sequences located near the gene as well as to regulatory sequences remote from the gene, including promoter sequences, TATA box sequences, enhancer sequences and repressor sequences. Other regulatory proteins (for example chromatin remodeling proteins) and other regulatory sequences are possible.

Figure 3: Western blot analysis of recombinant mononucleosomes adsorbed to magnetic beads coated with an antibody that binds histone H3. The results demonstrate dose-dependent adsorption of mononucleosomes by the method of the invention.

Figure 4: Nucleosome ELISA results for human plasma samples and recombinant mononucleosome solutions after immunoprecipitation of nucleosomes using uncoated magnetic beads or magnetic beads coated with an antibody that binds histone H3. The results show that neither naturally occurring human circulating nucleosomes nor recombinant nucleosomes in solution were affected by uncoated beads, but both were quantitatively removed by immunoprecipitation using beads coated with the antibody that binds histone H3.

Figure 5: Normalized coverage of the 9780 published CTCF TFBS loci by short cfDNA fragments (35-80 bp) or larger cfDNA fragments (135-155 bp or 156-180 bp). (a) Coverage of the CTCF TFBS loci by a cfDNA sequence library obtained from a plasma sample collected from a CRC patient and depleted of nucleosomes by the method of the invention. (b) Coverage in the same sample without nucleosome depletion.

Figure 6: Normalized coverage of the 1041 published CTCF TFBS loci occupied by CTCF in cancer cells but not in normal cells, by short cfDNA fragments (35-80 bp) or larger cfDNA fragments (135-155 bp or 156-180 bp). (a) Coverage of the cancer-associated CTCF TFBS loci by a cfDNA sequence library obtained from a plasma sample collected from a CRC patient and depleted of nucleosomes by the method of the invention. (b) Coverage in the same sample without nucleosome depletion.


Figure 12_A0101_SEQ_0001

Figure 12_A0101_SEQ_0002

Claims (27)

一種檢測獲自一人類或動物主體的一體液樣品中的一無細胞DNA染色質片段的方法,該無細胞DNA染色質片段包含一轉錄因子結合位點序列的全部或一部分,選擇性地包含側翼序列,該方法包括步驟: (i)使該體液樣品接觸與核小體結合的一結合劑;和 (ii)分析來自在步驟(i)中未與該結合劑結合的體液樣品中的DNA。 A method of detecting a cell-free DNA chromatin fragment comprising all or a portion of a transcription factor binding site sequence, optionally including flanking sequence, the method includes the steps of: (i) contacting the bodily fluid sample with a binding agent that binds to nucleosomes; and (ii) analyzing DNA from the body fluid sample not bound to the binding agent in step (i). 一種檢測獲自一人類或動物主體的一體液樣品中的無細胞DNA染色質片段化模式的方法,包括步驟: (i)使該體液樣品接觸與核小體結合的一結合劑;和 (ii)分析來自在步驟(i)中未與該結合劑結合的體液樣品的DNA。 A method of detecting chromatin fragmentation patterns in cell-free DNA in a body fluid sample obtained from a human or animal subject, comprising the steps of: (i) contacting the bodily fluid sample with a binding agent that binds to nucleosomes; and (ii) analyzing DNA from the body fluid sample not bound to the binding agent in step (i). 如請求項1或2所述之方法,其中,該結合劑結合至核小體核心表位。The method according to claim 1 or 2, wherein the binding agent binds to nucleosome core epitopes. 如請求項1或2所述之方法,其中,該結合劑結合至含有連接子DNA的核小體。The method according to claim 1 or 2, wherein the binding agent binds to nucleosomes containing linker DNA. 如請求項4所述之方法,其中,該結合劑是全部或一部分的組蛋白H1部分或一染色質結合蛋白。The method according to claim 4, wherein the binding agent is all or a part of histone H1 or a chromatin binding protein. 如請求項4所述之方法,其中,該結合劑結合至組蛋白H1或其組分。The method according to claim 4, wherein the binding agent binds to histone H1 or a component thereof. 如請求項1至6中任一項所述之方法,其中,該結合劑附著於一固相支持物。The method according to any one of claims 1 to 6, wherein the binding agent is attached to a solid support. 如請求項2至7中任一項所述之方法,其中,分析該DNA的一轉錄因子結合位點和/或側翼序列的存在。The method according to any one of claims 2 to 7, wherein the DNA is analyzed for the presence of a transcription factor binding site and/or flanking sequences. 
如請求項1至8中任一項所述之方法,其中,透過PCR分析該DNA。The method according to any one of claims 1 to 8, wherein the DNA is analyzed by PCR. 如請求項1至9中任一項所述之方法,其中,該方法還包括使用該DNA的存在、含量、序列或片段化模式作為該主體疾病狀態的指標。The method according to any one of claims 1 to 9, wherein the method further comprises using the presence, content, sequence or fragmentation pattern of the DNA as an indicator of the subject's disease state. 一種檢測從一人類或動物主體獲得的一體液樣品中的無細胞DNA染色質片段化模式的方法,包括步驟: (i)使該體液樣品接觸與一含有連接子DNA的核小體結合的一結合劑;和 (ii)分析來自在步驟(i)中未與該結合劑結合的體液樣品中的DNA。 A method of detecting chromatin fragmentation patterns in cell-free DNA in a body fluid sample obtained from a human or animal subject, comprising the steps of: (i) contacting the bodily fluid sample with a binding agent that binds to a nucleosome containing linker DNA; and (ii) analyzing DNA from the body fluid sample not bound to the binding agent in step (i). 一種檢測一人類或動物主體疾病的方法,包括步驟: (i)使獲自該人類或動物主體的一體液樣品接觸與核小體結合的一結合劑; (ii)分離步驟(i)中未與該結合劑結合的DNA; (iii)選擇性地擴增分離的DNA; (iv)確定DNA的序列;和 (v)使用在DNA中存在的一轉錄因子結合位點DNA序列和選擇性的側翼DNA序列作為一生物標記物以確定該主體中疾病的存在和/或性質。 A method of detecting disease in a human or animal subject, comprising the steps of: (i) contacting a sample of bodily fluid obtained from the human or animal subject with a binding agent that binds to nucleosomes; (ii) isolating DNA not bound to the binding agent in step (i); (iii) selectively amplifying the isolated DNA; (iv) determine the sequence of the DNA; and (v) using a transcription factor binding site DNA sequence and optionally flanking DNA sequences present in the DNA as a biomarker to determine the presence and/or nature of the disease in the subject. 
一種檢測一人類或動物主體疾病的方法,包括步驟: (i)使獲自該人類或動物主體的一體液樣品接觸與核小體結合的一結合劑; (ii)分離步驟(i)中未與該結合劑結合的DNA; (iii)選擇性地擴增分離的DNA; (iv)檢測DNA;和 (v)使用在步驟(iv)中檢測到的DNA含量和/或DNA序列和/或DNA片段化模式作為該主體中疾病的存在和/或性質的指標。 A method of detecting disease in a human or animal subject, comprising the steps of: (i) contacting a sample of bodily fluid obtained from the human or animal subject with a binding agent that binds to nucleosomes; (ii) isolating DNA not bound to the binding agent in step (i); (iii) selectively amplifying the isolated DNA; (iv) DNA testing; and (v) using the DNA content and/or DNA sequence and/or DNA fragmentation pattern detected in step (iv) as an indicator of the presence and/or nature of the disease in the subject. 如請求項13所述之方法,其中,透過使用序列特異性引子的PCR方法進行擴增。The method according to claim 13, wherein the amplification is performed by a PCR method using sequence-specific primers. 一種檢測人類或動物主體疾病的方法,包括以下步驟: (i)使獲自該人類或動物主體的一體液樣品接觸與核小體結合的一結合劑; (ii)分離步驟(i)中未與該結合劑結合的DNA; (iii)使用雜交法檢測分離的DNA;和 (iv)使用雜交的DNA的存在或數量作為主體中疾病的存在和/或性質的指標。 A method of detecting disease in a human or animal subject comprising the steps of: (i) contacting a sample of bodily fluid obtained from the human or animal subject with a binding agent that binds to nucleosomes; (ii) isolating DNA not bound to the binding agent in step (i); (iii) detection of isolated DNA using hybridization methods; and (iv) using the presence or amount of hybridized DNA as an indicator of the presence and/or nature of disease in a subject. 如請求項15所述之方法,其中,在雜交之前擴增該分離的DNA。The method of claim 15, wherein the isolated DNA is amplified before hybridization. 如請求項1至16中任一項所述之方法,其中,該體液樣品是一血液、血清或血漿樣品。The method according to any one of claims 1 to 16, wherein the body fluid sample is a blood, serum or plasma sample. 
一種檢測或診斷一動物或人類主體疾病的方法,包括步驟: (i)從獲自該主體的一體液樣品中去除核小體; (ii)檢測、分析或測量剩餘樣品中與無細胞染色質片段相連的DNA;和 (iii)使用在步驟(ii)中檢測到的DNA含量和/或DNA序列和/或DNA片段化模式來識別該主體的疾病狀態。 A method of detecting or diagnosing disease in an animal or human subject comprising the steps of: (i) removing nucleosomes from a sample of body fluid obtained from the subject; (ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and (iii) using the DNA content and/or DNA sequence and/or DNA fragmentation pattern detected in step (ii) to identify a disease state in the subject. 如請求項15至18中任一項所述之方法,其中,該疾病係選自癌症、自身免疫性疾病或炎性疾病。The method according to any one of claims 15 to 18, wherein the disease is selected from cancer, autoimmune disease or inflammatory disease. 一種評估一動物或人類主體是否適合進行醫學治療的方法,包括步驟: (i)從獲自該主體的一體液樣品中去除核小體; (ii)檢測、分析或測量剩餘樣品中與無細胞染色質片段相連的DNA;和 (iii)使用在步驟(ii)中檢測到的DNA含量和/或DNA序列和/或DNA片段化模式作為替該主體選擇適合的治療的參數。 A method of assessing the suitability of an animal or human subject for medical treatment comprising the steps of: (i) removing nucleosomes from a sample of body fluid obtained from the subject; (ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; and (iii) using the DNA content and/or DNA sequence and/or DNA fragmentation pattern detected in step (ii) as parameters for selecting an appropriate treatment for the subject. 
一種用於監測一動物或人類主體治療的方法,包括步驟: (i)從獲自該主體的一體液樣品中去除核小體; (ii)檢測、分析或測量剩餘樣品中與無細胞染色質片段相連的DNA; (iii)在一或多種時機下,重複從獲自該主體的體液樣品中去除核小體後檢測、分析或測量剩餘樣品中與無細胞染色質片段相連的DNA;和 (iv)使用與步驟(ii)相比在步驟(iii)中檢測到的DNA含量和/或DNA序列和/或DNA片段化模式的任何變化作為該主體病況任何變化的參數。 A method for monitoring treatment of an animal or human subject comprising the steps of: (i) removing nucleosomes from a sample of body fluid obtained from the subject; (ii) detecting, analyzing or measuring DNA associated with cell-free chromatin fragments in the remaining sample; (iii) on one or more occasions, repeatedly detecting, analyzing, or measuring DNA associated with cell-free chromatin fragments in the remaining sample after removal of nucleosomes from a sample of bodily fluid obtained from the subject; and (iv) using any change in DNA content and/or DNA sequence and/or DNA fragmentation pattern detected in step (iii) compared to step (ii) as a parameter for any change in the subject's condition. 如請求項20或21所述之方法,其中,該治療係用於治療癌症、自身免疫性疾病或炎性疾病。The method of claim 20 or 21, wherein the treatment is for the treatment of cancer, autoimmune disease or inflammatory disease. 如請求項18至22中任一項所述之方法,其中,檢測或測量DNA含量和/或DNA序列作為一套組量測之一。The method of any one of claims 18 to 22, wherein DNA content and/or DNA sequence is detected or measured as one of a set of measurements. 一種用於檢測cfDNA片段序列的試劑盒,包括一核小體結合劑和用於與該cfDNA序列相連的DNA的擴增和/或定序和/或片段化模式,選擇性地還有在請求項1至23中任一項之方法中的試劑盒的使用說明書。A kit for detecting cfDNA fragment sequences, comprising a nucleosome binding agent and for amplification and/or sequencing and/or fragmentation patterns of DNA linked to the cfDNA sequences, optionally also on request Instructions for use of the kit in the method of any one of items 1 to 23. 
A method of treating a disease in a subject in need thereof, wherein the method comprises the steps of:
(i) contacting a body fluid sample obtained from a human or animal subject with a binding agent that binds to nucleosomes;
(ii) detecting or measuring DNA fragments not bound to the binding agent in step (i);
(iii) using the presence, sequence, amount or fragmentation pattern of the DNA fragments as an indicator of the presence of the disease in the subject; and
(iv) administering a treatment if the subject is determined in step (iii) to have the disease.
The method according to claim 25, wherein the treatment administered is selected from: surgery, radiotherapy, chemotherapy, immunotherapy, hormone therapy and biological therapy.
A method of detecting a fetal disease state in a body fluid sample obtained from a pregnant human or animal subject, comprising the steps of:
(i) contacting the maternal body fluid sample with a binding agent that binds to nucleosomes;
(ii) analyzing the DNA not bound to the binding agent in step (i); and
(iii) using the presence, amount, sequence and/or fragmentation pattern of the DNA as an indicator of the fetal disease state of the subject.
TW110149003A 2020-12-29 2021-12-28 Transcription factor binding site analysis of nucleosome depleted circulating cell free chromatin fragments TW202242145A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063131728P 2020-12-29 2020-12-29
US63/131,728 2020-12-29

Publications (1)

Publication Number Publication Date
TW202242145A 2022-11-01

Family

ID=79927180

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110149003A TW202242145A (en) 2020-12-29 2021-12-28 Transcription factor binding site analysis of nucleosome depleted circulating cell free chromatin fragments

Country Status (3)

Country Link
EP (1) EP4272001A1 (en)
TW (1) TW202242145A (en)
WO (1) WO2022144408A1 (en)

Also Published As

Publication number Publication date
WO2022144408A1 (en) 2022-07-07
EP4272001A1 (en) 2023-11-08
